Lindholmen Dataset

This initiative aims to provide researchers with a list of open source projects that use UML. The dataset includes links to more than 93 000 UML files, spread across more than 24 000 GitHub repositories.

The dataset can be found here.

A mirror of the dataset information can be found here.

The work is based on multiple publications, including:

- The Quest for Open Source Projects that Use UML: Mining GitHub; Hebig, R. & Ho-Quang, T. & Robles, G. & Fernandez, M.A. & Chaudron, M.R.V. (2016). In Proceedings of the ACM/IEEE 19th International Conference on Model Driven Engineering Languages and Systems, pages 173-183, Saint-Malo, France, October 2-7, 2016.

- Practices and perceptions of UML use in open source projects; Ho-Quang, T. & Hebig, R. & Robles, G. & Chaudron, M.R.V. & Fernandez, M.A. (2017). In Proceedings of the 39th International Conference on Software Engineering: Software Engineering in Practice Track, pages 203-212, Buenos Aires, Argentina, May 20-28, 2017.

- An extensive dataset of UML models in GitHub; Robles, G. & Ho-Quang, T. & Hebig, R. & Chaudron, M.R.V. & Fernandez, M.A. (2017). In Proceedings of the 14th International Conference on Mining Software Repositories, pages 519-522, Buenos Aires, Argentina, May 20-28, 2017.