Gitome: A curated dataset for GitHub README-related tasks
Creators
- 1. University of L'Aquila
- 2. Università degli Studi dell'Aquila
Description
About
This repository contains the source code used to replicate the experimental results of the paper submitted to the 21st International Conference on Mining Software Repositories (MSR 2024):
"Gitome: A curated dataset for GitHub README-related tasks"
authored by:
Claudio Di Sipio, Juri Di Rocco, Riccardo Rubei, Phuong Than Nguyen, and Davide Di Ruscio,
Università degli Studi dell'Aquila, Italy
Data description
The dataset is structured as follows:
- emf_metamodel.zip: It contains the Ecore project with the Gitome data model
- existing_dumps.zip: It contains the existing datasets used to build Gitome
- lang_aggr_stats.csv: It contains the language data to compute the statistics presented in the paper
- langs.csv: It contains all the languages and their frequency
- output_dataset.zip: It contains the benchmarking dataset obtained by parsing the README files
- repository_lists.zip: It contains the list of repositories for each considered dataset (with possible duplicates)
- topics.csv: It contains all the topics and their frequency
- topics_aggr_stats.csv: It contains the topics data to compute the statistics presented in the paper
- gitome_repo.txt: It contains the list of the URLs of the considered GitHub repositories
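The files listed above can be inspected with a few lines of Python. The following is a minimal sketch, not part of the Gitome tooling: it assumes the CSV files have a header row and that gitome_repo.txt holds one URL per line, neither of which is stated in this description. The helper names `peek_csv` and `load_repo_urls` are hypothetical.

```python
import csv
from pathlib import Path


def peek_csv(path, n=5):
    """Return (header, first n data rows) without assuming any column names."""
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        header = next(reader, [])  # assumption: the first row is a header
        rows = [row for _, row in zip(range(n), reader)]
    return header, rows


def load_repo_urls(path):
    """Read a text file, assuming one repository URL per line; skip blank lines."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    # Assumes the archive files were downloaded into the working directory.
    for name in ("langs.csv", "topics.csv"):
        if Path(name).exists():
            header, rows = peek_csv(name)
            print(name, header, rows)
    if Path("gitome_repo.txt").exists():
        urls = load_repo_urls("gitome_repo.txt")
        print(len(urls), "repository URLs;", len(set(urls)), "unique")
```

Looking at the header first avoids hard-coding column names that this description does not document.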
How to collect Gitome
To collect all the data stored in this archive, please refer to the supporting GitHub repository: https://github.com/MDEGroup/Gitome-MSR2024.
Files
Total size: 85.2 MB
MD5 | Size |
---|---|
a82b1b6f05340f443f3510e4f7f7ffdd | 5.6 MB |
9d151fe8bd63cd469213f1ae3bdccef3 | 24.4 MB |
53f4a2fba4e1a9357cf7927cb52bdb3a | 308.4 kB |
acf550b22a0436b01c1abfe086adfb26 | 4.5 kB |
b60dd24aa832a33e1ad238989eb6095d | 3.9 kB |
a68e5092ec4140bcfd08d41fc284f2b4 | 54.4 MB |
94705f13ac4ea57f3c424c80cad1053a | 189.9 kB |
e7ba7a067f3fc48a2e47082da9689d87 | 183.4 kB |
5aa8612879af6b194b04a84e6a185f0d | 5.9 kB |
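A downloaded file can be checked against the MD5 checksums listed above. The sketch below uses only Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_md5` are hypothetical, and since the listing does not pair checksums with file names, the expected value has to be taken from the download page for the file in question.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, expected_hex):
    """Return True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_md5("output_dataset.zip", expected)` returns `True` only when the local copy matches the published checksum, with `expected` being the value shown for that file.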