Published March 14, 2022 | Version v1
Dataset Open

Mining Fork-Including Software Development Traces

  • 1. University of Haifa, Israel

Description

This dataset relates to the paper: Mining Fork-Including Development Traces (abstract below)
Authors: Iris Reinhartz-Berger and Amir Tomer
Starting point: readme.txt

Open-source software development is a common practice that encourages collaborative development and reuse across projects. Forking is a way to make a copy of an existing project and explore it for different purposes. Two types of forks are commonly mentioned in the literature: contributing forks, which continue the development lines of the forked projects and aim at merging the contribution back to the forked projects; and independently developed forks, which open new lines of development deviating from the forked projects. In this study, we aim to explore characteristics of fork-involving software development traces. Analyzing 880 Java projects and their related action and observation events with process mining and statistical techniques, we found that the occurrence of certain event types may predict the fork type, while the creation of certain fork types increases the involvement of users in the forked projects.

Files

dataset_events_all.csv

Files (1.0 GB)

Checksum Size
md5:831e9c2318de1ce0086fc01e834823b3 546.6 MB
md5:fe1dea34890a747b71cfeab34aaa3531 461.9 MB
md5:9c118f80649ac63cc6fe92ad8818dfa2 100.3 kB
md5:8e9d3ea2060e3169e10ee2af2923677e 40.5 MB
md5:f4e8cf30f022dba6575fbdf0fbd42c82 870 Bytes
md5:8196d013461afb68eee51f2a9c0a0a0c 102.9 kB
md5:effc25cf74a118b563b6879bf97ba144 98.4 kB
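
Because the two largest files are several hundred megabytes each, it may be worth checking a download against its listed MD5 checksum before use. The following is a minimal sketch using only the Python standard library; the function names md5_of_file and matches_checksum are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large files do not have to fit in memory."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with an expected value.

    The expected value may carry the 'md5:' prefix used in the
    file listing; the prefix is stripped before comparing.
    """
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

A call would look like matches_checksum(local_path, "md5:..."), where local_path is the downloaded file and the second argument is the checksum listed for that file. The listing above does not show which checksum belongs to which file name, so that pairing has to be taken from the record page itself.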