Machine Learning for Change-Prone Class Prediction: A History-Based Approach

Published: 05 October 2022


Classes have a very dynamic life cycle in object-oriented software projects. They can be created, modified, or removed for different reasons. Predicting change-prone classes in the early stages of a project positively impacts the team’s productivity, the allocation of resources, and the quality of the software developed. Existing work uses Machine Learning (ML) and different kinds of class metrics. A limitation of existing work, however, is that it does not consider the temporal dependency between instances in the datasets. To fill this gap, this work introduces an approach based on the change history of a class across different releases taken from public repositories. The approach uses the Sliding Window method and adopts as predictors structural and evolutionary metrics, as well as the frequency and diversity of smells. Five projects and four ML algorithms are used in the evaluation. In the great majority of cases, our approach outperforms a traditional approach on all the indicators considered. Random Forest presents the best performance, and the use of smell-related information does not impact the results.
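To make the Sliding Window idea concrete, the following is a minimal sketch of how the release history of a single class could be turned into training instances. It is an illustration under stated assumptions, not the paper's implementation: the function name `sliding_window_instances`, the record layout (a metric vector plus a change flag per release), the choice of label, and the toy values are all hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical record for one class in one release:
# (metric vector, flag saying whether the class was changed afterwards).
Release = Tuple[Sequence[float], int]


def sliding_window_instances(
    history: List[Release], window: int
) -> List[Tuple[List[float], int]]:
    """Build supervised instances from one class's release history.

    Each instance concatenates the metric vectors of `window` consecutive
    releases. Its label is the change flag of the last release in the
    window (an assumption made for this sketch).
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    instances = []
    # Slide the window one release at a time over the history.
    for start in range(len(history) - window + 1):
        chunk = history[start:start + window]
        features = [value for metrics, _ in chunk for value in metrics]
        label = chunk[-1][1]
        instances.append((features, label))
    return instances


# Toy history: two made-up metrics per release, followed by a change flag.
history = [
    ([10.0, 2.0], 0),
    ([12.0, 2.0], 1),
    ([15.0, 3.0], 1),
    ([15.0, 3.0], 0),
]
instances = sliding_window_instances(history, window=2)
```

With four releases and a window of two, the sketch yields three instances, each carrying the metrics of two consecutive releases. This is how a temporal dependency between releases can be exposed to a classifier that otherwise treats rows as independent. A history shorter than the window yields no instances.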


    Author Tags

    1. class change proneness
    2. machine learning
    3. temporal dependency


    Funding Sources

    • CAPES and CNPq Brazil


    SBES 2022: XXXVI Brazilian Symposium on Software Engineering
    October 5 - 7, 2022
    Virtual Event, Brazil

