Abstract
Key class identification approaches aim to identify the most important classes so that developers, especially newcomers, can start the software comprehension process. Many supervised and unsupervised approaches have been proposed so far; however, they have not considered the effort required to comprehend classes. In this article, we identify the challenge of “effort-aware key class identification”; to partially tackle it, we propose an approach,
Index Terms
- EASE: An Effort-aware Extension of Unsupervised Key Class Identification Approaches