Analyzing the structure of Java software systems by weighted k-core decomposition
Introduction
Over the past few years, the study of complex networks has gained considerable popularity [[1], [2], [3], [4]]. It provides a unified perspective for studying various complex systems simply by modeling them as networks. Software systems, whether object-oriented (OO) or structured, can be mapped to a network (or graph), also known as a software network, in which nodes represent software entities such as methods/attributes, classes/interfaces, or packages, and edges (or links) represent the couplings between them [5]. As software becomes ever larger and more complex, applying complex network theory to model large software systems and to interpret their global statistical properties has become viable [[5], [6]]. Great efforts have been made to understand the topological structure of software, and many shared physics-like laws of software systems have been revealed, such as scale-free [[5], [7], [8], [9]], small-world [[5], [9], [10]], and fractal properties [6].
The k-core structure [[11], [12], [13]] is another interesting structural property that is not captured by scale-free, small-world, or other simple topological properties. An in-depth investigation of the k-core structures of software networks is important for understanding the inner characteristics of software systems [[12], [13], [14]]. Several related studies have been performed [[12], [13], [14], [15]]. However, one major limitation of these studies is that the software networks they used are un-weighted, which does not conform to the reality of a piece of software [[9], [16]]. Another limitation is that the software systems they analyzed are mainly written in C++. Little attention has been paid to the k-core structure of weighted software networks extracted from Java software systems.
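As background, the k-core of a graph is the maximal subgraph in which every node has at least k neighbours inside that subgraph, and the core number (coreness) of a node is the largest k for which the node belongs to a k-core. A minimal Python sketch of the standard peeling procedure for an un-weighted, undirected graph follows. The function name `core_numbers` and the example class names are hypothetical and are not taken from the cited studies.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} for an undirected, un-weighted graph.

    `edges` is an iterable of (u, v) pairs; self-loops are ignored.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    # Current degree of each node within the not-yet-removed subgraph.
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree.
        n = min(remaining, key=degree.__getitem__)
        # The shell index never decreases during peeling.
        k = max(k, degree[n])
        core[n] = k
        remaining.remove(n)
        for m in adj[n]:
            if m in remaining:
                degree[m] -= 1
    return core


# Hypothetical class-level network: a triangle A-B-C plus a pendant class D.
example = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D")]
print(core_numbers(example))
```

This simple version rescans the remaining nodes on every step, so it is quadratic in the number of nodes; bucket-based implementations are preferable for large networks.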
The objective of this paper is to explore the characteristics of the k-core structure in weighted software networks extracted from Java software systems. First, we formally represent the topological structure of Java software at the class level of granularity using a weighted software network, which takes the coupling frequencies between classes as edge weights. Second, we introduce the k-core decomposition method for weighted complex networks proposed in [17] and use it to calculate the k-core structure of the weighted software network. This method partitions the weighted software network into a layered structure, which is then characterized by a number of relevant properties via statistical parameters. Our approach could potentially uncover some characteristics enclosed in the topological structure of software systems, which can help developers improve software understanding, propose new metrics for software measurement, and evaluate the quality of the system in development.
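The layered partition of a weighted network can be illustrated with a rough sketch. It assumes a weighted degree of the form k' = [k^alpha * (sum of incident weights)^beta]^(1/(alpha+beta)), rounded to the nearest integer, which is one common formulation of weighted k-shell decomposition. This section does not state the exact formula used, so the formula, the defaults alpha = beta = 1, and every name in the code are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def weighted_core_numbers(weighted_edges, alpha=1.0, beta=1.0):
    """Return {node: shell index} using an assumed weighted-degree formula.

    `weighted_edges` is an iterable of (u, v, w) triples for an undirected
    graph; repeated pairs have their weights summed, self-loops are ignored.
    """
    adj = defaultdict(dict)
    for u, v, w in weighted_edges:
        if u != v:
            adj[u][v] = adj[u].get(v, 0) + w
            adj[v][u] = adj[v].get(u, 0) + w

    def weighted_degree(n, alive):
        nbrs = [m for m in adj[n] if m in alive]
        k = len(nbrs)                      # plain degree among alive nodes
        s = sum(adj[n][m] for m in nbrs)   # strength (sum of edge weights)
        if k == 0:
            return 0
        value = (k ** alpha * s ** beta) ** (1.0 / (alpha + beta))
        return math.floor(value + 0.5)     # round half up to an integer

    alive = set(adj)
    shell = {}
    level = 0
    while alive:
        # Recompute weighted degrees on the remaining subgraph and peel
        # the node with the smallest value.
        deg = {n: weighted_degree(n, alive) for n in alive}
        n = min(alive, key=deg.__getitem__)
        level = max(level, deg[n])
        shell[n] = level
        alive.remove(n)
    return shell


# Hypothetical example: a strongly coupled triangle plus a weakly coupled class D.
example = [("A", "B", 4), ("B", "C", 4), ("A", "C", 4), ("A", "D", 1)]
print(weighted_core_numbers(example))
```

With all weights equal to 1 and alpha = beta = 1, the weighted degree reduces to the ordinary degree, so the sketch then yields the un-weighted core numbers. In the example, D falls into shell 1 and the triangle into shell 4.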
The primary contributions of the current paper are as follows:
- We propose an approach to empirically investigate the static and evolving topological properties enclosed in weighted software networks by using weighted k-core decomposition.
- We propose a weighted software network to represent the topological structure of a software system at the class level, which uses the coupling frequencies to assign weights to the edges.
- We illustrate our approach on a set of 16 open source software systems and obtain several interesting observations.
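The idea of weighting class-level edges by coupling frequency can be sketched as follows. The sketch assumes that a parser has already produced one (source class, target class) pair per observed coupling occurrence and that edges are directed. Both assumptions, the function name `build_wccn`, and the class names are hypothetical.

```python
from collections import Counter


def build_wccn(couplings):
    """Build a weighted class-level network from coupling occurrences.

    `couplings` is an iterable of (source_class, target_class) pairs, one
    per occurrence. Returns {(source, target): frequency}; couplings of a
    class with itself are skipped.
    """
    weights = Counter()
    for src, dst in couplings:
        if src != dst:
            weights[(src, dst)] += 1
    return dict(weights)


# Hypothetical occurrences extracted from source code.
occurrences = [
    ("Order", "Customer"),
    ("Order", "Customer"),
    ("Invoice", "Order"),
    ("Order", "Order"),  # self-coupling, ignored
]
for (src, dst), w in build_wccn(occurrences).items():
    print(f"{src} -> {dst}: weight {w}")
```

Keeping the network as a dictionary keyed by class pairs makes it easy to pass the weights on to a decomposition step or to export them as an edge list.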
The rest of this paper is structured as follows. Section 2 gives a brief overview of related work on the k-core structures of software networks. In Section 3, we describe our approach in detail, with a focus on the definition of the weighted software network and the weighted k-core decomposition method. In Section 4, we use this method to partition the weighted software network into a layered structure and use statistical parameters to uncover characteristics enclosed in the topological structure of software systems. In Section 5, we discuss the implications of the results for software engineering. Finally, we conclude the paper in Section 6.
Related work
To the best of our knowledge, only a few studies have investigated the k-core structures of software networks, and all of them were published before 2016.
Zhang et al. [[12], [14]] investigated the topological properties of a set of un-weighted software networks extracted from software systems at the class level, and found some noticeable properties such as small software coreness, high-core connecting tendency of classes, and evolution stability of
Method
Our approach works as follows. First, we parse the .java files of a Java software system to extract meaningful structural information from the source code and propose a weighted software network to formally represent the extracted information. Second, we employ the weighted k-core decomposition method to obtain the k-core structure of the weighted software network. Finally, the k-core structure is characterized by a number of relevant properties via statistical parameters. The following subsections will discuss the
Empirical study
We designed and conducted a set of experiments to investigate the topological structure of real-world software systems and its evolution using the weighted k-core decomposition method. Our experiments were carried out on a PC at 2.6 GHz with 8 GB of RAM.
In the following sections, we describe in detail the objects of study (Section 4.1) and our analysis of the results (Section 4.2).
Implications for software engineering
Complex systems and complexity science are viewed as the ‘21st Century Science’ [40]. Their basic view is that topological structure determines function, which emphasizes viewing the system as a whole. Software networks represent another important class of complex networks that can also be studied using complex network theory. This theory offers a different dimension to our understanding of software by treating it as a whole and ignoring microscopic details. Research on
Conclusions
In this work, we propose an approach to uncover the properties enclosed in the weighted software networks to help developers improve software understanding, propose new metrics for software measurement, and evaluate the quality of the system in development. To analyze the topological properties of software, we first propose a weighted class coupling network (WCCN) to represent a piece of software at the class level of granularity which takes into consideration the coupling frequency to assign
Acknowledgment
This work was supported by the National Key Research and Development Program of China (Nos. 2016YFB0800400 and 2014CB340404), the National Natural Science Foundation of China (Nos. 61273216, 61572371 and 61402406), the Zhejiang Provincial Nature Science Foundation of China (No. LY15F020004) and the Commonweal Project of Science and Technology Department of Zhejiang Province (No. 2014C23008).
Weifeng Pan received his Ph.D. degree from School of Computer at Wuhan University, China, in 2011. He is presently an associate professor in School of Computer Science and Information Engineering at Zhejiang Gongshang University. He is also a member of China Computer Federation (CCF) and ACM. His current research interests include software engineering, service computing, complex networks, and intelligent computation.
References (51)
- et al. Pervasive social networking forensics: Intelligence and evidence from mobile device extracts. J. Netw. Comput. Appl. (2017)
- et al. Hypergraph partitioning for social networks based on information entropy modularity. J. Netw. Comput. Appl. (2017)
- et al. The fractal dimension of software networks as a global quality metric. Inform. Sci. (2013)
- et al. Context-oriented web application protection model. Appl. Math. Comput. (2016)
- et al. Web application protection techniques: A taxonomy. J. Netw. Comput. Appl. (2016)
- et al. Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power. Inform. Sci. (2010)
- et al. A modular attachment mechanism for software network evolution. Physica A (2013)
- et al. Multi-level formation of complex software systems. Entropy (2016)
- et al. Object-oriented metrics that predict maintainability. J. Syst. Softw. (1993)
- et al. Software architecture graphs as complex networks: A novel partitioning scheme to measure stability and evolution. Inform. Sci. (2007)
- Identification of influential spreaders in online social networks using interaction weighted k-core decomposition method. Physica A
- Virtual community detection through the association between prime nodes in online social networks and its application to ranking algorithms. IEEE Access
- Software systems as complex networks: Structure, function, and evolvability of software collaboration graphs. Phys. Rev. E
- Scale-free geometry in OO programs. Commun. ACM
- Power-laws in a large object-oriented software system. IEEE Trans. Softw. Eng.
- Multi-granularity evolution analysis of software using complex network theory. J. Syst. Sci. Complex.
- A hybrid set of complexity metrics for large-scale object-oriented software systems. J. Comput. Sci. Tech.
- Generalized cores. Adv. Data Anal. Classif.
- Using the k-core decomposition to analyze the static structure of large-scale software systems. J. Supercomput.
- Research on hierarchy of large-scale software macro-topology base on k-core. Chin. J. Electron.
- Extraction and analysis of crucial fraction in software networks. Int. J. Softw. Eng. Knowl. Eng.
- Measuring structural quality of object-oriented softwares via bug propagation analysis on weighted software networks. J. Comput. Sci. Tech.
- A k-shell decomposition method for weighted networks. New J. Phys.
- Intent-based extensible real-time PHP supervision framework. IEEE Trans. Inf. Forensics Secur.
Bing Li received his Ph.D., M.S., and B.A. degrees from Huazhong University of Science and Technology, China, in 2003, 1997 and 1990 respectively, all in computer science. He is presently a Professor and Ph.D. supervisor in International School of Software and Research Center for Complex Network at Wuhan University. He is also a senior member of China Computer Federation (CCF) and a member of ACM. His main research interests include requirements engineering, cloud computing, complex network, and semantic web service.
Jing Liu is now an associate professor of State Key Laboratory of Software Engineering at Wuhan University. She is a member of China Computer Federation (CCF) and ACM. She received the Ph.D. degree from Wuhan University in 2007. Her current research interests include software metrics, software evolution and the interdisciplinary research between software engineering and complex networks.
Yutao Ma is now an associate professor of State Key Laboratory of Software Engineering at Wuhan University. He is a member of China Computer Federation (CCF) and ACM. He received the Ph.D. degree from Wuhan University in 2007. His current research interests include software metrics, software evolution and the interdisciplinary research between software engineering and complex networks.
Bo Hu is presently a researcher in Kingdee Research, Kingdee International Software Group Co. Ltd. He received his Ph.D. degree from State Key Laboratory of Software Engineering at Wuhan University, China, in 2011. His current research interests include software metrics, cloud computing, and complex networks.