The WGB method to recover implemented architectural rules

doi:10.1016/j.infsof.2018.06.012

Information and Software Technology

Volume 103, November 2018, Pages 125-137

https://doi.org/10.1016/j.infsof.2018.06.012

Abstract

Context: The identification of architectural rules, which specify allowed dependencies among architectural modules, is a key challenge in software architecture recovery. Existing approaches either retrieve a large set of rules, compromising their practical use, or are limited to supporting the understanding of such rules, which are manually recovered.

Objective: To propose and evaluate a method to recover architectural rules, focusing on those implemented in the source code, which may differ from planned or conceptual rules.

Method: We propose the WGB method, which analyzes dependencies among architectural modules as a graph, adds weights that correspond to the proposed module dependency strength (MDS) metric, and identifies the set of implemented architectural rules by solving a mathematical optimization problem. We evaluated our method with a case study and an empirical study that compared the rules extracted by the method with the conceptual architecture and source code dependencies of six systems. These comparisons considered the efficiency and effectiveness of our method.

Results: Regarding efficiency, our method took 45.55 s to analyze the largest system evaluated. Regarding effectiveness, our method captured package dependencies as extracted rules with an average reduction of 87.6% in the representation of this information. Using allowed architectural dependencies as a reference point (but not a gold standard), the recovered rules achieved a precision of 37.1% and a recall of 37.8%.

Conclusion: Our empirical evaluation shows that the implemented architectural rules recovered by our method consist of abstract representations of (a large number of) module dependencies, providing a concise view of dependencies that can be inspected by developers to identify occurrences of architectural violations and undocumented rules.

Introduction

Software architecture recovery has been extensively investigated to support the development of software systems, which often have missing or outdated architectural documentation [1]. The recovered information helps developers understand the software structure as well as the rules that govern the interaction among its modules. The lack of (updated) documented knowledge regarding the architecture causes disorganized software evolution, which leads to major maintenance problems [2], [3]. For example, the introduction of architectural violations leads to the known problems of architecture drift and architecture erosion [4]. Moreover, the inspection of implemented architectural rules may reveal unplanned rules introduced by developers, which typically remain undocumented [5]. Consequently, retrieving this kind of architectural information mitigates the problem of knowledge vaporization [6].

The support that architecture recovery approaches provide varies in nature. They can, for example, identify architectural modules [7], [8], [9], [10] by clustering software elements, e.g. classes, when the implemented software structure is inconsistent with the code. With respect to architectural rules, visualizations [11], [12] have been developed to help developers understand rules that are implemented in the code. Moreover, other solutions aim to automatically recover implemented rules, but they are restricted to limited scenarios. For example, rules mined by the PR-Miner [13] tool are limited to coding rules among procedures and functions implemented with the procedural paradigm, and the approach proposed by Hora et al. [14], [15] focuses only on patterns related to external APIs.

In this paper, we focus on recovering rules that specify allowed dependencies between architectural modules. The recovered rules are the implemented ones, i.e. those reflecting what is actually present in the code, which may be inconsistent with the rules in developers’ minds or in the documentation. Our proposal is a three-step method, named the weighted-graph-based (WGB) method, which identifies implemented rules by constructing a weighted graph from source code dependencies and then analyzing it. In short, for every module pair, we calculate the module dependency strength (MDS) metric, which represents how much a module depends on another, considering its sub-modules and surrounding modules. Our proposed metric is therefore not limited to counting method calls, but takes into account several factors, such as the number of sibling modules and usage ratios. The metric values are used as the weights of a graph in which nodes represent modules and arrows represent dependencies. Some arrows are then removed from this graph based on a pairwise analysis of sets of modules, and an optimization problem is solved to yield the recovered implemented architectural rules.
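The graph construction described above can be illustrated with a minimal Python sketch. It assumes class-level dependencies given as (source class, target class) pairs and a hypothetical `module_of` mapping that takes the package prefix of a dotted class name as the module. The weight used here is a simple usage ratio (the share of a module's outgoing dependencies that target another module); it is only a stand-in for illustration and is not the MDS metric, whose definition is not given in this excerpt. The arrow-removal and optimization steps are omitted.

```python
from collections import Counter, defaultdict


def module_of(cls: str) -> str:
    """Hypothetical mapping: the module is the package prefix of a dotted class name."""
    return cls.rsplit(".", 1)[0]


def build_weighted_graph(class_deps):
    """Aggregate class-level dependencies into a weighted module graph.

    The weight is an illustrative usage ratio, NOT the MDS metric.
    """
    counts = Counter()    # dependencies per (source module, target module)
    outgoing = Counter()  # inter-module dependencies per source module
    for src, dst in class_deps:
        src_mod, dst_mod = module_of(src), module_of(dst)
        if src_mod == dst_mod:
            continue  # intra-module dependencies do not create arrows
        counts[(src_mod, dst_mod)] += 1
        outgoing[src_mod] += 1

    graph = defaultdict(dict)
    for (src_mod, dst_mod), n in counts.items():
        graph[src_mod][dst_mod] = n / outgoing[src_mod]
    return dict(graph)


# Hypothetical class-level dependencies of a small layered system.
deps = [
    ("app.ui.LoginView", "app.service.AuthService"),
    ("app.ui.LoginView", "app.service.UserService"),
    ("app.ui.MainView", "app.model.User"),
    ("app.service.AuthService", "app.model.User"),
    ("app.service.AuthService", "app.service.UserService"),
]
graph = build_weighted_graph(deps)
```

In this toy input, `app.ui` points to `app.service` with a larger weight than to `app.model`, while `app.model` has no outgoing arrows. A real implementation would replace the ratio with the MDS computation and continue with the remaining two steps of the method.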

Definitions and problem

There are many ways of representing a software architecture and specifying architectural rules. In our work, we assume that the architecture is decomposed into modules, which may be further refined into sub-modules, leading to a module hierarchy in the form of a tree. This is often the way that modules are implemented or expressed in a documented software architecture. Each (sub-)module may contain software elements, e.g. classes. Architectural rules explicitly specify dependencies between

WGB method

Our WGB method chooses a set of architectural rules to represent an implemented software architecture, taking as input a given software module organization (e.g., package structure) and dependencies among module elements (e.g., classes). These recovered rules are a coarse-grained representation of the implemented architecture and constitute an architectural view of the system. Our method is composed of three sequential steps: (i) calculation of a metric that captures the dependency strength between

Evaluation

Our WGB method was designed based on intuitive reasoning regarding the choice of adequate granularity to represent implemented architectural rules and on the analysis of different alternatives in many distinct hypothetical scenarios. To provide evidence that our method recovers an adequate set of architectural rules, we first present a case study and then an empirical evaluation of the method. The case study consists of the use of the method to recover rules from an existing system and a

Discussion

In addition to the discussions made above, we present in this section further insights that emerged from our proposal and its evaluation, including threats to validity of our empirical study.

Undocumented rules vs. architectural violations. Our results show many recovered implemented rules that are inconsistent with conceptual rules, i.e. mismatch cases, which correspond to 48.8% of the rules, on average. Rules in this category occur due to two causes: (i) an undocumented rule, which should be

Related work

Much work has been done to identify and reduce differences between the conceptual and implemented software architectures. These can be classified into two main groups.

The first group consists of approaches that perform architecture conformance, e.g. [16], [17], [23], [24], [25], [26], [27], [28]. These approaches require the specification of the software architecture that is compared to what is implemented in the code. Some of them include sophisticated means of specifying architectural

Conclusion

The lack of understanding of architectural rules that are implemented in the code is a key barrier to healthy software development and evolution, leading to many maintainability problems. In this paper, we proposed a novel method that automatically recovers implemented architectural rules, named the WGB method. Our method does not require the specification of any threshold or system-specific customizations. Our method includes the calculation of a proposed metric, module dependency strength

Acknowledgment

Vanius Zapalowski would like to thank CAPES for research grant 1311715. Ingrid Nunes would like to thank CNPq (ref. 303232/2015-3), CAPES (ref. 7619-15-4), and Alexander von Humboldt (ref. BRA 1184533 HFSTCAPES-P) for research grants.

References (42)

A. Hora et al., Automatic detection of system-specific conventions unknown to developers, J. Syst. Softw. (2015)
R. van Solingen, V. Basili, G. Caldiera, H.D. Rombach, Goal Question Metric (GQM) Approach, John Wiley & Sons, Inc....
R.A. Bittencourt et al., Improving automated mapping in reflexion models using information retrieval techniques, 2010 17th Working Conference on Reverse Engineering (2010)
C. Stringfellow et al., Comparison of software architecture reverse engineering methods, Inf. Softw. Technol. (2006)
C.Y. Chong et al., Efficient software clustering technique using an adaptive and preventive dendrogram cutting approach, Inf. Softw. Technol. (2013)
N. Sangal et al., Using dependency models to manage complex software architecture, ACM SIGPLAN Notices (2005)
R. Tvedt et al., Does the code match the design? A process for architecture evaluation, International Conference on Software Maintenance, 2002. Proceedings. (2002)
L. Xiao, Quantifying architectural debts, Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering (2015)
Y. Lin et al., Interactive and guided architectural refactoring with search-based recommendation, Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2016)
D.E. Perry et al., Foundations for the study of software architecture, ACM SIGSOFT Software Engineering Notes (1992)
C. Riva et al., Combining static and dynamic views for architecture reconstruction, Proceedings of the Sixth European Conference on Software Maintenance and Reengineering (2002)
J. Bosch, Software architecture: the next step, Software Architecture: First European Workshop, EWSA 2004, St Andrews, UK, May 21–22, 2004. Proceedings (2004)
A. Corazza et al., Investigating the use of lexical information for software system clustering, 2011 15th European Conference on Software Maintenance and Reengineering (2011)
Y. Cai et al., Leveraging design rules to improve software architecture recovery, Proceedings of the 9th international ACM Sigsoft conference on Quality of software architectures (2013)
V. Zapalowski et al., Revealing the relationship between architectural elements and source code characteristics, Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (2014)
T. Lutellier et al., Comparing software architecture recovery techniques using accurate dependencies, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering (2015)
S. Huynh et al., Automatic modularity conformance checking, Proceedings of the 13th international conference on Software engineering - ICSE ’08 (2008)
R. Paiva et al., Exploring the combination of software visualization and data clustering in the software architecture recovery process, Proceedings of the 31st Annual ACM Symposium on Applied Computing - SAC ’16 (2016)
Z. Li et al., PR-Miner, Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on Foundations of software engineering - ESEC/FSE-13 (2005)
A. Hora et al., Mining system specific rules from change patterns, 2013 20th Working Conference on Reverse Engineering (WCRE) (2013)
N. Medvidovic et al., A classification and comparison framework for software architecture description languages, IEEE Trans. Softw. Eng. (2000)
