Semi-automatic architectural pattern identification and documentation using architectural primitives

https://doi.org/10.1016/j.jss.2014.12.042

Highlights

  • Documentation of architectural patterns based on architectural primitives.

  • DSL-based documentation of pattern instances.

  • DSL-based pattern catalog.

  • Supports traceability between all artifacts.

Abstract

In this article, we propose an interactive approach for the semi-automatic identification and documentation of architectural patterns based on a domain-specific language. To address the rich concepts and variations of patterns, we first propose to support pattern description through architectural primitives. These are primitive abstractions at the architectural level that can be found in realizations of multiple patterns, and software architects can use them to annotate patterns during software architecture documentation or reconstruction. Second, using these annotations, our approach automatically suggests possible pattern instances based on a reusable catalog of patterns and their variants. Once a pattern instance has been documented, the annotated component models and the source code are automatically checked for consistency, and traceability links are automatically generated. To study the practical applicability and performance of our approach, we have conducted three case studies on existing, non-trivial open source systems.

Introduction

During maintenance and evolution of a software system, a deep understanding of the system’s architecture is essential. This knowledge about a system’s architecture tends to erode over time (Jansen et al., 2007) or even get lost. In a recent study, Rost et al. (2013) found that architecture documentation is frequently outdated, updated only after long delays, and inconsistent in detail and form. They also found that developers prefer interactive (navigable) documentation to static documents. This matches our personal experience as well as that of others. For instance, our colleague Neil Harrison shared the following story from his experiences with large-scale industrial systems (shortened): “Once upon a time I worked on a large system that was already a few years old. It had a well-defined architecture. When I started, I was given copies of three or four documents that described the architecture. In addition, I watched several videotapes in which the architects described the architecture. As a result, I gained a good understanding of the architecture of the system. After a few years, I left the project to work on other things. But several years later I returned. The system was still being used and was under active development. Of course, it had changed greatly to add new capabilities and support changes in technology. Underneath it all, the original architecture was largely intact, but it was much more obscure. I wanted to refresh my architectural memory, so I asked around for the original memos and videotapes. Nobody had even heard of them. Critical architectural knowledge had been lost. People were actually afraid to change the original code, because they did not understand how it worked.”

Software architecture documentation or, where architectural knowledge has been lost, software architecture reconstruction (Ducasse and Pollet, 2009) techniques can be used to (re)establish proper architectural documentation of a software system. An essential part of today’s architectural knowledge is information about the patterns used in a system’s architecture. Patterns can be seen as building blocks for the composition of a system’s architecture (Beck and Johnson, 1994; Buschmann et al., 1996). This holds especially for architectural patterns or styles, which describe a system’s fundamental structure and behavior (Lange and Nakamura, 1995). A considerable number of software architecture reconstruction approaches support software pattern identification (Beck and Johnson, 1994; Bergenti and Poggi, 2000; Shull et al., 1996). Most of these approaches (see e.g. Bergenti and Poggi, 2000; Heuzeroth et al., 2003; Krämer and Prechelt, 1996; Philippow et al., 2003) focus on automatically detecting design patterns in the source code. Such pattern identification approaches are often restricted to the design patterns identified by Gamma et al. (1995) (GoF patterns). Architectural patterns, in contrast, convey broader information about a system’s architecture, as they are usually described at a larger scale than GoF patterns.

There are a number of important problems in automatic pattern identification in general, and in architectural pattern identification in particular. Existing approaches often focus only on identifying a system’s design patterns and do not consider the documentation of the reconstructed patterns or the future evolution of the system, both of which are just as essential as identifying an architectural pattern.

In addition, architectural patterns are often much harder to detect directly in the source code than GoF design patterns, because the implementation of a pattern often involves a large number of classes and the variations between different instances of a pattern are very large. As a consequence of the large number of classes involved, the search space for these patterns is potentially huge; it grows with every class and increases execution times (Ducasse and Pollet, 2009).
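To give a feel for this growth, the following sketch counts the ordered role assignments that a naive matcher would have to try if it filled every pattern role with a distinct element. The brute-force model, the function name, and the element counts are assumptions made purely for illustration; they are not measurements from any of the cited approaches.

```python
import math


def candidate_assignments(num_elements: int, num_roles: int) -> int:
    """Count ordered assignments of distinct elements to pattern roles.

    Models a naive matcher that tries every way of filling `num_roles`
    roles with different elements: n! / (n - k)! candidates.
    """
    return math.perm(num_elements, num_roles)


# Hypothetical sizes: a pattern with 3 roles searched at class level
# (1000 classes) versus in a coarser view with 20 elements.
class_level = candidate_assignments(1000, 3)
coarse_level = candidate_assignments(20, 3)

print(class_level)   # 997002000
print(coarse_level)  # 6840
```

Real detectors prune this space considerably, but the count shows why the number of elements under consideration matters so much.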

A major problem of pattern identification is the variability in pattern implementations. Very few pattern identification approaches consider pattern variations at all, and those that do usually focus on GoF design patterns only (Wendehals, 2003; Wendehals et al., 2001). For instance, hardly any system strictly adheres to the Layers pattern (Buschmann et al., 1996) as described in the textbook, yet a huge number of systems are designed based on Layers. To give a concrete example, in the definition of the Layers pattern, a layer only has access to the functionality provided by the layer below it. However, this rule is often violated for cross-cutting concerns like performance, security, or logging. As a consequence, many layered architectures contain parts that do not strictly adhere to the Layers pattern. In addition, the Layers pattern suggests but does not enforce clean interfaces between the layers. For these reasons, it is hard to automatically detect architectural patterns like Layers.
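The Layers example can be made concrete with a small checker. This is a minimal sketch, assuming a component-to-layer mapping and a list of dependencies as input; the function, the component names, and the rule that same-layer access is allowed are illustrative assumptions, not part of the approach described in this article.

```python
def layer_violations(layer_of, dependencies, cross_cutting=frozenset()):
    """Return dependencies that break the strict Layers rule.

    layer_of: maps each component to its layer index (0 is the top layer).
    dependencies: iterable of (source, target) component pairs.
    cross_cutting: targets exempt from the rule (a tolerated variant).

    Assumption: a component may use its own layer or the layer directly below.
    """
    violations = []
    for source, target in dependencies:
        if target in cross_cutting:
            continue  # tolerated deviation, e.g. logging used from anywhere
        delta = layer_of[target] - layer_of[source]
        if delta not in (0, 1):
            violations.append((source, target))
    return violations


# Hypothetical three-layer system with a logging component at the bottom.
layer_of = {"ui": 0, "service": 1, "persistence": 2, "logging": 2}
dependencies = [
    ("ui", "service"),
    ("service", "persistence"),
    ("ui", "persistence"),       # skips a layer
    ("ui", "logging"),           # cross-cutting access
    ("persistence", "service"),  # upward access
]

strict = layer_violations(layer_of, dependencies)
relaxed = layer_violations(layer_of, dependencies, cross_cutting={"logging"})
print(strict)   # [('ui', 'persistence'), ('ui', 'logging'), ('persistence', 'service')]
print(relaxed)  # [('ui', 'persistence'), ('persistence', 'service')]
```

A detector that only accepts the strict form would reject this system outright, while one that knows about the cross-cutting variant reports just the two remaining deviations for review.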

Another problem of automatic pattern identification is that the accuracy of the approaches is often insufficient. For this reason, some approaches treat the pattern instances they find merely as candidates (Wendehals, 2003). However, the likelihood of false positives increases with system size and can lead to precision values of around 40% (Krämer and Prechelt, 1996), which means that 60% of the reported pattern instances are false positives. Reviewing the reported pattern instances then requires substantial manual effort.
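The relation between precision and review effort can be spelled out with a short calculation. The candidate counts below are hypothetical and chosen only to reproduce the 40% figure; they are not taken from the cited study.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of reported pattern instances that are real instances."""
    reported = true_positives + false_positives
    if reported == 0:
        raise ValueError("no pattern instances were reported")
    return true_positives / reported


# Hypothetical detector output: 200 reported candidates, 80 of them correct.
reported_candidates = 200
correct_candidates = 80
wrong_candidates = reported_candidates - correct_candidates

p = precision(correct_candidates, wrong_candidates)
print(f"precision: {p:.0%}")                       # precision: 40%
print(f"candidates to reject: {wrong_candidates}")  # candidates to reject: 120
```

In this made-up scenario a reviewer has to inspect all 200 candidates to discard the 120 incorrect ones.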

In the light of the aforementioned problems, we formulated the following research questions:

  • RQ1

    How far can a semi-automatic architectural pattern approach go toward the goal of identifying the patterns in architectural reconstruction?

  • RQ2

    How far can a semi-automatic architectural pattern approach go toward the goal of maintaining documented patterns during the further evolution of a reconstructed architecture?

  • RQ3

    To what extent are the concepts and tools applicable to existing real-life systems?

  • RQ4

    How efficient are the actual pattern instance matching algorithms that are based on primitives?

  • RQ5

    Are primitives and an adaptable pattern catalog adequate means to handle the variability inherent to architectural patterns?

This article makes two main contributions. First, we propose a novel semi-automatic architectural pattern identification approach that tackles the aforementioned problems arising during the documentation and evolution of architectural patterns: the variability inherent to patterns, the consistency between the documented architecture and the source code, and the large number of source code artifacts related to the implementation of architectural patterns. Second, we show the approach’s feasibility in terms of tool support and study its performance, both in the context of three open source case studies. We aim to assist the software architect during the reconstruction of architectural knowledge and in the documentation of the reconstructed knowledge. After the architectural knowledge has been reconstructed and documented with our approach, we support the software architect in keeping the resulting architectural documentation in sync with the source code of the application. As Clements et al. (2002) state, a strong architecture is only useful if it is properly documented so that others can quickly find information about it.

Our proposed solution is an interactive approach for the semi-automatic identification and documentation of architectural patterns based on a set of domain-specific languages (DSLs). It consists of the following main components:

  • Architecture Abstraction DSL: In our main DSL, the Architecture Abstraction DSL, the software engineers can semi-automatically create an abstraction of an architectural component view based on design models or during architecture reconstruction. To address the rich concepts and variations of patterns, we propose to use architectural primitives (Zdun and Avgeriou, 2005) that can be leveraged by software engineers for pattern annotation during software architecture documentation and reconstruction. Architectural primitives are primitive abstractions at the architectural level (i.e., defined for components, connectors, and other architectural abstractions) that can be found in realizations of multiple patterns.

  • Pattern Instance Documentation Tool: Using the architectural primitive annotations, our approach provides a Pattern Instance Documentation Tool which automatically suggests possible pattern instances based on the architectural component view of a system and a pattern catalog.

  • Pattern Catalog DSL: The pattern catalog contains templates of the architectural patterns to be identified. It is customizable, reusable and integrates support for pattern variability. Our approach leads to a reduced search space for patterns, as we search for patterns only in the created architectural component view instead of the source code.

  • Pattern Instance DSL: Identified pattern instances are documented using the Pattern Instance DSL, which uses the artifacts defined in the Architecture Abstraction DSL and the Pattern Catalog DSL to permanently store the pattern instance documentation.
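How these components could interact is sketched below in Python rather than in the DSLs themselves, whose syntax is not shown in this excerpt. The data structures, the primitive labels, the catalog entry, and the brute-force enumeration are all assumptions made for illustration; this is not the matching algorithm of the Pattern Instance Documentation Tool.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class PatternVariant:
    """One catalog template: a pattern variant with roles and required primitives."""
    pattern: str
    variant: str
    roles: tuple  # tuple of (role_name, frozenset_of_required_primitives)


def suggest_instances(component_view, catalog):
    """Suggest role assignments whose components carry the required primitives.

    component_view: maps component name -> set of primitive annotations.
    Brute force over ordered component tuples; fine for a small view only.
    """
    suggestions = []
    for entry in catalog:
        role_names = [name for name, _ in entry.roles]
        for candidate in permutations(sorted(component_view), len(entry.roles)):
            if all(required <= component_view[component]
                   for component, (_, required) in zip(candidate, entry.roles)):
                suggestions.append(
                    (entry.pattern, entry.variant, dict(zip(role_names, candidate))))
    return suggestions


# Hypothetical annotated component view and a one-entry catalog.
component_view = {
    "GameClient": {"Callback"},
    "GameServer": {"Callback", "Grouping"},
    "Logger": set(),
}
catalog = [
    PatternVariant(
        pattern="Client-Server",
        variant="with callbacks",
        roles=(("client", frozenset({"Callback"})),
               ("server", frozenset({"Callback", "Grouping"}))),
    ),
]

for suggestion in suggest_instances(component_view, catalog):
    print(suggestion)
# ('Client-Server', 'with callbacks', {'client': 'GameClient', 'server': 'GameServer'})
```

The suggestions are only candidates: in a semi-automatic setting the architect would confirm or reject each one before it is stored as a documented pattern instance.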

We automatically generate traceability links between the architectural abstractions and the source code (more specifically, the automatically generated class models of the source code), the architectural abstractions and the selected pattern instances, and the pattern instances and the pattern catalog. When artifacts are changed, the traceability links are used to automatically check the consistency of all the artifacts. Automated consistency checking aids the software engineers during the incremental architecture documentation process, when new artifacts are identified and documented. For example, the system automatically detects when the pattern catalog is used to customize an existing pattern and these changes cause an existing instance of this pattern to be no longer valid. The consistency checks are used throughout the evolution of the documented system and report any occurring violations within seconds.
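The kind of check described here can be illustrated with a small sketch. It assumes that the component view, the class model, the traceability links, a documented pattern instance, and the catalog requirements are available as plain Python collections; all names and messages are hypothetical and do not reflect the actual tool.

```python
def check_consistency(component_view, class_model, trace_links, instance, catalog_roles):
    """Return a list of violations; an empty list means all artifacts agree.

    component_view: component name -> set of primitive annotations.
    class_model: set of class names extracted from the source code.
    trace_links: component name -> set of traced class names.
    instance: role name -> component name (a documented pattern instance).
    catalog_roles: role name -> set of primitives the catalog requires.
    """
    violations = []
    # Architectural abstraction -> class model links.
    for component, classes in sorted(trace_links.items()):
        for cls in sorted(classes):
            if cls not in class_model:
                violations.append(f"{component}: traced class '{cls}' not in class model")
    # Pattern instance -> component view and pattern instance -> catalog links.
    for role, component in sorted(instance.items()):
        if component not in component_view:
            violations.append(f"role '{role}': component '{component}' not in component view")
            continue
        missing = catalog_roles[role] - component_view[component]
        if missing:
            violations.append(f"role '{role}': '{component}' lacks {sorted(missing)}")
    return violations


# Hypothetical artifacts for a documented Client-Server instance.
component_view = {"GameClient": {"Callback"}, "GameServer": {"Callback", "Grouping"}}
class_model = {"client.Main", "server.Main"}
trace_links = {"GameClient": {"client.Main"}, "GameServer": {"server.Main"}}
instance = {"client": "GameClient", "server": "GameServer"}
catalog_roles = {"client": {"Callback"}, "server": {"Callback", "Grouping"}}

print(check_consistency(component_view, class_model, trace_links, instance, catalog_roles))
# []

# Customizing the catalog entry invalidates the existing instance.
customized = {"client": {"Callback", "Shield"}, "server": {"Callback", "Grouping"}}
print(check_consistency(component_view, class_model, trace_links, instance, customized))
# ["role 'client': 'GameClient' lacks ['Shield']"]
```

The second call mirrors the scenario from the text: a change to the pattern catalog causes a previously valid instance to be reported as inconsistent.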

This article is structured as follows: In Section 2 we briefly explain architectural patterns and architectural primitives as our background. We give an overview of our approach in Section 3 and present it in detail in Section 4. In Section 5 we present three case studies of open source systems to which we have applied our approach to test its applicability. As it is crucial that our approach works smoothly in the working environment of the software designer during software design and development, we evaluate the execution time of our prototype in Section 6. In Section 7 we discuss lessons learned from the case studies and the performance evaluation as well as limitations of our approach. We compare our approach with related work in Section 8 and conclude in Section 9.

Section snippets

Background: patterns and architectural primitives

A significant aspect of documenting software architectures is the representation of architectural patterns (Avgeriou and Zdun, 2005; Buschmann et al., 1996) and the closely related architectural styles (Shaw and Garlan, 1996). In general, a pattern is a problem-solution pair in a given context. A pattern does not only document “how” a solution solves a problem but also “why” it is solved, i.e., the rationale behind this particular solution. Architectural patterns help

Approach overview

Fig. 1 shows the most important steps and tools in our approach. The central tool is the Pattern Instance Documentation Tool. Its goal is to document architectural pattern instances based on an architectural component and connectors view of a system that is annotated with architectural primitives.

The tool is semi-automatic, as it also receives manually edited inputs developed using the Architecture Abstraction DSL. In our previous work (Haitzer and Zdun, 2012) we developed a basic version of

Detailed description of the approach

Our approach introduces a reusable pattern catalog that contains architectural patterns, an architectural component view that is annotated with architectural primitives, and pattern instances based on the pattern catalog and the architectural component view. In this section we describe the concepts and languages used for realizing these different parts of our approach in more detail. To illustrate our approach, we use a running example based on the open source game FreeCol (The Freecol Team,

Case studies

In this section we present three open source system case studies to better illustrate our approach and to study its practical applicability to architecture reconstruction and documentation for existing, non-trivial software systems. The case studies are also used as a basis for the performance evaluations in Section 6. In the first case study we documented the architecture of the open source game FreeCol, which was partly presented as a running example before. In the

Performance evaluation of the pattern instance documentation tool

For the practical applicability of our approach it is crucial that it works smoothly in the working environment of the software designer during software design and development. To test the applicability of our approach in practice we measured the time it takes our Pattern Instance Documentation Tool to find pattern instances for our case studies and in five larger (with respect to number of components) synthetic component models. For the synthetic component models we used the basic structure of

Discussion

In this section we briefly discuss the lessons learned from the three case studies and the performance evaluation.

Related work

In this section we compare our work to related approaches that focus on modeling or detecting patterns or other approaches that utilize patterns during software architecture reconstruction or software architecture evolution. In Table 5 we give an overview of the related work discussed in this section and also provide a short comparison of this related work. Most of the related works focus on automatic design pattern identification while only a limited number focuses on finding architectural

Conclusion

In this article we have presented an approach for the semi-automatic documentation of architectural patterns based on architectural primitives. While other approaches automatically detect design patterns in the source code, we require the architect to semi-automatically create an abstraction of an architectural component view that is annotated with architectural primitive information. This raises the abstraction level of the input on which we automatically search for patterns. It also reduces

Thomas Haitzer is a research assistant with the Software Architecture Group at the University of Vienna and is currently working on his doctoral thesis in the area of model-based software architecture recovery and evolution. Before that, he completed his Master’s studies at the Vienna University of Technology.

References (70)

  • Mackworth, A.K. Consistency in networks of relations. Artif. Intell. (1977)
  • Apache CXF, 2011....
  • Alnusair, A. et al. Automatic recognition of design motifs using semantic conditions. Proceedings of the 28th Annual ACM Symposium on Applied Computing (2013)
  • Arevalo, G. et al. Detecting implicit collaboration patterns. Proceedings of the 11th Working Conference on Reverse Engineering (2004)
  • Avgeriou, P. et al. Architectural patterns revisited – A pattern language. Proceedings of the 10th European Conference on Pattern Languages of Programs (EuroPlop 2005) (2005)
  • Balanyi, Z. et al. Mining design patterns from C++ source code. Proceedings of the International Conference on Software Maintenance, ICSM 2003 (2003)
  • Beck, K. et al. Patterns generate architectures. Proceedings of the 8th European Conference on Object-Oriented Programming (1994)
  • Bergenti, F. et al. Improving UML designs using automatic design pattern detection. Proceedings of the 12th International Conference on Software Engineering and Knowledge Engineering (SEKE 2000) (2000)
  • Buschmann, F. et al. Pattern-Oriented Software Architecture: A System of Patterns (1996)
  • Clements, P. et al. Documenting Software Architectures: Views and Beyond (2002)
  • Curry, E. et al. Extending message-oriented middleware using interception. 3rd International Workshop on Distributed Event-Based Systems (DEBS’04) (2004)
  • De Lucia, A. et al. Improving behavioral design pattern detection through model checking. 2010 14th European Conference on Software Maintenance and Reengineering (CSMR) (2010)
  • von Detten, M. Towards systematic, comprehensive trace generation for behavioral pattern detection through symbolic execution. Proceedings of the 10th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools (2011)
  • Di Penta, M. et al. Discovery of SOA patterns via model checking. 2nd International Workshop on Service Oriented Software Engineering: In Conjunction with the 6th ESEC/FSE Joint Meeting (2007)
  • Ducasse, S. et al. Software architecture reconstruction: a process-oriented taxonomy. IEEE Trans. Software Eng. (2009)
  • Eden, A.H. et al. LePUS – Symbolic logic modeling of object oriented architectures: a case study. Second Nordic Workshop on Software Architecture – NOSA’99 (1999)
  • Fowler, M. Patterns of Enterprise Application Architecture (2002)
  • Freeman, E. et al. Head First Design Patterns (2004)
  • Gamma, E. et al. Design Patterns: Elements of Reusable Object-Oriented Software (1995)
  • Guéhéneuc, Y.G. et al. Using explanations for design-patterns identification. Proceedings of the 1st IJCAI Workshop on Modeling and Solving Problems with Constraints (2001)
  • Guo, G.Y. et al. A software architecture reconstruction method. Proceedings of the TC2 First Working IFIP Conference on Software Architecture (WICSA1) (1999)
  • Haitzer, T. et al. DSL-based support for semi-automated architectural component model abstraction throughout the software lifecycle. Proceedings of the 8th International ACM SIGSOFT Conference on Quality of Software Architectures (2012)
  • Harris, D.R. et al. Reverse engineering to the architectural level. Proceedings of the 17th International Conference on Software Engineering (1995)
  • Heitmeyer, C.L. et al. Automated consistency checking of requirements specifications. ACM Trans. Softw. Eng. Methodol. (1996)
  • Heuzeroth, D. et al. Automatic design pattern detection. Proceedings of the 11th IEEE International Workshop on Program Comprehension (2003)
  • Jansen, A. et al. Tool support for architectural decisions. Proceedings of the Sixth Working IEEE/IFIP Conference on Software Architecture (2007)
  • Javed, M.A. et al. The supportive effect of traceability links in architecture-level software understanding: two controlled experiments. 2014 IEEE/IFIP Conference on Software Architecture, WICSA 2014, 7–11 April 2014, Sydney, Australia (2014)
  • Kaczor, O. et al. Efficient identification of design patterns with bit-vector algorithm. Proceedings of the 10th European Conference on Software Maintenance and Reengineering, CSMR 2006 (2006)
  • Kamal, A.W. et al. Modeling architectural patterns’ behavior using architectural primitives. Proceedings of the 2nd European Conference on Software Architecture (2008)
  • Keller, R.K. et al. Pattern-based reverse-engineering of design components. Proceedings of the 21st International Conference on Software Engineering (1999)
  • Kitchenham, B.A. et al. Preliminary guidelines for empirical research in software engineering. IEEE Trans. Softw. Eng. (2002)
  • Kleinberg, J.M. Authoritative sources in a hyperlinked environment. J. ACM (1999)
  • Krämer, C. et al. Design recovery by automated search for structural design patterns in object-oriented software. Proceedings of the 3rd Working Conference on Reverse Engineering (WCRE ’96) (1996)
  • Lange, D.B. et al. Interactive visualization of design patterns can help in framework understanding. Proceedings of the Tenth Annual Conference on Object-Oriented Programming Systems, Languages, and Applications (1995)
  • Lungu, M. et al. Package patterns for visual architecture recovery. Proceedings of the Conference on Software Maintenance and Reengineering (2006)

Uwe Zdun is a full professor for software architecture at the Faculty of Computer Science, University of Vienna. Before that, he worked as an assistant professor at the Vienna University of Technology and the Vienna University of Economics respectively. He received his doctoral degree from the University of Essen in 2002. His research focuses on software architecture, software patterns, modeling of complex software systems, service-oriented systems, domain-specific languages, model-driven development, and empirical software engineering in these areas. Uwe has published more than 130 peer-reviewed articles and is co-author of the professional books “Remoting Patterns – Foundations of Enterprise, Internet, and Realtime Distributed Object Middleware,” “Process-Driven SOA – Proven Patterns for Business-IT Alignment,” and “Software-Architektur.” He has gained significant experiences in leading scientific work and has participated in numerous R & D projects, including ARCS, CONTAINER, INDENICA, COMPAS, S-CUBE, TPMHP, Infinica, SCG, and Sembiz. Uwe is editor of the journal Transactions on Pattern Languages of Programming (TPLoP) published by Springer, and Associate Editor-in-Chief for design and architecture for the IEEE Software magazine.