A practical pattern recovery approach based on both structural and behavioral analysis

https://doi.org/10.1016/j.jss.2003.11.018

Abstract

While the merit of using design patterns is clear for forward engineering, design pattern recovery can also benefit program understanding and reverse engineering. In this paper, we present a practical approach that enlarges the recoverable scope and improves the precision of pattern recovery. To specify both the structural and the behavioral aspects of design patterns, we combine traditional predicate logic with Allen's interval-based temporal logic as our theoretical foundation. The formal specifications can be conveniently converted into Prolog representations to support pattern recovery. To illustrate how design patterns are specified and recovered in our approach, we take one example from each category of design patterns. Moreover, we give a taxonomy of design patterns based on the analysis in our approach to show its applicable scope. To validate our approach, we have developed a tool named PRAssistor and analyzed two well-known open source frameworks. The experimental results show that most of the patterns addressed in our taxonomy have been recovered. Besides the larger recoverable scope, the recovery precision of our approach is much higher than that of other approaches. Furthermore, we consider that our approach and tool could be extended to support “Debug at Design Level” and “Pattern-Driven Refactoring”.

Introduction

As an emergent technology, design patterns have received more and more attention from researchers and practitioners. A pattern is a recurring solution to a common problem in a given context and system of forces (Gamma et al., 1994). Patterns help us document design decisions and rationale, reuse the wisdom and experience of master practitioners, and form a shared vocabulary for problem-solving discussions.

While the merit of using design patterns is clear for forward engineering, design pattern recovery can also benefit program understanding and reverse engineering. When people work with existing code written by somebody else, reconstructing the intended architecture can be an enormous task. Software documentation is often out of date or nonexistent, and key developers may have left the project. The most accurate and complete source of information is the source code, but there might be hundreds of files. Traditional Computer-Aided Software Engineering (CASE) tools, such as Rational Rose, provide only limited reverse engineering facilities that present simple visualizations based on class diagrams; they capture neither the run time behavior of the system nor the high level blocks of collaboration among class instances. Design patterns are micro-architectures and high level building blocks. Recovering pattern instances in existing systems could therefore improve their understandability and maintainability, because larger chunks can be understood as a whole.

Several studies have addressed the recovery of design patterns. However, the patterns recoverable by these approaches are very limited, and most of them belong to the structural category (Antoniol et al., 2001; Brown, 1996; Kramer and Prechelt, 1996). Typically, these approaches produce many false positives. Some researchers have tried to detect more pattern instances by not relying strictly on pattern structure (Keller et al., 1999; Niere et al., 2002; Seemann and von Gudenberg, 1998). However, their work is based on the analysis of source code files, and detecting and deciphering object interactions in source code is not easy: polymorphism makes it difficult to determine which method is actually executed at run time, and inheritance means that each object in a running system exhibits the behavior defined not only in its own class but also in each of its superclasses.
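The difficulty with polymorphism and inheritance can be made concrete with a small Java sketch. The classes below (Shape, Circle, Square, DispatchDemo) are hypothetical and exist only for illustration; they are not taken from the frameworks analyzed in this paper. At the call site the static type is always Shape, so the source text alone does not say which name() body runs, and describe() is behavior that each object inherits from its superclass.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: shows that the executed method depends on the
// runtime class of the receiver, not on the static type at the call site.
class DispatchDemo {
    static String describeAll(List<Shape> shapes) {
        StringBuilder sb = new StringBuilder();
        for (Shape s : shapes) {
            // Static type is Shape; the name() body is chosen at run time.
            sb.append(s.describe()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Shape> shapes = Arrays.<Shape>asList(new Circle(), new Square());
        System.out.println(describeAll(shapes));
    }
}

abstract class Shape {
    // Defined once in the superclass, inherited by every subclass.
    String describe() {
        return "shape:" + name();
    }

    abstract String name();
}

class Circle extends Shape {
    @Override
    String name() { return "circle"; }
}

class Square extends Shape {
    @Override
    String name() { return "square"; }
}
```

A purely static analysis of describeAll sees only a call to Shape.describe(); a run time trace would instead record which subclass actually handled each call.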

It is generally accepted that coherent specifications warrant the recovery of underlying constructs, elementary building blocks, and repeating abstractions. Some researchers have tried to use formal methods to specify design patterns (Eden et al., 1997; Eden, 2001; Florijn et al., 1997; Lauder and Kent, 1998; Smith and Stotts, 2002). However, their abstractions are difficult for average designers to work with. Moreover, they are not well suited to pattern recovery.

In this paper, we present an approach based on both structural and behavioral analysis to enlarge the recoverable scope and improve the precision of pattern recovery. In Section 2, we introduce the structural and behavioral constructs that form the foundation of our pattern specification. Some of these constructs are borrowed from UML (Booch et al., 1999; OMG, 2003). Since our focus is on supporting pattern recovery, we only consider constructs that we think are relevant and acquirable in the recovery process. Moreover, we introduce Allen's interval-based temporal logic and Hrycej's Temporal Prolog to characterize the temporal aspect of behaviors. In Section 3, we take the Singleton, Composite, and Visitor patterns as examples to illustrate how to specify design patterns so as to support the recovery of different categories of design patterns. In addition, we give a taxonomy of design pattern implementation schemas based on the analysis in our approach. In Section 4, we present the tool named PRAssistor, which has been developed according to the rationale of our approach. In the implementation of this tool, we have applied several techniques, such as pre-candidates, participants filter, and sorting clauses, to reduce the time consumed by pattern recovery. We also use this tool to analyze two well-known open source frameworks to evaluate our approach. In Section 5, we describe the promising applications of our approach and tool in “Debug at Design Level” and “Pattern-Driven Refactoring”. In Section 6, we give an overview of related work, including pattern recovery approaches and formal specifications of design patterns. Finally, we summarize the contributions and limitations of our work and outline ideas for future work. Appendix A gives the detailed definitions of structural constructs, while Appendix B gives the formal definitions of message constructs.

Fundamental constructs

Basically, a software system can be described in two aspects: structure and behavior. Although non-functional characteristics also exist, they are too difficult to capture in reverse engineering. Similarly, patterns can also be depicted in these two aspects. As these structural and behavioral constructs have been defined in UML (Booch et al., 1999; OMG, 2003), we take predicate logic combined with temporal logic as the theoretical foundation of our formal specification. For some reasons, we will

Pattern specification

It is not possible for a reverse engineering tool to “comprehend” the intents of patterns. Instead, a pattern's implementation (e.g. class structure, object interaction) might be detectable and lead to the identification of the actual pattern. Therefore, we focus on how to specify the solution part of a design pattern to facilitate pattern recovery.

While the structural aspect of design patterns has been well understood, we will focus on how to specify the behavioral aspect of design patterns.
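As a rough illustration of how interval-based temporal relations can describe behavior, the following Java sketch encodes seven of Allen's thirteen interval relations (the other six are their inverses) over method invocation intervals. The Interval class, the timestamps, and the accept/visit reading are assumptions made for this example only; they are not the formal definitions used in our specification, which is expressed in predicate logic and Prolog rather than Java.

```java
// Hypothetical demo: checks temporal relations between invocation intervals.
class AllenDemo {
    public static void main(String[] args) {
        // Assumed trace: an accept() call that is still running while a
        // visit() call starts and finishes, as in a Visitor double dispatch.
        Interval accept = new Interval(10, 50);
        Interval visit = new Interval(20, 40);
        Interval later = new Interval(60, 70);

        System.out.println(visit.during(accept));   // true
        System.out.println(accept.before(later));   // true
        System.out.println(accept.overlaps(visit)); // false
    }
}

// A time interval with start < end, e.g. the span between entering and
// leaving a method in one recorded execution.
final class Interval {
    final long start;
    final long end;

    Interval(long start, long end) {
        if (start >= end) {
            throw new IllegalArgumentException("start must precede end");
        }
        this.start = start;
        this.end = end;
    }

    // Seven base relations of Allen's interval algebra.
    boolean before(Interval o)   { return end < o.start; }
    boolean meets(Interval o)    { return end == o.start; }
    boolean overlaps(Interval o) { return start < o.start && o.start < end && end < o.end; }
    boolean starts(Interval o)   { return start == o.start && end < o.end; }
    boolean during(Interval o)   { return start > o.start && end < o.end; }
    boolean finishes(Interval o) { return start > o.start && end == o.end; }
    boolean equalTo(Interval o)  { return start == o.start && end == o.end; }
}
```

Under these assumptions, a behavioral condition such as "the visit call happens during the accept call" becomes a simple check on two recorded intervals, which is the kind of constraint that structure alone cannot express.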

Architecture

We have built a tool named PRAssistor to analyze Java programs based on our approach, although the rationale of our approach is also suitable for other object-oriented languages. The system can be divided into three parts: structure parser, behavior parser, and pattern recognizer. Fig. 4 illustrates the architecture of PRAssistor.

The structure parser is developed with JDK 1.4.1. We make use of the reflection mechanism of Java to get class structures and internal relationships among classes.
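A minimal sketch of reflection-based structure extraction is given below. It is not the PRAssistor implementation: the StructureFacts class, the fact names (class, extends, implements, field, static_field), and the sample Component, Leaf, and Composite classes are all hypothetical, and the code uses generics, which are not available in JDK 1.4.1. It only shows that the standard java.lang.reflect API exposes inheritance, interface, and field information that could be written out as Prolog-style facts.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical extractor: turns one class into Prolog-style fact strings.
class StructureFacts {
    // Quote a name so that it reads as a Prolog atom rather than a variable.
    private static String q(String name) {
        return "'" + name + "'";
    }

    static List<String> factsFor(Class<?> c) {
        List<String> facts = new ArrayList<>();
        String name = q(c.getSimpleName());
        facts.add("class(" + name + ")");

        // getSuperclass() returns null for interfaces and for Object.
        Class<?> sup = c.getSuperclass();
        if (sup != null && sup != Object.class) {
            facts.add("extends(" + name + ", " + q(sup.getSimpleName()) + ")");
        }
        for (Class<?> i : c.getInterfaces()) {
            facts.add("implements(" + name + ", " + q(i.getSimpleName()) + ")");
        }
        for (Field f : c.getDeclaredFields()) {
            if (f.isSynthetic()) {
                continue; // skip compiler-generated fields
            }
            String kind = Modifier.isStatic(f.getModifiers()) ? "static_field" : "field";
            facts.add(kind + "(" + name + ", " + q(f.getName()) + ", "
                    + q(f.getType().getSimpleName()) + ")");
        }
        return facts;
    }

    public static void main(String[] args) {
        for (String fact : factsFor(Composite.class)) {
            System.out.println(fact + ".");
        }
    }
}

// Sample classes shaped like a Composite pattern, for illustration only.
interface Component {
    void operation();
}

class Leaf implements Component {
    public void operation() { }
}

class Composite implements Component {
    private final List<Component> children = new ArrayList<>();

    public void operation() {
        for (Component child : children) {
            child.operation();
        }
    }
}
```

Reflection reports the erased field type (List), so the element type of a container such as children would need additional analysis, for example of generic signatures or of run time behavior.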

Promising applications

At first glance, the approach and the tool seem useful only for recovering design information from systems for which complete documentation is unavailable. Actually, we think that their most promising applications are “Debug at Design Level” and “Pattern-Driven Refactoring”.

Related work

One of the early works on pattern recovery is Pat (Kramer and Prechelt, 1996). It is designed for a design recovery process that searches for structural design patterns in an object-oriented design model. The design constructions are also represented in Prolog. The patterns detected by this tool are limited, since the reverse engineering task is done by a CASE tool (Paradigm Plus) that is only able to recover structural design elements. However, we consider that recovering information about

Conclusion

This paper presents a practical approach to support pattern recovery. To enlarge the recoverable scope and improve the precision of pattern recovery, this approach is based not only on structural analysis but also on run time behavioral analysis. While most other approaches (Antoniol et al., 2001; Brown, 1996; Keller et al., 1999; Kramer and Prechelt, 1996; Niere et al., 2002; Seemann and von Gudenberg, 1998) are based only on the analysis of source code, our approach also analyses dynamic interaction

Acknowledgements

This work was funded by the National High Technology Research and Development Program of China (863 Program), contract nos. 2001AA415310 and 2002AA411420, and the National Natural Science Foundation of China (NSFC), contract no. 60073035.

Heyuan Huang was born in 1977 in Nanchang, Jiangxi, China. He is currently a third year Ph.D. student in the Department of Computer Science and Engineering of Shanghai Jiao Tong University. His research areas include design patterns, software reverse engineering, and software reengineering.

References (25)

  • Allen, J.F., 1984. Towards a general theory of action and time. Artificial Intelligence.
  • Antoniol, G., et al., 2001. Object-oriented design patterns recovery. Journal of Systems and Software.
  • Hrycej, T., 1993. A temporal extension of Prolog. Journal of Logic Programming.
  • Allen, J.F., 1983. Maintaining knowledge about temporal intervals. Communications of the ACM.
  • Beck, K., 1999. Extreme Programming Explained: Embrace Change.
  • Booch, G., et al., 1999. The Unified Modeling Language User Guide.
  • Brodsky, S., Clark, T., Cook, S., Evans, A., Kent, S., 2000. Feasibility Study in Rearchitecting UML as a Family of...
  • Brown, K.G., 1996. Design reverse-engineering and automated design pattern detection in smalltalk. Master's thesis,...
  • Cinnéide, M.Ó., 2001. Automated application of design patterns: a refactoring approach. Ph.D. thesis, Department of...
  • Eden, A.H., 2001. Formal specification of object-oriented design. In: International Conference on Multidisciplinary...
  • Eden, A.H., et al. Precise specification and automatic application of design patterns.
  • Evans, A., et al. The UML as a formal modeling notation.
Shensheng Zhang was born in 1951 in Shanghai, China. He received his Ph.D. degree from Stanford University. He is currently a Professor in the Department of Computer Science and Engineering of Shanghai Jiao Tong University. His research areas include software engineering and workflow.

Jian Cao was born in 1972 in Jiangsu, China. He received his Ph.D. degree from Nanjing University of Science and Technology. He is currently an Associate Professor in the Department of Computer Science and Engineering of Shanghai Jiao Tong University. His research areas include software engineering and workflow.

Yonghong Duan was born in 1980 in Neimenggu, China. He is currently a Master's student in the Department of Computer Science and Engineering of Shanghai Jiao Tong University. His research area is design patterns.
