Restructuring programs by tucking statements into functions

https://doi.org/10.1016/S0950-5849(98)00091-3

Abstract

Changing the internal structure of a program without changing its behavior is called restructuring. This paper presents a transformation called tuck for restructuring programs by decomposing large functions into small functions. Tuck consists of three steps: Wedge, Split, and Fold. A wedge, a subset of statements in a slice, contains computations that are related and that may form a meaningful function. The statements in a wedge are split from the rest of the code and folded into a new function. A call to the new function is placed in the now restructured function. That tuck does not alter the behavior of the original function follows from the semantics-preserving properties of a slice.

Introduction

Software restructuring is the transformation of software from one representation to another at the same relative abstraction level, without changing the external behavior of the subject system [10]. A software system may be restructured to make it easier to understand and change, and therefore less costly to maintain [2]. Restructuring may also be the enabling step for reengineering a system [34, 39], and for reverse engineering a system to extract its abstractions [6, 18, 38].

Restructuring in the early days of structured programming implied removing the goto statements [1, 3, 22]. This notion of restructuring is quite mature and has led to several automated tools [2]. Even though automatic removal of goto statements does not always produce programs that are desirable [9], such restructuring is a necessary step for creating higher, logic-based abstractions from code [6, 18, 38, 39].

This paper investigates the problem of restructuring programs by breaking up their large code fragments and tucking them into new functions. The technical challenge in creating new functions lies in capturing computations that are meaningfully related. If that were not necessary, one could simply create functions by breaking off contiguous pieces of code of some pre-set size, as done by Sneed and Jandrasics [34]. Such a straightforward approach may not yield good functions because of the interleaving of unrelated computations in real-world code [32].

We present a restructuring transformation tuck to decompose large, non-cohesive code fragments into small, cohesive functions [11]. To tuck, according to the American Heritage Dictionary, is to gather and fold. This is precisely what our transformation does. Tuck is a composition of three primitive transformations: Wedge, Split, and Fold. To tuck a code fragment, a programmer first gathers related code by driving a wedge in the function, then splits the code isolated by the wedge, and then folds [4] the split code into a function. Such restructuring may be performed in order to improve the architecture of a software system [36].

Figs. 1 and 2 illustrate by example the type of restructuring performed by tucking. The program in Fig. 1 is not cohesive [35] in that it performs several activities at the same time. It inputs the sale data for a given number of days. Depending on the value of the flag process, it also computes pay, the commission to be paid as a percentage of sale, and the resulting profit. Fig. 2 contains a program resulting from tucking all the statements modifying the variable total-pay into a function ComputeTotalPay. Besides the assignments to total-pay, the new function also contains the for and the if statements so that the computation of total-pay is preserved. A copy of the for statement has been retained in the restructured function to ensure that total-sale is correctly computed.
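Because the two figures are not reproduced in this excerpt, the following Python sketch gives a hypothetical reconstruction of the kind of program they describe. The function names, the parameters commission_rate and base_pay, and the pay formula are assumptions made for illustration, and the input of sale data is replaced by a list argument. Only the overall shape (the if statement moved into the new function, and the for loop present in both functions) follows the description above.

```python
def sale_report_old(sales, process, commission_rate=0.10, base_pay=100.0):
    """Assumed stand-in for the non-cohesive original: one loop, several jobs."""
    total_sale = 0.0
    total_pay = 0.0
    for sale in sales:
        total_sale += sale
        if process:
            # Hypothetical pay formula: fixed pay plus commission on the sale.
            total_pay += base_pay + commission_rate * sale
    profit = total_sale - total_pay if process else 0.0
    return total_sale, total_pay, profit


def compute_total_pay(sales, process, commission_rate, base_pay):
    """New function: every statement that modifies total_pay, plus the
    for and if statements needed to preserve its computation."""
    total_pay = 0.0
    for sale in sales:
        if process:
            total_pay += base_pay + commission_rate * sale
    return total_pay


def sale_report_new(sales, process, commission_rate=0.10, base_pay=100.0):
    """Restructured function: keeps a copy of the loop for total_sale
    and calls the new function for total_pay."""
    total_sale = 0.0
    for sale in sales:
        total_sale += sale
    total_pay = compute_total_pay(sales, process, commission_rate, base_pay)
    profit = total_sale - total_pay if process else 0.0
    return total_sale, total_pay, profit


if __name__ == "__main__":
    data = [100.0, 250.0, 40.0]
    # Both versions should agree for either value of the flag.
    print(sale_report_old(data, True) == sale_report_new(data, True))
    print(sale_report_old(data, False) == sale_report_new(data, False))
```

In this sketch both versions perform the same additions in the same order, so comparing their results on sample inputs gives a quick, though not exhaustive, check that behavior is unchanged.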

The tuck transformation improves upon the extract-function transformation contained in Griswold and Notkin's catalogue of restructuring transformations [16, 17]. Griswold and Notkin's extract-function creates a new function from contiguous code fragments. Their transformation is limited to structured programs. In contrast, the tuck transformation can also create functions from non-contiguous code, as illustrated by the example in Figs. 1 and 2. Our transformation is defined for unstructured programs as well.

This paper presents a summary of the tuck transformation and its application. The rest of this paper is organized as follows. Section 2 defines the problem and highlights the technical challenge in decomposing functions to create smaller functions. Section 3 presents some background definitions used later. Section 4 presents our transformation for folding a contiguous piece of code into a function. Section 5 contains our transformation for tucking non-contiguous code. Section 6 gives an example of using this transformation for restructuring a program. Section 7 presents a comparison of our work with other related work. Section 8 contains our concluding remarks and plans for future research.

Section snippets

Problem definition

We define the problem of tucking statements as follows:
Definition (Tuck). To tuck a set of statements S of a function f_old is to create two functions f_new and f_S such that (a) f_old is equivalent to f_new, (b) f_new calls f_S, and (c) f_S contains the statements in S (and maybe other statements).

For this discussion we only consider procedural programs that do not contain global variables and I/O statements. A program is made up of functions consisting of statements, both structured and unstructured.

Preliminaries

Our discussions and algorithms are restricted to programs in a procedural language without global variables. The language contains assignment statements, branch statements, goto statements, and function call statements. For simplicity of presentation we consider a function call to be a statement, i.e. it does not appear in any expression. A function has a fixed number of parameters, each passed either by value or by reference.

We consider a function to be represented as a control flow graph (CFG)
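As a minimal illustration of such a representation, the Python sketch below stores each statement as a node together with the variables it defines and uses, and stores control flow as successor lists. The class names, field names, node identifiers, and the extra "exit" node kind are all assumptions made for this example; the paper's own formal definitions are not part of this excerpt.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                      # statement text, for display only
    kind: str                       # "assign", "branch", "goto", "call", or "exit"
    defs: frozenset = frozenset()   # variables the statement assigns
    uses: frozenset = frozenset()   # variables the statement reads


@dataclass
class CFG:
    entry_id: str
    exit_id: str
    nodes: dict = field(default_factory=dict)   # node id -> Node
    succ: dict = field(default_factory=dict)    # node id -> successor ids

    def add(self, node_id, node, successors=()):
        self.nodes[node_id] = node
        self.succ[node_id] = list(successors)

    def reachable(self, start):
        """Return the ids of all nodes reachable from start (inclusive)."""
        seen, stack = set(), [start]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.succ.get(current, []))
        return seen


# A small unstructured loop written with a branch and a goto.
g = CFG(entry_id="n1", exit_id="n5")
g.add("n1", Node("i = 0", "assign", defs=frozenset({"i"})), ["n2"])
g.add("n2", Node("if i >= n goto n5", "branch", uses=frozenset({"i", "n"})), ["n5", "n3"])
g.add("n3", Node("i = i + 1", "assign", frozenset({"i"}), frozenset({"i"})), ["n4"])
g.add("n4", Node("goto n2", "goto"), ["n2"])
g.add("n5", Node("exit", "exit"))
```

Because edges are explicit, a goto is just another edge, which is why a graph of this kind can describe unstructured as well as structured code.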

Folding contiguous code

In this section we present a transformation to fold contiguous code segments into new functions. The fold transformation, also sometimes called lambda lifting, was first developed by Burstall and Darlington in the context of functional programming [4] and subsequently studied for logic programs [37]. Griswold and Notkin developed this transformation, calling it extract-function, for a structured, imperative language [16, 17]. We now extend the transformation to an unstructured, imperative language.
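One part of folding a contiguous segment is deciding which variables become parameters of the new function. The Python sketch below shows a common simplification for straight-line code only: variables read before being assigned inside the segment are treated as inputs, and variables assigned inside the segment that are still needed afterwards are treated as outputs. The function name fold_interface, the (defs, uses) encoding, and the example variable names are assumptions; this is not the paper's algorithm for unstructured code, which the excerpt does not include.

```python
def fold_interface(segment, live_after):
    """Infer parameters for a straight-line segment.

    segment:    list of (defs, uses) pairs in execution order
    live_after: variables still needed after the segment
    Returns (inputs, outputs) as sorted lists.
    """
    defined = set()
    inputs = set()
    for defs, uses in segment:
        # A use counts as an input only if nothing earlier in the
        # segment has already assigned the variable.
        inputs |= set(uses) - defined
        defined |= set(defs)
    outputs = defined & set(live_after)
    return sorted(inputs), sorted(outputs)


# Hypothetical segment:
#   pay = base_pay + rate * sale
#   total_pay = total_pay + pay
segment = [
    ({"pay"}, {"base_pay", "rate", "sale"}),
    ({"total_pay"}, {"total_pay", "pay"}),
]
inputs, outputs = fold_interface(segment, live_after={"total_pay", "total_sale"})
```

In a language with the two parameter modes described earlier, a variable that appears in outputs (here total_pay, which is also an input) would be a natural candidate for pass by reference, while the remaining inputs could be passed by value.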

Tucking non-contiguous code

We now present our transformation for tucking a set of statements. This transformation takes three inputs: G_old, a CFG representing a function; S, a set of statements of G_old; and G_S, a foldable subgraph of G_old containing S. If the statements S can be tucked without changing the external behavior of the function G_old, the transformation returns two CFGs, G_1 and G_2, where G_1 replaces G_old and G_2 is a new function containing the statements S.
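To make the role of the seed statements S more concrete, the Python sketch below collects everything a seed statement transitively depends on, which is the basic idea behind a backward slice. The dependence edges are written by hand for a hypothetical version of the sales example, and the statement ids s1 to s6 are assumptions. How the paper restricts a slice to a wedge inside G_S is not shown in this excerpt, so the sketch stops at the plain slice.

```python
def backward_slice(depends_on, seeds):
    """Return all statements the seeds transitively depend on (seeds included).

    depends_on maps a statement id to the ids it depends on, through
    either data dependence or control dependence.
    """
    result = set()
    worklist = list(seeds)
    while worklist:
        stmt = worklist.pop()
        if stmt in result:
            continue
        result.add(stmt)
        worklist.extend(depends_on.get(stmt, ()))
    return result


# Hand-written dependences for a hypothetical program:
#   s1: total_sale = 0
#   s2: total_pay = 0
#   s3: for sale in sales
#   s4:     total_sale += sale
#   s5:     if process
#   s6:         total_pay += pay
deps = {
    "s4": ["s1", "s3", "s4"],   # reads total_sale, runs inside the loop
    "s5": ["s3"],               # runs inside the loop
    "s6": ["s2", "s5", "s6"],   # reads total_pay, guarded by the if
}
pay_slice = backward_slice(deps, {"s6"})
```

Starting from the assignment to total_pay, the result contains the initialization, the for statement, and the if statement, but not the statements that only affect total_sale. This matches the earlier observation that the new function needs the for and if statements in addition to the assignments themselves.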

The tuck transformation is composed of three

Implementation and use

We now present a scenario of using the above transformations to restructure functions and discuss our experience with implementing these transformations.

The tuck transformation requires three parameters: the function to be restructured, a set of seed statements, and a foldable subgraph containing the seed statements. A tool implementing tuck must address how these parameters would be identified. This in turn would depend on whether the tool is batch or interactive. We first discuss our

Related works

One of the strongest pieces of evidence of the need for restructuring of the type proposed here comes from observations made by Rugaber et al. [31]. They have investigated the problem of detecting interleaved computation, where interleaving is defined as the merging of two or more distinct plans within some contiguous textual area of a program. A plan is a computational structure to achieve some purpose or goal. Rugaber et al. observe that if a subroutine (function) has multiple outputs there

Conclusions and future research

The need for reengineering and restructuring software is motivated by Lehman's second law of software evolution: as a large program is continuously changed, its complexity, which reflects deteriorating structure, increases unless work is done to maintain or reduce it ([28], p. 253). To reduce the deterioration of a program's structure a programmer typically has to undo some previous design decisions and modify the code such that it conforms to a new design that is more suitable for the changed

Acknowledgements

This work was partially supported by a contract from the Department of Defense and a grant from the Department of Army, US Army Research Office. The contents of the paper do not necessarily reflect the position or the policy of the funding agencies, and no official endorsement should be inferred.

References (40)

  • T. Kasai, Translatability of flowcharts into while programs, J. Comput. Syst. Sci. (1974)
  • E. Ashcroft, Z. Manna, The translation of 'goto' programs to 'while' programs, in: Proceedings of the 1971 IFIP...
  • R.S. Arnold, Software restructuring, Proc. IEEE (1989)
  • B. Baker, An algorithm for structuring flowgraphs, J. ACM (1977)
  • R.M. Burstall et al., A transformation system for developing recursive programs, J. ACM (1977)
  • T. Ball, S. Horwitz, Slicing programs with arbitrary control-flow, in: P. Fritzson (Ed.), Proceedings of the First...
  • P.T. Breuer et al., Creating specifications from code; reverse-engineering techniques, J. Software Maint.: Res. Pract. (1991)
  • J.M. Bieman et al., Measuring functional cohesion, IEEE Trans. Software Eng. (1994)
  • R.W. Bowdidge, Supporting the restructuring of data abstractions through manipulation of a program visualization, PhD...
  • F.W. Calliss, Problems with automatic restructurers, SIGPLAN Notices (1988)
  • E.J. Chikofsky et al., Reverse engineering and design recovery: a taxonomy, IEEE Software (1990)
  • J.-C. Deprez, A context-sensitive formal transformation for restructuring programs, Master's thesis, The Center for...
  • J. Ferrante et al., The program dependence graph and its use in optimization, ACM Trans. Programming Languages Syst. (1987)
  • K. Gallagher, Evaluating the surgeon's assistant: results of a pilot study, in: Proceedings of the Conference on...
  • K. Gallagher, Visual impact analysis, in: International Conference on Software Maintenance,...
  • K.B. Gallagher et al., Using program slicing in software maintenance, IEEE Trans. Software Eng. (1991)
  • W.G. Griswold et al., Automated assistance for program restructuring, ACM Trans. Software Eng. (1993)
  • W.G. Griswold, Program restructuring as an aid to software maintenance, PhD thesis, University of Washington, July...
  • P.A. Hausler et al., Using function abstraction to understand program behaviour, IEEE Software (1990)
  • S. Horwitz et al., Integrating non-interfering versions of programs, ACM Trans. Programming Languages Syst. (1989)