Restructuring programs by tucking statements into functions
Introduction
Software restructuring is the transformation of software from one representation to another at the same relative abstraction level, without changing the external behavior of the subject system [10]. A software system may be restructured to make it easier to understand and change, and therefore less costly to maintain [2]. Restructuring may also be the enabling step for reengineering a system [34, 39], and for reverse engineering a system to extract its abstractions [6, 18, 38].
Restructuring in the early days of structured programming implied removing the goto statements [1, 3, 22]. This notion of restructuring is quite mature and has led to several automated tools [2]. Even though automatic removal of goto statements does not always produce programs that are desirable [9], such restructuring is a necessary step for creating higher, logic-based abstractions from code [6, 18, 38, 39].
This paper investigates the problem of restructuring programs by breaking up their large code fragments and tucking them into new functions. The technical challenge in creating new functions lies in capturing computations that are meaningfully related. If that were not necessary, one could simply create functions by breaking off contiguous pieces of code of some pre-set size, as done by Sneed and Jandrasics [34]. Such a straightforward approach may not yield good functions because unrelated computations are interleaved in real-world code [32].
We present a restructuring transformation, tuck, to decompose large, non-cohesive code fragments into small, cohesive functions [11]. To tuck, according to the American Heritage Dictionary, is to gather and fold. This is precisely what our transformation does. Tuck is a composition of three primitive transformations: Wedge, Split, and Fold. To tuck a code fragment, a programmer first gathers related code by driving a wedge into the function, then splits the code isolated by the wedge, and then folds [4] the split code into a function. Such restructuring may be performed in order to improve the architecture of a software system [36].
Figs. 1 and 2 illustrate by example the type of restructuring performed by tucking. The program in Fig. 1 is not cohesive [35] in that it performs several activities at the same time. It inputs the sale data for a given number of days. Depending on the value of the flag process, it also computes pay, the commission to be paid as a percentage of sale, and the resulting profit. Fig. 2 contains a program resulting from tucking all the statements modifying the variable total-pay into a function ComputeTotalPay. Besides the assignments to total-pay, the new function also contains the for and the if statements so that the computation of total-pay is preserved. A copy of the for statement has been retained in the restructured function to ensure that total-sale is correctly computed.
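The following Python sketch is a hypothetical reconstruction in the spirit of the example described above; it is not the code of Fig. 1 or Fig. 2. The names report_original, report_tucked, and compute_total_pay, the fixed commission rate, and the use of a list in place of input statements are all assumptions made for illustration.

```python
def report_original(sales, process, rate=0.1):
    """Non-cohesive version: total_sale and total_pay are interleaved in one loop."""
    total_sale = 0.0
    total_pay = 0.0
    for sale in sales:
        total_sale += sale
        if process:
            # Commission as a fixed fraction of the sale (assumed rule).
            total_pay += rate * sale
    profit = total_sale - total_pay
    return total_sale, total_pay, profit


def compute_total_pay(sales, process, rate=0.1):
    """New function holding every statement that modifies total_pay.

    The loop and the conditional are carried along so that the
    computation of total_pay is preserved.
    """
    total_pay = 0.0
    for sale in sales:
        if process:
            total_pay += rate * sale
    return total_pay


def report_tucked(sales, process, rate=0.1):
    """Restructured version: keeps its own copy of the loop for total_sale."""
    total_sale = 0.0
    for sale in sales:
        total_sale += sale
    total_pay = compute_total_pay(sales, process, rate)
    profit = total_sale - total_pay
    return total_sale, total_pay, profit
```

Under these assumptions both versions return the same triple for any input, since each accumulator receives the same additions in the same order; the duplicated loop is the price paid for separating the two computations.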
The tuck transformation improves upon the extract-function transformation contained in Griswold and Notkin's catalogue of restructuring transformations [16, 17]. Griswold and Notkin's extract-function creates a new function from contiguous code fragments. Their transformation is limited to structured programs. In contrast, the tuck transformation can also create functions from non-contiguous code, as illustrated by the example in Figs. 1 and 2. Our transformation is defined for unstructured programs as well.
This paper presents a summary of the tuck transformation and its application. The rest of this paper is organized as follows. Section 2 defines the problem and highlights the technical challenge in decomposing functions to create smaller functions. Section 3 presents some background definitions used later. Section 4 presents our transformation for folding a contiguous piece of code into a function. Section 5 contains our transformation for tucking non-contiguous code. Section 6 gives an example of using this transformation for restructuring a program. Section 7 presents a comparison of our work with other related work. Section 8 contains our concluding remarks and plans for future research.
Problem definition
We define the problem of tucking statements as follows:
Definition (Tuck). To tuck a set of statements S of a function f_old is to create two functions f_new and f_S such that (a) f_old is equivalent to f_new, (b) f_new calls f_S, and (c) f_S contains the statements in S (and maybe other statements).
For this discussion we consider only procedural programs that contain neither global variables nor I/O statements. A program is made up of functions consisting of statements, both structured and unstructured.
Preliminaries
Our discussions and algorithms are restricted to programs in a procedural language without global variables. The language contains assignment statements, branch statements, goto statements, and function call statements. For simplicity of presentation we consider a function call to be a statement, i.e., it does not appear within any expression. A function has a fixed number of parameters, each passed either by value or by reference.
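As a rough illustration of this restricted language, the sketch below models its four statement kinds and two parameter-passing modes as Python data types. All type names, field names, and the sample function are hypothetical; the paper does not prescribe any concrete representation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class PassMode(Enum):
    BY_VALUE = "value"
    BY_REFERENCE = "reference"


class StmtKind(Enum):
    ASSIGN = "assign"
    BRANCH = "branch"
    GOTO = "goto"
    CALL = "call"  # a call is a statement and never part of an expression


@dataclass(frozen=True)
class Param:
    name: str
    mode: PassMode


@dataclass(frozen=True)
class Stmt:
    label: str      # unique label within the enclosing function
    kind: StmtKind
    text: str       # source text, kept opaque in this sketch


@dataclass
class Function:
    name: str
    params: Tuple[Param, ...]  # fixed number of parameters
    body: List[Stmt]


# Hypothetical sample: a tiny unstructured function using a goto.
example = Function(
    name="Accumulate",
    params=(Param("n", PassMode.BY_VALUE), Param("total", PassMode.BY_REFERENCE)),
    body=[
        Stmt("s1", StmtKind.ASSIGN, "i := 0"),
        Stmt("s2", StmtKind.BRANCH, "if i >= n goto s6"),
        Stmt("s3", StmtKind.ASSIGN, "total := total + i"),
        Stmt("s4", StmtKind.ASSIGN, "i := i + 1"),
        Stmt("s5", StmtKind.GOTO, "goto s2"),
        Stmt("s6", StmtKind.CALL, "Report(total)"),
    ],
)
```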
We consider a function to be represented as a control flow graph (CFG)
Folding contiguous code
In this section we present a transformation to fold contiguous code segments into new functions. The fold transformation, also sometimes called lambda lifting, was first developed by Burstall and Darlington in the context of functional programming [4] and subsequently studied for logic programs [37]. Griswold and Notkin developed this transformation, calling it extract-function, for a structured, imperative language [16, 17]. We now extend the transformation to an unstructured, imperative language.
Tucking non-contiguous code
We now present our transformation for tucking a set of statements. This transformation takes three inputs: G_old, a CFG representing a function; S, a set of statements of G_old; and G_S, a foldable subgraph of G_old containing S. If the statements S can be tucked without changing the external behavior of the function G_old, the transformation returns two CFGs, G_1 and G_2, where G_1 replaces G_old and G_2 is a new function containing the statements S.
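A minimal sketch of how the three inputs might be represented and sanity-checked is given below. The CFG class and check_tuck_inputs are hypothetical names introduced here for illustration. The check covers only the containment relations stated above (the seed statements lie in the subgraph, and the subgraph lies in the function's CFG); it does not test foldability and does not perform the transformation itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class CFG:
    """Nodes are statement labels; succ maps a node to its successor labels."""
    entry: str
    succ: Dict[str, Set[str]] = field(default_factory=dict)

    def nodes(self) -> Set[str]:
        found = set(self.succ)
        for targets in self.succ.values():
            found |= targets
        found.add(self.entry)
        return found


def check_tuck_inputs(g_old: CFG, seeds: Set[str], g_s_nodes: Set[str]) -> None:
    """Raise ValueError unless seeds is within g_s_nodes within nodes(g_old).

    Foldability of the subgraph is NOT checked; that condition is
    defined in the paper and is outside the scope of this sketch.
    """
    if not seeds <= g_s_nodes:
        missing = sorted(seeds - g_s_nodes)
        raise ValueError(f"seed statements outside the subgraph: {missing}")
    if not g_s_nodes <= g_old.nodes():
        extra = sorted(g_s_nodes - g_old.nodes())
        raise ValueError(f"subgraph nodes outside the function: {extra}")


# Hypothetical CFG: a loop (s2..s5) followed by an exit statement s6.
g = CFG(
    entry="s1",
    succ={
        "s1": {"s2"},
        "s2": {"s3", "s6"},
        "s3": {"s4"},
        "s4": {"s5"},
        "s5": {"s2"},
    },
)

# Seed statement s3, enclosed by the loop nodes as the candidate subgraph.
check_tuck_inputs(g, seeds={"s3"}, g_s_nodes={"s2", "s3", "s4", "s5"})
```

Representing the subgraph by its node set alone is a simplification: its edges are taken to be those of the enclosing CFG restricted to the chosen nodes.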
The tuck transformation is composed of three
Implementation and use
We now present a scenario of using the above transformations to restructure functions and discuss our experience with implementing these transformations.
The tuck transformation requires three parameters: the function to be restructured, a set of seed statements, and a foldable subgraph containing the seed statements. A tool implementing tuck must address how these parameters are identified. This in turn depends on whether the tool is batch or interactive. We first discuss our
Related works
One of the strongest pieces of evidence of the need for restructuring of the type proposed here comes from observations made by Rugaber et al. [31]. They have investigated the problem of detecting interleaved computation, where interleaving is defined as the merging of two or more distinct plans within some contiguous textual area of a program. A plan is a computational structure to achieve some purpose or goal. Rugaber et al. observe that if a subroutine (function) has multiple outputs there
Conclusions and future research
The need for reengineering and restructuring software is motivated by Lehman's second law of software evolution: as a large program is continuously changed, its complexity, which reflects deteriorating structure, increases unless work is done to maintain or reduce it ([28], p. 253). To reduce the deterioration of a program's structure a programmer typically has to undo some previous design decisions and modify the code such that it conforms to a new design that is more suitable for the changed
Acknowledgements
This work was partially supported by a contract from the Department of Defense and a grant from the Department of Army, US Army Research Office. The contents of the paper do not necessarily reflect the position or the policy of the funding agencies, and no official endorsement should be inferred.
References (40)
- Translatability of flowcharts into while programs, J. Comput. Syst. Sci. (1974)
- E. Ashcroft, Z. Manna, The translation of 'goto' programs to 'while' programs, in: Proceedings of the 1971 IFIP...
- Software restructuring, Proc. IEEE (1989)
- An algorithm for structuring flowgraphs, J. ACM (1977)
- et al., A transformation system for developing recursive programs, J. ACM (1977)
- T. Ball, S. Horwitz, Slicing programs with arbitrary control-flow, in: P. Fritzson (Ed.), Proceedings of the First...
- et al., Creating specifications from code; reverse-engineering techniques, J. Software Maint.: Res. Pract. (1991)
- et al., Measuring functional cohesion, IEEE Trans. Software Eng. (1994)
- R.W. Bowdidge, Supporting the restructuring of data abstractions through manipulation of a program visualization, PhD...
- Problems with automatic restructurers, SIGPLAN Notices (1988)