Tutorial: Techniques to Improve the Scalability and Precision of Data Flow Analysis

Soffa, Mary Lou

doi:10.1007/3-540-48294-6_23

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1694)

Included in the following conference series:

International Static Analysis Symposium

Abstract

Since the introduction of data flow analysis more than 20 years ago, the applications of data flow analysis have expanded considerably with the recognition of its practical benefits. The current use of data flow analysis goes well beyond its initial application of register allocation and machine-independent optimizations. Compilers today rely heavily on data flow analyses for sophisticated optimizations and to guide the exploitation of architectural features, such as the numbers of processors and their functional units and the memory hierarchy. Besides compilers, data flow analysis is also used in software engineering. Applications include program verification, debugging (especially of optimized and parallelized code), program test case generation and coverage analysis, regression testing, program integration and program understanding.

The expanded use of data flow analysis has created a demand for a number of extensions and improvements. Advances have focused in particular on improving its scalability and precision. Techniques that produce more informative results about the run-time behavior and environment have also been developed by integrating dynamic and architectural information into the analysis and its results. Data flow analysis has been extended to model different programming languages and features, including the object-oriented paradigm and parallel threads.

This tutorial will first present a broad overview of the recent advances in data flow analysis and then focus on techniques that improve the scalability and precision of the analysis.

Concern about the scalability of data flow analysis, both in terms of execution time and memory, is due to the need for whole program analysis, the use of multiple analyses, and the requirements of applications in a production environment. Techniques to improve the performance of analyses have focused on both the program representation used by the analysis and the analysis itself. A number of graph representations have been developed that permit direct connections between the generation of data flow information and the use of that information. Other representations have been proposed to enable more efficient interprocedural analysis by producing summary information about procedures. To improve the scalability of analysis, techniques have targeted reducing the number of program points that are modeled and reducing the number of quantities that are modeled simultaneously. Demand driven analysis and partitioning are the major approaches to improve the execution time performance and memory demands of analysis. Performance improvements have also been addressed for the recomputation of data flow by incremental updating of data flow information after changes are made to the program.
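
The idea behind demand driven analysis can be illustrated with a minimal sketch. It is not the tutorial's own algorithm; it only assumes a toy control flow graph (the hypothetical CFG dictionary below) in which each node defines at most one variable. Instead of solving reaching definitions for every variable at every program point, the query walks backward from the point of interest and stops along each path as soon as a definition of the queried variable is found, so only the relevant part of the program is modeled.

```python
from collections import deque

# Hypothetical toy CFG: node -> (variable defined at the node or None, successors)
CFG = {
    "entry": (None, ["a"]),
    "a": ("x", ["b"]),        # x = ...
    "b": ("y", ["c", "d"]),   # y = ...; branch
    "c": ("x", ["e"]),        # x = ...
    "d": ("z", ["e"]),        # z = ...
    "e": (None, ["f"]),       # use of x (the query point)
    "f": ("y", []),           # y = ...
}

def predecessors(cfg):
    """Invert the successor lists of the CFG."""
    preds = {n: [] for n in cfg}
    for n, (_, succs) in cfg.items():
        for s in succs:
            preds[s].append(n)
    return preds

def reaching_defs_on_demand(cfg, node, var):
    """Answer one query: which definitions of `var` reach the entry of `node`?

    Returns (defs, visited), where `visited` is the set of nodes the query
    had to examine.
    """
    preds = predecessors(cfg)
    defs, visited = set(), set()
    work = deque(preds[node])
    while work:
        n = work.popleft()
        if n in visited:
            continue
        visited.add(n)
        if cfg[n][0] == var:
            defs.add(n)            # this definition hides earlier ones on this path
        else:
            work.extend(preds[n])  # keep searching backward
    return defs, visited

defs, visited = reaching_defs_on_demand(CFG, "e", "x")
print(sorted(defs), sorted(visited))
```

In this sketch the query for x at node e never examines nodes entry or f, and it tracks a single variable rather than all of them, which mirrors the two reductions named above: fewer program points and fewer quantities modeled at once.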

Path-based approaches have been developed to improve the precision of data flow analysis. The precision of interprocedural analysis is improved by eliminating paths that are invalid based on the procedure call and return structure. Other techniques eliminate paths that are infeasible due to branch correlation. In both of these cases, the precision of the analysis is improved by eliminating spurious facts due to unrealizable paths. Another type of path-based technique focuses on improving the precision of the information produced for certain paths. One approach separates particular paths, namely frequently executed paths, to improve the precision of the analysis on the separated paths. Other techniques improve the precision of the analysis by providing a distributive formulation of non-distributive data flow problems. These techniques incorporate information about the quantities generated on separate paths into the analysis to enable a more detailed representation of the quantities and hence less conservative merging at confluence points.
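
The loss of precision at confluence points can be seen in constant propagation, a standard example of a non-distributive problem. The following is a minimal sketch under stated assumptions, not one of the specific techniques covered by the tutorial: the names NAC, meet, meet_env and transfer_add are hypothetical, and the program is assumed to be a two-way branch that sets a = 1, b = 2 on one arm and a = 2, b = 1 on the other, followed by c = a + b.

```python
NAC = "NAC"  # "not a constant": the conservative value after a conflicting merge

def meet(v1, v2):
    """Merge two abstract values at a confluence point."""
    return v1 if v1 == v2 else NAC

def meet_env(e1, e2):
    """Merge two environments that bind the same variables."""
    return {k: meet(e1[k], e2[k]) for k in e1}

def transfer_add(env, dst, a, b):
    """Transfer function for the statement `dst = a + b`."""
    out = dict(env)
    x, y = env[a], env[b]
    out[dst] = NAC if NAC in (x, y) else x + y
    return out

then_env = {"a": 1, "b": 2}   # facts at the end of the then-arm
else_env = {"a": 2, "b": 1}   # facts at the end of the else-arm

# Conventional analysis: merge at the join, then apply the transfer function.
merged_first = transfer_add(meet_env(then_env, else_env), "c", "a", "b")

# Path-separated analysis: apply the transfer function per path, then merge.
per_path = meet_env(
    transfer_add(then_env, "c", "a", "b"),
    transfer_add(else_env, "c", "a", "b"),
)

print(merged_first["c"], per_path["c"])
```

Merging first loses the fact that c is 3 on every path, because a and b individually are not constant at the join. Keeping the facts from the two paths separate until after the addition recovers it, which is the kind of less conservative merging described above. The cost is that more information is carried per program point.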

The tutorial will conclude by discussing the needs of various applications in terms of data flow information and future directions for data flow analysis.

Author information

Authors and Affiliations

Department of Computer Science, University of Pittsburgh, Pittsburgh, PA, 15260, USA
Mary Lou Soffa

Editor information

Editors and Affiliations

Dipartimento di Informatica, Università Ca’Foscari di Venezia, via Torino 155, I-30170, Mestre-Venezia, Italy
Agostino Cortesi
Dipartimento di Matematica Pura ed Applicata, Università di Padova, via Belzoni 7, I-35131, Padova, Italy
Gilberto Filé

Cite this paper

Soffa, M.L. (1999). Tutorial: Techniques to Improve the Scalability and Precision of Data Flow Analysis. In: Cortesi, A., Filé, G. (eds) Static Analysis. SAS 1999. Lecture Notes in Computer Science, vol 1694. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48294-6_23

DOI: https://doi.org/10.1007/3-540-48294-6_23
Published: 01 October 1999
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66459-8
Online ISBN: 978-3-540-48294-9
eBook Packages: Springer Book Archive
