Region inference for higher-order functional languages

Tofte, Mads

doi:10.1007/3-540-60360-3_29

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 983)

Included in the following conference series:

International Static Analysis Symposium

Abstract

Region Inference is a static analysis for inferring the lifetime of values created during evaluation of programs in strict, higher-order, functional languages such as Standard ML. At runtime, the store consists of a stack of regions; each region can in principle hold an unbounded number of values, so the region stack is not a normal runtime stack.
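The region stack described above can be pictured with a small runtime model. The following is a minimal sketch only, not the ML Kit's implementation; the names `Region`, `RegionStack`, `letregion` and `demo` are hypothetical, chosen for illustration. It shows the two properties the abstract states: regions are allocated and deallocated in stack order, and a single region may grow without bound.

```python
from contextlib import contextmanager


class Region:
    """Hypothetical region: a growable container with no fixed size bound."""

    def __init__(self, name):
        self.name = name
        self.values = []

    def put(self, value):
        # Store a value in this region and hand it back to the caller.
        self.values.append(value)
        return value


class RegionStack:
    """Hypothetical store: regions are pushed and popped in LIFO order."""

    def __init__(self):
        self.regions = []

    @contextmanager
    def letregion(self, name):
        # Allocate a fresh region on entry to the scope.
        region = Region(name)
        self.regions.append(region)
        try:
            yield region
        finally:
            # Deallocate the whole region at once on exit from the scope.
            popped = self.regions.pop()
            assert popped is region
            popped.values.clear()


def demo():
    stack = RegionStack()
    with stack.letregion("r1") as r1:
        r1.put((1, 2))
        with stack.letregion("r2") as r2:
            for i in range(1000):  # one region, many values
                r2.put(i)
            depth_inside = len(stack.regions)
            r2_size = len(r2.values)
        # r2 has been popped; r1 still holds its tuple.
        r1_size = len(r1.values)
        depth_after_inner = len(stack.regions)
    return depth_inside, r2_size, r1_size, depth_after_inner, len(stack.regions)
```

In this model the stack holds only two entries while the inner region holds a thousand values, which is the sense in which the region stack differs from an ordinary runtime stack of fixed-size frames.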

The purpose of region inference is to find out when regions can be allocated and deallocated and, for each expression which directly produces a value, which region the value should be put in. In addition, region inference automatically infers ways of passing regions as extra parameters to functions at runtime. Also, polymorphic recursion is used in the region inference system to distinguish between different invocations of the same recursive function. All values are put in regions, including tuples, closures, lists, references and exception values.
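Region passing and the role of polymorphic recursion can be illustrated by hand. The sketch below is an assumption-laden illustration, not output of the region inference algorithm: the `Region` class, `boxed_length` and `demo` are hypothetical names, and the region annotations that the analysis would infer are written explicitly. The function takes its destination region as an extra parameter, and each recursive call is given a different, shorter-lived region than the one the current invocation received.

```python
class Region:
    """Hypothetical region: append-only storage that is freed as a whole."""

    def __init__(self, name):
        self.name = name
        self.values = []
        self.live = True

    def put(self, value):
        assert self.live, "put into a deallocated region"
        self.values.append(value)
        return value

    def free(self):
        self.values.clear()
        self.live = False


def boxed_length(xs, dest):
    """Return the length of xs as a boxed value stored in region `dest`.

    `dest` is the extra region parameter. The recursive call receives a
    fresh local region instead of `dest`, so intermediate boxes never
    reach the caller's region.
    """
    if not xs:
        return dest.put([0])
    tmp = Region("tmp")               # region for this invocation's temporaries
    sub = boxed_length(xs[1:], tmp)   # different region than our own `dest`
    result = dest.put([sub[0] + 1])   # only the final box goes to `dest`
    tmp.free()                        # the intermediate box is reclaimed here
    return result


def demo():
    out = Region("out")
    box = boxed_length(["a", "b", "c"], out)
    return box[0], len(out.values)
```

If every recursive call were forced to use the same region as its caller, all intermediate boxes would accumulate in `out` until `out` itself was deallocated; distinguishing the invocations lets each one be reclaimed early.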

Region Inference is currently being used as the basis for a compiler, called the ML Kit with Regions, which has been under development at the University of Copenhagen for around two years. The ML Kit with Regions can compile Core ML programs to C and to HP-PA assembly language. The purpose of this work is to find out whether the region ideas can be pushed all the way through a compiler to native code. In particular, the system relies purely on region inference for memory management; there is no garbage collector.

To make this possible, it has been necessary to supplement region inference with a number of additional static analyses, which precede code generation. The first analysis after region inference itself is called multiplicity inference; it infers for each region how many times a value is put into that region, distinguishing between 0, 1, and infinitely many times. Thereafter follows a storage mode analysis for detecting when it is possible to store values at the bottom of a region, overwriting values that may already be in the region. (This analysis is necessary to achieve tail recursion optimisation.) Then follows a physical region size inference which infers a physical size for each region, where a physical size is either a number of bytes or infinity if the region size cannot be predicted. Then follows code generation to an intermediate language, referred to as the Kit Abstract Machine (KAM) language. Regions that will fit in one machine word become temporary variables in the KAM; regions of known, finite size are put on the runtime stack of the KAM; the remaining regions are represented by linked lists of fixed-size pages, which are taken from a free list of pages.
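The three-valued multiplicity domain and the final choice of region representation can be sketched as follows. This is an illustrative simplification under stated assumptions, not the ML Kit's analyses: the names `Mult`, `add_mult`, `physical_size` and `representation` are hypothetical, the machine word is assumed to be 4 bytes, and the physical size is derived naively as multiplicity times a single value size.

```python
import math
from enum import Enum


class Mult(Enum):
    """Multiplicity of a region: how many times a value is put into it."""
    ZERO = 0
    ONE = 1
    MANY = 2  # stands for "infinitely many" in the abstract's terms


def add_mult(a, b):
    """Combine the puts of two program parts, saturating at MANY."""
    return Mult(min(a.value + b.value, Mult.MANY.value))


WORD_BYTES = 4  # assumption: size of one machine word


def physical_size(mult, value_bytes):
    """Naive physical size: a byte count, or infinity if unpredictable."""
    if mult is Mult.MANY:
        return math.inf
    return mult.value * value_bytes


def representation(size):
    """Pick a representation from the physical size, following the abstract."""
    if size == math.inf:
        return "linked list of fixed-size pages"
    if size <= WORD_BYTES:
        return "KAM temporary variable"
    return "KAM runtime stack"
```

For example, a region that receives exactly one 12-byte value gets a finite size and lands on the runtime stack, whereas a region written to inside a loop has multiplicity `MANY` and falls back to pages taken from the free list.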

With this combination of analyses, it has become possible to execute very memory-intensive benchmarks from the Standard ML of New Jersey benchmark suite on the ML Kit with Regions in space and time that are in most cases very competitive with what is achieved using garbage collection. (As one might suspect, there are cases where garbage collection leads to lower memory requirements than region inference because region inference, being a static analysis, sometimes fails to discover that values which are in fact garbage can be reclaimed.)

The interesting thing about the analyses in the ML Kit with Regions is that most of them are fairly simple-minded—as will become apparent in the talk—but that they address problems which we have found to be significant in practice. Indeed, being a new implementation technology, region inference seems to be a rich source of interesting problems which are suitable for static analysis. For example, Aiken et al. have recently proposed an analysis which delays the allocation point of regions and promotes the de-allocation of regions, in some cases leading to significant improvements over basic region inference.

Author information

Authors and Affiliations

Department of Computer Science, University of Copenhagen (DIKU), Universitetsparken 1, DK-2100, Copenhagen, Denmark
Mads Tofte

Editor information

Alan Mycroft

About this paper

Cite this paper

Tofte, M. (1995). Region inference for higher-order functional languages. In: Mycroft, A. (eds) Static Analysis. SAS 1995. Lecture Notes in Computer Science, vol 983. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60360-3_29

DOI: https://doi.org/10.1007/3-540-60360-3_29
Published: 31 May 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60360-3
Online ISBN: 978-3-540-45050-4
eBook Packages: Springer Book Archive
