Proceedings of the third international conference on Architectural support for programming languages and operating systems

ASPLOS III: Proceedings of the third international conference on Architectural support for programming languages and operating systems

April 1989

1989 Proceeding

Chairman:
Joel Emer
General Chair:
John Hennessy, Stanford University

Publisher:

Association for Computing Machinery
New York
NY
United States

Conference:

ASPLOS89: Int'l Conference on Architecture Support for Programming Lang & Operating Systems, Boston, Massachusetts, USA, April 3 - 6, 1989

ISBN:

978-0-89791-300-3

Published:

01 April 1989

Sponsors:

SIGPLAN, SIGOPS, SIGARCH, IEEE-CS

Bibliometrics (reflects downloads up to 07 Mar 2025)

Citation Count: 2,180
Downloads (6 weeks): 497
Downloads (12 months): 4,379
Downloads (cumulative): 26,427

Article

Free

Architecture and compiler tradeoffs for a long instruction word microprocessor

Robert Cohn,
Thomas Gross,
Monica Lam

Pages 2–14, https://doi.org/10.1145/70082.68183

A very long instruction word (VLIW) processor exploits parallelism by controlling multiple operations in a single instruction word. This paper describes the architecture and compiler tradeoffs in the design of iWarp, a VLIW single-chip microprocessor ...

Metrics: Total Citations: 39; Total Downloads: 623; Last 12 Months: 86; Last 6 weeks: 13

Article

Free

Tradeoffs in instruction format design for horizontal architectures

Gurindar S. Sohi,
Sriram Vajapeyam

Pages 15–25, https://doi.org/10.1145/70082.68184

With recent improvements in software techniques and the enhanced level of fine grain parallelism made available by such techniques, there has been an increased interest in horizontal architectures and large instruction words that are capable of issuing ...

Metrics: Total Citations: 55; Total Downloads: 516; Last 12 Months: 62; Last 6 weeks: 9

Article

Free

Overlapped loop support in the Cydra 5

James C. Dehnert,
Peter Y.-T. Hsu,
Joseph P. Bratt

Pages 26–38, https://doi.org/10.1145/70082.68185

The Cydra™ 5 architecture adds unique support for overlapping successive iterations of a loop to a very long instruction word (VLIW) base. This architecture allows highly parallel loop execution for a much larger class of loops than can be vectorized, ...

Metrics: Total Citations: 161; Total Downloads: 733; Last 12 Months: 100; Last 6 weeks: 14

Article

Free

Architectural support for synchronous task communication

F. J. Burkowski,
G. V. Cormack,
G. D. P. Dueck

Pages 40–53, https://doi.org/10.1145/70082.68186

This paper describes the motivation for a set of intertask communication primitives, the hardware support of these primitives, the architecture used in the Sylvan project which studies these issues, and the experience gained from various experiments ...

Metrics: Total Citations: 3; Total Downloads: 397; Last 12 Months: 84; Last 6 weeks: 10

Article

Free

The fuzzy barrier: a mechanism for high speed synchronization of processors

Rajiv Gupta

Pages 54–63, https://doi.org/10.1145/70082.68187

Parallel programs are commonly written using barriers to synchronize parallel processes. Upon reaching a barrier, a processor must stall until all participating processors reach the barrier. A software implementation of the barrier mechanism using ...

Metrics: Total Citations: 167; Total Downloads: 1,238; Last 12 Months: 178; Last 6 weeks: 24

Article

Free

Efficient synchronization primitives for large-scale cache-coherent multiprocessors

James R. Goodman,
Mary K. Vernon,
Philip J. Woest

Pages 64–75, https://doi.org/10.1145/70082.68188

This paper proposes a set of efficient primitives for process synchronization in multiprocessors. The only assumptions made in developing the set of primitives are that hardware combining is not implemented in the inter-connect, and (in one case) that ...

Metrics: Total Citations: 237; Total Downloads: 1,685; Last 12 Months: 181; Last 6 weeks: 25

Article

Free

A software instruction counter

J. M. Mellor-Crummey,
T. J. LeBlanc

Pages 78–86, https://doi.org/10.1145/70082.68189

Although several recent papers have proposed architectural support for program debugging and profiling, most processors do not yet provide even basic facilities, such as an instruction counter. As a result, system developers have been forced to invent ...

Metrics: Total Citations: 95; Total Downloads: 749; Last 12 Months: 93; Last 6 weeks: 10

Article

Free

Efficient debugging primitives for multiprocessors

Z. Aral,
I. Gertner,
G. Schaffer

Pages 87–95, https://doi.org/10.1145/70082.68190

Existing kernel-level debugging primitives are inappropriate for instrumenting complex sequential or parallel programs. These functions incur a heavy overhead in their use of system calls and process switches. Context switches are used to alternately ...

Metrics: Total Citations: 12; Total Downloads: 502; Last 12 Months: 68; Last 6 weeks: 11

Article

Free

Sheaved memory: architectural support for state saving and restoration in paged systems

M. E. Staknis

Pages 96–102, https://doi.org/10.1145/70082.68191

The concept of read-one/write-many paged memory is introduced and given the name sheaved memory. It is shown that sheaved memory is useful for efficiently maintaining checkpoints in main memory and for providing state saving and state restoration for ...

Metrics: Total Citations: 12; Total Downloads: 375; Last 12 Months: 54; Last 6 weeks: 10

Article

Free

Reference history, page size, and migration daemons in local/remote architectures

M. A. Holliday

Pages 104–112, https://doi.org/10.1145/70082.68192

We address the problem of paged main memory management in the local/remote architecture subclass of shared memory multiprocessors. We consider the case where the operating system has primary responsibility and uses page migration as its main tool. We ...

Metrics: Total Citations: 51; Total Downloads: 405; Last 12 Months: 65; Last 6 weeks: 11

Article

Free

Translation lookaside buffer consistency: a software approach

D. L. Black,
R. F. Rashid,
D. B. Golub,
C. R. Hill

Pages 113–122, https://doi.org/10.1145/70082.68193

We discuss the translation lookaside buffer (TLB) consistency problem for multiprocessors, and introduce the Mach shootdown algorithm for maintaining TLB consistency in software. This algorithm has been implemented on several multiprocessors, and is in ...

Metrics: Total Citations: 73; Total Downloads: 1,593; Last 12 Months: 377; Last 6 weeks: 32

Article

Free

Failure correction techniques for large disk arrays

G. A. Gibson,
L. Hellerstein,
R. M. Karp,
D. A. Patterson

Pages 123–132, https://doi.org/10.1145/70082.68194

The ever increasing need for I/O bandwidth will be met with ever larger arrays of disks. These arrays require redundancy to protect against data loss. This paper examines alternative choices for encodings, or codes, that reliably store information in ...

Metrics: Total Citations: 57; Total Downloads: 917; Last 12 Months: 108; Last 6 weeks: 17

Article

Free

A unified vector/scalar floating-point architecture

N. P. Jouppi,
J. Bertoni,
D. W. Wall

Pages 134–143, https://doi.org/10.1145/70082.68195

In this paper we present a unified approach to vector and scalar computation, using a single register file for both scalar operands and vector elements. The goal of this architecture is to yield improved scalar performance while broadening the range of ...

Metrics: Total Citations: 25; Total Downloads: 877; Last 12 Months: 197; Last 6 weeks: 18

Article

Free

Data buffering: run-time versus compile-time support

H. Mulder

Pages 144–151, https://doi.org/10.1145/70082.68196

Data-dependency, branch, and memory-access penalties are main constraints on the performance of high-speed microprocessors. The memory-access penalties concern both penalties imposed by external memory (e.g. cache) or by under utilization of the local ...

Metrics: Total Citations: 7; Total Downloads: 772; Last 12 Months: 475; Last 6 weeks: 9

Article

Free

An analysis of 8086 instruction set usage in MS DOS programs

T. L. Adams,
R. E. Zimmerman

Pages 152–160, https://doi.org/10.1145/70082.68197

Metrics: Total Citations: 23; Total Downloads: 2,615; Last 12 Months: 337; Last 6 weeks: 32

Article

Free

A real-time support processor for Ada tasking

J. Roos

Pages 162–171, https://doi.org/10.1145/70082.68198

Task synchronization in Ada causes excessive run-time overhead due to the complex semantics of the rendezvous. To demonstrate that the speed can be increased by two orders of magnitude by using special purpose hardware, a single chip VLSI support ...

Metrics: Total Citations: 17; Total Downloads: 396; Last 12 Months: 91; Last 6 weeks: 14

Article

Free

The runtime environment for Screme, a Scheme implementation on the 88000

Steven R. Vegdahl,
Uwe F. Pleban

Pages 172–182, https://doi.org/10.1145/70082.68199

We are implementing a Scheme development system for the Motorola 88000. The core of the implementation is an optimizing native code compiler, together with a carefully designed runtime system. This paper describes our experiences with the 88000 as a ...

Metrics: Total Citations: 4; Total Downloads: 585; Last 12 Months: 103; Last 6 weeks: 12

Article

Free

Program optimization for instruction caches

S. McFarling

Pages 183–191, https://doi.org/10.1145/70082.68200

This paper presents an optimization algorithm for reducing instruction cache misses. The algorithm uses profile information to reposition programs in memory so that a direct-mapped cache behaves much like an optimal cache with full associativity and ...

Metrics: Total Citations: 239; Total Downloads: 1,797; Last 12 Months: 164; Last 6 weeks: 24

Article

Free

Using registers to optimize cross-domain call performance

Paul A. Karger

Pages 194–204, https://doi.org/10.1145/70082.68201

This paper describes a new technique to improve the performance of cross-domain calls and returns in a capability-based computer system. Using register optimization information obtained from the compiler, a trusted linker can minimize the number of ...

Metrics: Total Citations: 15; Total Downloads: 603; Last 12 Months: 136; Last 6 weeks: 16

Article

Free

The design of Nectar: a network backplane for heterogeneous multicomputers

Emmanuel Arnould,
H. T. Kung,
Francois Bitz,
Robert D. Sansom,
Eric C. Cooper

Pages 205–216, https://doi.org/10.1145/70082.68202

Nectar is a “network backplane” for use in heterogeneous multicomputers. The initial system consists of a star-shaped fiber-optic network with an aggregate bandwidth of 1.6 gigabits/second and a switching latency of 700 nanoseconds. The system can be ...

Metrics: Total Citations: 113; Total Downloads: 514; Last 12 Months: 101; Last 6 weeks: 25

Article

Free

A message driven OR-parallel machine

S. A. Delgado-Rannauro,
T. J. Reynolds

Pages 217–228, https://doi.org/10.1145/70082.68203

A message driven architecture for the execution of OR-parallel logic languages is proposed. The computational model is based on well known compilation techniques for Logic Languages. We present first the multiple binding mechanism for the OR-parallel ...

Metrics: Total Citations: 1; Total Downloads: 351; Last 12 Months: 56; Last 6 weeks: 11

Article

Free

Evaluating the performance of software cache coherence

S. Owicki,
A. Agarwal

Pages 230–242, https://doi.org/10.1145/70082.68204

In a shared-memory multiprocessor with private caches, cached copies of a data item must be kept consistent. This is called cache coherence. Both hardware and software coherence schemes have been proposed. Software techniques are attractive because they ...

Metrics: Total Citations: 38; Total Downloads: 766; Last 12 Months: 104; Last 6 weeks: 22

Article

Free

Analysis of cache invalidation patterns in multiprocessors

W. Weber,
A. Gupta

Pages 243–256, https://doi.org/10.1145/70082.68205

To make shared-memory multiprocessors scalable, researchers are now exploring cache coherence protocols that do not rely on broadcast, but instead send invalidation messages to individual caches that contain stale data. The feasibility of such directory-...

Metrics: Total Citations: 146; Total Downloads: 852; Last 12 Months: 131; Last 6 weeks: 22

Article

Free

The effect of sharing on the cache and bus performance of parallel programs

S. J. Eggers,
R. H. Katz

Pages 257–270, https://doi.org/10.1145/70082.68206

Bus bandwidth ultimately limits the performance, and therefore the scale, of bus-based, shared memory multiprocessors. Previous studies have extrapolated from uniprocessor measurements and simulations to estimate the performance of these machines. In ...

Metrics: Total Citations: 156; Total Downloads: 838; Last 12 Months: 108; Last 6 weeks: 6

Article

Free

Available instruction-level parallelism for superscalar and superpipelined machines

N. P. Jouppi,
D. W. Wall

Pages 272–282, https://doi.org/10.1145/70082.68207

Superscalar machines can issue several instructions per cycle. Superpipelined machines can issue only one instruction per cycle, but they have cycle times shorter than the latency of any functional unit. In this paper these two techniques are shown to ...

Metrics: Total Citations: 267; Total Downloads: 3,646; Last 12 Months: 460; Last 6 weeks: 60

Article

Free

Micro-optimization of floating-point operations

W. J. Dally

Pages 283–289, https://doi.org/10.1145/70082.68208

This paper describes micro-optimization, a technique for reducing the operation count and time required to perform floating-point calculations. Micro-optimization involves breaking floating-point operations into their constituent micro-operations and ...

Metrics: Total Citations: 19; Total Downloads: 1,023; Last 12 Months: 261; Last 6 weeks: 13

Article

Free

Limits on multiple instruction issue

M. D. Smith,
M. Johnson,
M. A. Horowitz

Pages 290–302, https://doi.org/10.1145/70082.68209

This paper investigates the limitations on designing a processor which can sustain an execution rate of greater than one instruction per cycle on highly-optimized, non-scientific applications. We have used trace-driven simulations to determine that ...

Metrics: Total Citations: 148; Total Downloads: 1,059; Last 12 Months: 199; Last 6 weeks: 27

Contributors

Joel S. Emer
Massachusetts Institute of Technology
- Publication Years: 1984 - 2024
- Publication counts: 96
- Citation count: 8,340
- Available for Download: 92
- Downloads (cumulative): 132,892
- Downloads (12 months): 24,646
- Downloads (6 weeks): 3,450
- Average Downloads per Article: 1,444
- Average Citation per Article: 87
John L. Hennessy
Stanford University
- Publication Years: 1977 - 2024
- Publication counts: 129
- Citation count: 9,815
- Available for Download: 102
- Downloads (cumulative): 116,186
- Downloads (12 months): 33,363
- Downloads (6 weeks): 6,676
- Average Downloads per Article: 1,139
- Average Citation per Article: 76

Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%

Year	Submitted	Accepted	Rate
ASPLOS '19	351	74	21%
ASPLOS '18	319	56	18%
ASPLOS '17	320	53	17%
ASPLOS '16	232	53	23%
ASPLOS '15	287	48	17%
ASPLOS '14	217	49	23%
ASPLOS XV	181	32	18%
ASPLOS XIII	127	31	24%
ASPLOS XII	158	38	24%
ASPLOS X	175	24	14%
ASPLOS IX	114	24	21%
ASPLOS VIII	123	28	23%
ASPLOS VII	109	25	23%
Overall	2,713	535	20%
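
The rates in the table can be rechecked from the submitted and accepted counts. The Python sketch below is only an illustration: the `ROWS` list is transcribed from the table above, the helper name `rate_percent` is hypothetical, and it assumes the listed rates are rounded to the nearest whole percent.

```python
# Rows transcribed from the acceptance-rate table:
# (label, submitted, accepted, listed rate in percent).
ROWS = [
    ("ASPLOS '19", 351, 74, 21),
    ("ASPLOS '18", 319, 56, 18),
    ("ASPLOS '17", 320, 53, 17),
    ("ASPLOS '16", 232, 53, 23),
    ("ASPLOS '15", 287, 48, 17),
    ("ASPLOS '14", 217, 49, 23),
    ("ASPLOS XV", 181, 32, 18),
    ("ASPLOS XIII", 127, 31, 24),
    ("ASPLOS XII", 158, 38, 24),
    ("ASPLOS X", 175, 24, 14),
    ("ASPLOS IX", 114, 24, 21),
    ("ASPLOS VIII", 123, 28, 23),
    ("ASPLOS VII", 109, 25, 23),
]


def rate_percent(submitted: int, accepted: int) -> int:
    """Acceptance rate as a whole percent (assumed rounding rule: nearest integer)."""
    return round(100 * accepted / submitted)


# Totals over all listed years, to compare with the "Overall" row.
total_submitted = sum(row[1] for row in ROWS)
total_accepted = sum(row[2] for row in ROWS)

# Each per-year rate should match the value listed in the table.
for label, submitted, accepted, listed in ROWS:
    assert rate_percent(submitted, accepted) == listed, label

print(total_submitted, total_accepted, rate_percent(total_submitted, total_accepted))
# prints "2713 535 20"
```

Under that rounding assumption, the per-year rates and the overall row (535 of 2,713, 20%) agree with the recomputed values.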
