It is my pleasure to introduce the program for the Eleventh International Conference on Architectural Support for Programming Languages and Operating Systems.

Authors submitted 245 abstracts, resulting in 169 full paper submissions this year, the largest number ever. The program committee selected 24 papers for inclusion in these proceedings, for an acceptance rate of 14%. The submissions and selected papers reach the ASPLOS goal of going beyond the sometimes artificial boundaries between architecture, operating systems, and programming languages to solve and understand problems in the performance, reliability, security, and power of computer systems.

In the first round of reviewing, I assigned each paper two committee member reviewers, and each of them requested one or more additional reviews. For the 112 papers with a significant number of positive reviews or conflicting reviews, I then assigned an additional committee member reviewer and, in a few cases, external reviews. Thus, each paper had at least 4 reviews, and all papers discussed at the committee meeting had 5 reviews, with 3 from committee members. There were 827 total reviews, for an average of 4.9 reviews per paper.

Authors had a 3-day rebuttal period in which they could address reviewer comments. Most reviews were available before the rebuttal period, but several reviews came in during or after this period. The committee members referred to the rebuttal responses during the committee meeting. In a few cases, the rebuttal responses influenced the final decisions, in both directions.

The committee meeting was held in Austin on May 10 from 1pm to 8pm and on May 11 from 9am to 2pm. All but two of the committee members attended in person, and these two participated via conference phone. We applied a strict conflict-of-interest policy: committee members left the room or hung up during the discussion of any paper on which they were a co-author, or whose authors included someone they had co-authored with in the past 5 years, someone with whom they had an advisee-advisor relationship, or someone from the same institution.

Program committee members were encouraged to submit, and could submit up to 2 papers. Eleven submissions came from committee members. The committee discussed these papers on the second day, after selecting over half of the program. The committee held the first accepted paper of a committee member to the same conference standard as other papers, and the second paper to a higher standard. The committee selected 5 committee-authored papers, 2 of them from one committee member.

Program committee members volunteered to shepherd four papers to ensure reviewer comments were well addressed. The committee was very pleased with the quality and the diversity of topics in the resulting program. We hope you find the papers interesting as well.
Programming with transactional coherence and consistency (TCC)
- Lance Hammond
- Brian D. Carlstrom
- Vicky Wong
- Ben Hertzberg
- Mike Chen
- Christos Kozyrakis
- Kunle Olukotun
Transactional Coherence and Consistency (TCC) offers a way to simplify parallel programming by executing all code within transactions. In TCC systems, transactions serve as the fundamental unit of parallel work, communication and coherence. As each ...
Spatial computation
This paper describes a computer architecture, Spatial Computation (SC), which is based on the translation of high-level language programs directly into hardware structures. SC program implementations are completely distributed, with no centralized ...
An ultra low-power processor for sensor networks
We present a novel processor architecture designed specifically for use in low-power wireless sensor-network nodes. Our sensor network asynchronous processor (SNAP/LE) is based on an asynchronous data-driven 16-bit RISC core with an extremely low-power ...
D-SPTF: decentralized request distribution in brick-based storage systems
Distributed Shortest-Positioning Time First (D-SPTF) is a request distribution protocol for decentralized systems of storage servers. D-SPTF exploits high-speed interconnects to dynamically select which server, among those with a replica, should service ...
FAB: building distributed enterprise disk arrays from commodity components
This paper describes the design, implementation, and evaluation of a Federated Array of Bricks (FAB), a distributed disk array that provides the reliability of traditional enterprise arrays with lower cost and better scalability. FAB is built from a ...
Deconstructing storage arrays
We introduce Shear, a user-level software tool that characterizes RAID storage arrays. Shear employs a set of controlled algorithms combined with statistical techniques to automatically determine the important properties of a RAID system, including the ...
HIDE: an infrastructure for efficiently protecting information leakage on the address bus
XOM-based secure processor has recently been introduced as a mechanism to provide copy and tamper resistant execution. XOM provides support for encryption/decryption and integrity checking. However, neither XOM nor any other current approach adequately ...
Secure program execution via dynamic information flow tracking
We present a simple architectural mechanism called dynamic information flow tracking that can significantly improve the security of computing systems with negligible performance overhead. Dynamic information flow tracking protects programs against ...
Coherence decoupling: making use of incoherence
This paper explores a new technique called coherence decoupling, which breaks a traditional cache coherence protocol into two protocols: a Speculative Cache Lookup (SCL) protocol and a safe, backing coherence protocol. The SCL protocol produces a ...
Continual flow pipelines
Increased integration in the form of multiple processor cores on a single die, relatively constant die sizes, shrinking power envelopes, and emerging applications create a new challenge for processor architects. How to build a processor that provides ...
Scalable selective re-execution for EDGE architectures
Pipeline flushes are becoming increasingly expensive in modern microprocessors with large instruction windows and deep pipelines. Selective re-execution is a technique that can reduce the penalty of mis-speculations by re-executing only instructions ...
HOIST: a system for automatically deriving static analyzers for embedded systems
Embedded software must meet conflicting requirements such as being highly reliable, running on resource-constrained platforms, and being developed rapidly. Static program analysis can help meet all of these goals. People developing analyzers for ...
Helper threads via virtual multithreading on an experimental Itanium® 2 processor-based platform
- Perry H. Wang
- Jamison D. Collins
- Hong Wang
- Dongkeun Kim
- Bill Greene
- Kai-Ming Chan
- Aamir B. Yunus
- Terry Sych
- Stephen F. Moore
- John P. Shen
Helper threading is a technology to accelerate a program by exploiting a processor's multithreading capability to run "assist" threads. Previous experiments on hyper-threaded processors have demonstrated significant speedups by using helper threads to ...
Low-overhead memory leak detection using adaptive statistical profiling
Sampling has been successfully used to identify performance optimization opportunities. We would like to apply similar techniques to check program correctness. Unfortunately, sampling provides poor coverage of infrequently executed code, where bugs ...
Locality phase prediction
As computer memory hierarchy becomes adaptive, its performance increasingly depends on forecasting the dynamic program locality. This paper presents a method that predicts the locality phases of a program by a combination of locality profiling and run-...
Dynamic tracking of page miss ratio curve for memory management
Memory can be efficiently utilized if the dynamic memory demands of applications can be determined and analyzed at run-time. The page miss ratio curve (MRC), i.e. page miss rate vs. memory size curve, is a good performance-directed metric to serve this ...
Compiler orchestrated prefetching via speculation and predication
This paper introduces a compiler orchestrated prefetching system as a unified framework geared toward ameliorating the gap between processing speeds and memory access latencies. We focus the scope of the optimization on specific subsets of the program ...
Software prefetching for mark-sweep garbage collection: hardware analysis and software redesign
Tracing garbage collectors traverse references from live program variables, transitively tracing out the closure of live objects. Memory accesses incurred during tracing are essentially random: a given object may contain references to any other object. ...
Devirtualizable virtual machines enabling general, single-node, online maintenance
Maintenance is the dominant source of downtime at high availability sites. Unfortunately, the dominant mechanism for reducing this downtime, cluster rolling upgrade, has two shortcomings that have prevented its broad acceptance. First, cluster-style ...
Fingerprinting: bounding soft-error detection latency and bandwidth
Recent studies have suggested that the soft-error rate in microprocessor logic will become a reliability concern by 2010. This paper proposes an efficient error detection technique, called fingerprinting, that detects differences in execution across a ...
Application-level checkpointing for shared memory programs
Trends in high-performance computing are making it necessary for long-running applications to tolerate hardware faults. The most commonly used approach is checkpoint and restart (CPR) - the state of the computation is saved periodically on disk, and ...
Formal online methods for voltage/frequency control in multiple clock domain microprocessors
Multiple Clock Domain (MCD) processors are a promising future alternative to today's fully synchronous designs. Dynamic Voltage and Frequency Scaling (DVFS) in an MCD processor has the extra flexibility to adjust the voltage and frequency in each domain ...
Heat-and-run: leveraging SMT and CMP to manage power density through the operating system
Power density in high-performance processors continues to increase with technology generations as scaling of current, clock speed, and device density outpaces the downscaling of supply voltage and thermal ability of packages to dissipate heat. Power ...
Performance directed energy management for main memory and disks
Much research has been conducted on energy management for memory and disks. Most studies use control algorithms that dynamically transition devices to low power modes after they are idle for a certain threshold period of time. The control algorithms ...
Acceptance Rates
Year | Submitted | Accepted | Rate |
---|---|---|---|
ASPLOS '19 | 351 | 74 | 21% |
ASPLOS '18 | 319 | 56 | 18% |
ASPLOS '17 | 320 | 53 | 17% |
ASPLOS '16 | 232 | 53 | 23% |
ASPLOS '15 | 287 | 48 | 17% |
ASPLOS '14 | 217 | 49 | 23% |
ASPLOS XV | 181 | 32 | 18% |
ASPLOS XIII | 127 | 31 | 24% |
ASPLOS XII | 158 | 38 | 24% |
ASPLOS X | 175 | 24 | 14% |
ASPLOS IX | 114 | 24 | 21% |
ASPLOS VIII | 123 | 28 | 23% |
ASPLOS VII | 109 | 25 | 23% |
Overall | 2,713 | 535 | 20% |
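
The Rate column and the Overall row can be recomputed from the Submitted and Accepted counts. The Python sketch below is illustrative only: it assumes each rate is the accepted count divided by the submitted count, rounded to the nearest whole percent, and the names `COUNTS`, `rate_percent`, and `overall` are hypothetical helpers introduced here, not part of any proceedings tooling.

```python
# Submitted/accepted counts copied from the table above.
COUNTS = {
    "ASPLOS '19": (351, 74),
    "ASPLOS '18": (319, 56),
    "ASPLOS '17": (320, 53),
    "ASPLOS '16": (232, 53),
    "ASPLOS '15": (287, 48),
    "ASPLOS '14": (217, 49),
    "ASPLOS XV": (181, 32),
    "ASPLOS XIII": (127, 31),
    "ASPLOS XII": (158, 38),
    "ASPLOS X": (175, 24),
    "ASPLOS IX": (114, 24),
    "ASPLOS VIII": (123, 28),
    "ASPLOS VII": (109, 25),
}


def rate_percent(submitted, accepted):
    """Acceptance rate as a whole-number percentage (assumed rounding rule)."""
    return round(100 * accepted / submitted)


def overall(counts):
    """Total submitted, total accepted, and the pooled acceptance rate."""
    total_submitted = sum(s for s, _ in counts.values())
    total_accepted = sum(a for _, a in counts.values())
    return total_submitted, total_accepted, rate_percent(total_submitted, total_accepted)


if __name__ == "__main__":
    for year, (submitted, accepted) in COUNTS.items():
        print(f"{year}: {rate_percent(submitted, accepted)}%")
    print("Overall:", overall(COUNTS))
```

The same helper applies to the figures quoted in the foreword: `rate_percent(169, 24)` gives the 14% acceptance rate, and `round(827 / 169, 1)` gives the 4.9 reviews per paper.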