# Lecture Notes in Computer Science 7179

Commenced Publication in 1973

Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

#### **Editorial Board**

| Name | Affiliation |
|---|---|
| David Hutchison | Lancaster University, UK |
| Takeo Kanade | Carnegie Mellon University, Pittsburgh, PA, USA |
| Josef Kittler | University of Surrey, Guildford, UK |
| Jon M. Kleinberg | Cornell University, Ithaca, NY, USA |
| Alfred Kobsa | University of California, Irvine, CA, USA |
| Friedemann Mattern | ETH Zurich, Switzerland |
| John C. Mitchell | Stanford University, CA, USA |
| Moni Naor | Weizmann Institute of Science, Rehovot, Israel |
| Oscar Nierstrasz | University of Bern, Switzerland |
| C. Pandu Rangan | Indian Institute of Technology, Madras, India |
| Bernhard Steffen | TU Dortmund University, Germany |
| Madhu Sudan | Microsoft Research, Cambridge, MA, USA |
| Demetri Terzopoulos | University of California, Los Angeles, CA, USA |
| Doug Tygar | University of California, Berkeley, CA, USA |
| Gerhard Weikum | Max Planck Institute for Informatics, Saarbruecken, Germany |

Andreas Herkersdorf, Kay Römer, Uwe Brinkschulte (Eds.)

# Architecture of Computing Systems – ARCS 2012

25th International Conference, Munich, Germany, February 28 – March 2, 2012, Proceedings

#### Volume Editors

Andreas Herkersdorf, Technische Universität München, Lehrstuhl für Integrierte Systeme, 80290 München, Germany. E-mail: herkersdorf@tum.de

Kay Römer, Universität zu Lübeck, Institut für Technische Informatik, 23562 Lübeck, Germany. E-mail: roemer@iti.uni-luebeck.de

Uwe Brinkschulte, Johann Wolfgang Goethe-Universität Frankfurt am Main, Eingebettete Systeme, 60325 Frankfurt am Main, Germany. E-mail: brinks@es.cs.uni-frankfurt.de

ISSN 0302-9743, e-ISSN 1611-3349

ISBN 978-3-642-28292-8, e-ISBN 978-3-642-28293-5

DOI 10.1007/978-3-642-28293-5

Springer Heidelberg Dordrecht London New York

Library of Congress Control Number: 2012930910

CR Subject Classification (1998): C.2, C.5.3, D.4, D.2.11, H.3.5, H.4, H.5.4

LNCS Sublibrary: SL 1 – Theoretical Computer Science and General Issues

© Springer-Verlag Berlin Heidelberg 2012

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

## **Preface**

This volume contains the proceedings of the 2012 International Conference on Architecture of Computing Systems (ARCS 2012), hosted by the Technical University of Munich at the Institute of Advanced Studies, February 28 – March 2, 2012.

The 25<sup>th</sup> anniversary of ARCS naturally stimulates reflection on how computer systems architecture has evolved over the past decades. Traditionally, desktop computers and embedded computing devices in industry and academia adopted high-performance computer architecture and technology with a lag of several years. "What is in a mainframe today will be in a PC tomorrow" was the colloquial saying, which, in a transformed sense, is still true. Today, a consumer electronics video game station has an impressive peak compute performance in the range of 2 Teraflops, which is comparable to that of a national research lab supercomputer of a decade ago. However, this was accomplished not only by adopting technology from the high end, but even more by developing a leading-edge embedded processor architecture specifically tailored for streaming media applications.

Today, graphics processors (GPUs) deploy the highest number of processing elements per chip; they also provide the highest compute performance per Euro and, even more importantly, the highest compute performance per Watt. With the transition to multi- and manycore platforms, desktop and embedded processor architectures have changed their role from followers of high-end concepts to innovation drivers that, in turn, influence high-end scientific computing. For example, the new 3 Petaflops SuperMUC computer of the Leibniz-Rechenzentrum in Garching is based on 14,000 8-core processors.

The focal topics of ARCS 2012 are centered on platforms for embedded computer systems. Embedded application domains, such as automotive, consumer infotainment, industry automation, and medical electronics, have domain-specific and stringent requirements with respect to energy efficiency, safety, security, dependability, and real-time constraints. These requirements can only partially be addressed by general-purpose processor architectures.

ARCS 2012 received a total of 65 submissions, out of which 20 high-quality papers were selected by an international Technical Program Committee (TPC) of more than 60 experts. Each submission was reviewed by at least three members of the TPC. The final selection was made during a full-day TPC meeting in Frankfurt.

Technical sessions of ARCS cover new hardware and software techniques for energy-efficient, failure-tolerant, and real-time-capable processing. Multi-/manycore architectures and programming models are discussed, as well as innovative 2D-/3D-Network-on-Chip (NoC) interconnects and memory hierarchies. Optimization methods and tools for design validation at different levels of abstraction complete the conference program. Six associated workshops present current work in progress in specific focal domains of computing systems, and two tutorials grant insight into the state of the art in organic computing and partial reconfiguration of FPGAs in real-world applications. Keynotes by David August, Princeton University, on "Restoring Computing's former Glory"; by Koen De Bosschere, Ghent University, on "Computing Systems Research Challenges Ahead: The HiPEAC Vision 2011/2012"; and by Sebastian Steibl, Intel Labs Braunschweig, top off the program.

We would like to express our sincere thanks to all supporters of the ARCS 2012 organization committee for their help and contributions to making ARCS 2012 a success.
In particular, we owe gratitude to all sponsors, the GI management team, the TPC members, the ARCS Fachausschuss, as well as the workshop and tutorial organizers. Special thanks go to all authors who submitted papers to ARCS 2012, whose new ideas, scientific rigor, and tremendous effort are what give ARCS its inspiring program. Last but not least, we would like to thank Gregor Walla from the Technical University of Munich for administering the ARCS 2012 Website.

December 2011

Andreas Herkersdorf

Uwe Brinkschulte and Kay Römer

Gero Mühl and Jan Richling

Walter Stechele and Thomas Wild

## Organization

#### General Chair

Andreas Herkersdorf, TU München, Germany

#### Program Co-chairs

Kay Römer, University of Lübeck, Germany

Uwe Brinkschulte, University of Frankfurt, Germany

#### Workshops and Tutorials

Gero Mühl, Universität Rostock

Jan Richling, TU Berlin

#### Program Committee

| Name | Affiliation |
|---|---|
| Michael Beigl | Karlsruhe Institute of Technology, Germany |
| Frank Bellosa | Karlsruhe Institute of Technology, Germany |
| Mladen Berekovic | Technische Universität Braunschweig, Germany |
| Koen Bertels | Technische Universiteit Delft, The Netherlands |
| Arndt Bode | Technische Universität München, Germany |
| Plamenka Borovska | Technical University of Sofia, Bulgaria |
| Jürgen Brehm | Gottfried Wilhelm Leibniz Universität Hannover, Germany |
| Philip Brisk | University of California Riverside, USA |
| Jiannong Cao | Hong Kong Polytechnic University, Hong Kong, China |
| João M. P. Cardoso | Universidade do Porto/FEUP, Portugal |
| | Universidade Federal do Rio Grande do Sul, Brazil |
| Koen De Bosschere | Universiteit Gent, Belgium |
| Oliver Diessel | University of New South Wales, Australia |
| Nikitas Dimopoulos | University of Victoria, Canada |
| Ahmed El-Mahdy | Alexandria University, Egypt |
| Paolo Faraboschi | HP Labs Barcelona, Spain |
| Fabrizio Ferrandi | Politecnico di Milano, Italy |
| Pierfrancesco Foglia | Università di Pisa, Italy |
| William Fornaciari | Politecnico di Milano, Italy |
| Björn Franke | University of Edinburgh, UK |
| Daniel Gracia-Pérez | CEA, France |
| Roberto Giorgi | Università di Siena, Italy |
| Jan Haase | Technische Universität Wien, Austria |
| Jörg Henkel | Karlsruhe Institute of Technology, Germany |
| Christian Hochberger | Technische Universität Dresden, Germany |
| Murali Jayapala | IMEC, Belgium |
| Gert Jervan | Tallinn University of Technology, Estonia |
| Ben Juurlink | Technische Universität Berlin, Germany |
| Wolfgang Karl | Karlsruhe Institute of Technology, Germany |
| Andreas Koch | Technische Universität Darmstadt, Germany |
| Krzysztof Kuchcinski | Lunds Universitet, Sweden |
| Olaf Landsiedel | Kungliga Tekniska Högskolan, Sweden |
| Paul Lukowicz | Universität Passau, Germany |
| Erik Maehle | Universität Lübeck, Germany |
| Tom Martin | Virginia Tech, USA |
| Dragomir Milojevic | Université Libre de Bruxelles, Belgium |
| Luca Mottola | Swedish Institute of Computer Science, Sweden |
| Christian Müller-Schloer | Gottfried Wilhelm Leibniz Universität Hannover, Germany |
| Dimitrios Nikolopoulos | Foundation for Research and Technology Hellas, Greece |
| Alex Orailoglu | University of California San Diego, USA |
| Pascal Sainrat | Université Paul Sabatier Toulouse III, France |
| Silvia Santini | Eidgenössische Technische Hochschule Zürich, Switzerland |
| Toshinori Sato | Fukuoka University, Japan |
| Yiannakis Sazeides | University of Cyprus, Cyprus |
| Martin Schulz | Lawrence Livermore National Laboratory, USA |
| Karsten Schwan | Georgia Tech, USA |
| Cristina Silvano | Politecnico di Milano, Italy |
| Leonel Sousa | Universidade Técnica de Lisboa, Portugal |
| Rainer G. Spallek | Technische Universität Dresden, Germany |
| Olaf Spinczyk | Technische Universität Dortmund, Germany |
| Benno Stabernack | Fraunhofer HHI, Germany |
| Jarmo Takala | Tampere University of Technology, Finland |
| Djamshid Tavangarian | Universität Rostock, Germany |
| Jürgen Teich | Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany |
| Pedro Trancoso | University of Cyprus, Cyprus |
| Theo Ungerer | Universität Augsburg, Germany |
| Stéphane Vialle | Supélec, France |
| Lucian Vintan | Lucian Blaga University of Sibiu, Romania |
| Klaus Waldschmidt | Johann-Wolfgang-Goethe-Universität Frankfurt, Germany |
| Stephan Wong | Technische Universiteit Delft, The Netherlands |

## **Invited Talks**

**David August**, Princeton University

"Restoring Computing's former Glory"

Multicore, as currently conceived, is the manifestation of computer architects' failure to continue the decades-old, universal performance trend despite the uninterrupted exponential growth of resources that is Moore's Law.
The culmination of current directions in commercial and academic research will only reduce the negative impact the multicore programming burden will have on companies, individuals, and society. Rather than punting the problem to programmers, computer architects could continue that once familiar language-independent performance growth trend, but failure is certain when we act on the belief that success is impossible. The purpose of this talk is to establish belief, by compelling demonstration, in a solution which sustains generations of scalable performance for existing parallel codes as much as for the most notoriously sequential legacy codes, preserves our most precious natural resource (programmer sanity), and reclaims computing's performance legacy.

**Koen De Bosschere**, HiPEAC Coordinator, Ghent University

"Computing Systems Research Challenges Ahead: The HiPEAC Vision 2011/2012"

Computing systems have had a tremendous impact on everyday life over the past decades in all domains. Historically, computing performance has been fuelled by "Moore's law", which drove the semiconductor industry for decades. However, a major paradigm shift is now taking place. "Moore's law", while keeping pace in terms of transistor density, will only enable a minor increase of the frequency and decrease of the power dissipation per transistor. As a result, even if it will still be feasible to pack more devices on a chip, it will not be possible to use them all simultaneously. New technology nodes are compounding this problem by increasing leakage power and device variability, and decreasing reliability. The need to provide improved energy efficiency and build reliable systems from unreliable and highly variable components leads to new research directions at all levels.
HiPEAC has identified seven specific research objectives:

#### Efficiency (with a focus on energy efficiency)

1) **Heterogeneous computing systems:** How can we design computer systems to maximize power efficiency and performance?
2) **Locality and communications management:** How do we intelligently minimize or control the movement of data to maximize power efficiency and performance?

#### System Complexity

3) **Cost-effective software for heterogeneous multi-cores:** How do we build tools and systems to enable developers to efficiently write software for future heterogeneous and parallel systems?
4) **Cross-component/cross-layer optimization for design integration:** How do we take advantage of the trend towards component-based design without losing the benefits of cross-component optimization?
5) **Next-generation processor cores:** How do we design processor cores for energy-efficiency, reliability, and predictability?

#### Dependability and applications (with a focus on their non-functional requirements)

6) **Architectures for the Data Deluge:** How can we tackle the growing gap between the growth of data and processing power?
7) **Reliable systems for Ubiquitous Computing:** How do we guarantee safety, predictability, availability, and privacy for ubiquitous systems?

Furthermore, it will be necessary to investigate research directions breaking with the line of classical Von Neumann systems.
Fuelled by new technologies such as dense non-volatile memories, optical interconnects, and 3D stacking, new computing paradigms will be necessary to perform both old and new tasks at high efficiency levels while decreasing the impact of the constraints of the new technology nodes.

## Table of Contents

| Title | Authors |
|---|---|
| **Robustness and Fault Tolerance** | |
| Classification-Based Improvement of Application Robustness and Quality of Service in Probabilistic Computer Systems | |
| A Case Study on Error Resilient Architectures for Wireless Communication | |
| Using Dynamic Task Level Redundancy for OpenMP Fault Tolerance | Oussama Tahan and Mohamed Shawky |
| **Power Aware Processing** | |
| A Very Fast and Quasi-accurate Power-State-Based System-Level Power Modeling Methodology | |
| Static Task Mapping for Tiled Chip Multiprocessors with Multiple Voltage Islands | |
| An Architecture for Power Management in Automotive Systems | Andreas Barthels, Joachim Fröschl, Hans-Ulrich Michel, and Uwe Baumgarten |
| **Parallel Processing** | |
| | Isaías A. Comprés Ureña, Michael Riepen, Michael Konow, and Michael Gerndt |
| A Low-Overhead Heuristic for Mixed Workload Resource Partitioning in Cluster-Based Architectures | Davide Zoni, Patrick Bellasi, and William Fornaciari |
| Deterministic Execution Model on COTS Hardware | |
| **Processor Cores** | |
| Design Principles for Synthesizable Processor Cores | |
| HPC Performance Domains on Multi-core Processors with Virtualization | |
| A Generic and Non-intrusive Profiling Methodology for SystemC Multi-core Platform Simulation Models | |
| **Optimization** | |
| Dynamic Task-Scheduling and Resource Management for GPU Accelerators in Medical Imaging | |
| An Approach for Performance Estimation of Hybrid Systems with FPGAs and GPUs as Coprocessors | |
| Work Stealing Strategies for Parallel Stream Processing in Soft Real-Time Systems | |
| Design Space Exploration of Hybrid Ultra Low Power Branch Predictors | |
| **Communication and Memory** | |
| New Memory Organizations for 3D DRAM and PCMs | |
| Vertical Link On/Off Control Methods for Wireless 3-D NoCs | |
| SADmote: A Robust and Cost-Effective Device for Environmental Monitoring | |
| Streamlined Network-on-Chip for Multicore Embedded Architectures | Gadi Oxman, Shlomo Weiss, and Yitzhak (Tsahi) Birk |
| **Author Index** | |