# Lecture Notes in Computer Science 4697

Commenced Publication in 1973

Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

#### **Editorial Board**

| Name | Affiliation |
|------|-------------|
| David Hutchison | Lancaster University, UK |
| Takeo Kanade | Carnegie Mellon University, Pittsburgh, PA, USA |
| Josef Kittler | University of Surrey, Guildford, UK |
| Jon M. Kleinberg | Cornell University, Ithaca, NY, USA |
| Friedemann Mattern | ETH Zurich, Switzerland |
| John C. Mitchell | Stanford University, CA, USA |
| Moni Naor | Weizmann Institute of Science, Rehovot, Israel |
| Oscar Nierstrasz | University of Bern, Switzerland |
| C. Pandu Rangan | Indian Institute of Technology, Madras, India |
| Bernhard Steffen | University of Dortmund, Germany |
| Madhu Sudan | Massachusetts Institute of Technology, MA, USA |
| Demetri Terzopoulos | University of California, Los Angeles, CA, USA |
| Doug Tygar | University of California, Berkeley, CA, USA |
| Moshe Y. Vardi | Rice University, Houston, TX, USA |
| Gerhard Weikum | Max-Planck Institute of Computer Science, Saarbruecken, Germany |

Lynn Choi, Yunheung Paek, Sangyeun Cho (Eds.)

# Advances in Computer Systems Architecture

12th Asia-Pacific Conference, ACSAC 2007

Seoul, Korea, August 23-25, 2007

Proceedings

#### Volume Editors

Lynn Choi, Korea University, School of Electrical Engineering, Anam-Dong, Sungbuk-Ku, Seoul, Korea. E-mail: lchoi@korea.ac.kr

Yunheung Paek, Seoul National University, School of Electrical Engineering, Seoul, Korea. E-mail: ypaek@snu.ac.kr

Sangyeun Cho, University of Pittsburgh, Department of Computer Science, Pittsburgh, PA 15260, USA. E-mail: cho@cs.pitt.edu

Library of Congress Control Number: 2007932678

CR Subject Classification (1998): B.2, B.4, B.5, C.2, C.1, D.4

LNCS Sublibrary: SL 1 – Theoretical Computer Science and General Issues

ISSN 0302-9743

ISBN-10 3-540-74308-1 Springer Berlin Heidelberg New York

ISBN-13 978-3-540-74308-8 Springer Berlin Heidelberg New York

This work is subject to copyright.
All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media

springer.com

© Springer-Verlag Berlin Heidelberg 2007

Printed in Germany

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India

Printed on acid-free paper

SPIN: 12109494 06/3180 5 4 3 2 1 0

### **Preface**

On behalf of the program and organizing committee members of this conference, we are pleased to present you with the proceedings of the 12th Asia-Pacific Computer Systems Architecture Conference (ACSAC 2007), which was hosted in Seoul, Korea on August 23-25, 2007. This conference has traditionally been a forum for leading researchers in the Asian, American and Oceanian regions to share recent progress and the latest results in both architectural and system issues. In the past few years the conference has become more international in the sense that the geographic origin of participants has become broader to include researchers from all around the world, including Europe and the Middle East.

This year, we received 92 paper submissions. Each submission was reviewed by at least three primary reviewers along with up to three secondary reviewers. The total number of completed reviews reached 333, giving each submission 3.6 reviews on average. All the reviews were carefully examined during the paper selection process, and finally 26 papers were accepted, resulting in an acceptance rate of about 28%.
The selected papers encompass a wide range of topics, with much emphasis on hardware and software techniques for state-of-the-art multicore and multithreaded architectures. In addition to the regular papers, the technical program of the conference included eight invited papers from world-renowned researchers and featured two keynotes by Pen-Chung Yew (University of Minnesota) and Kunio Uchiyama (Hitachi), addressing a compiler framework for speculative multithreading and power-efficient heterogeneous multicore chip development, respectively. We sincerely hope that the proceedings will serve as a valuable reference for researchers and developers alike.

Putting together ACSAC 2007 was a team effort. First of all, we would like to express our special gratitude to the authors and speakers for providing the contents of the program. We would also like to thank the program committee members and external reviewers for diligently reviewing the papers and providing suggestions for their improvement. We believe that you will find the outcome of their efforts in this book. In addition, we extend our thanks to the organizing committee members and student volunteers, who contributed enormously to various aspects of conference administration. Finally, we would like to express special thanks to Chris Jesshope and Jingling Xue for sharing their experience and offering fruitful feedback in the early stages of preparing the conference.
June 2007

Lynn Choi, Yunheung Paek, Sangyeun Cho

## **Conference Organization**

#### **General Co-chairs**

Lynn Choi, Korea University, Korea

Sung Bae Park, Samsung Electronics, Korea

#### **Program Co-chairs**

Yunheung Paek, Seoul National University, Korea

John Morris, University of Auckland, New Zealand

Sangyeun Cho, University of Pittsburgh, USA

#### **Publicity Chair**

Ki-Seok Chung, Hanyang University, Korea

#### **Publication Chair**

Hwangnam Kim, Korea University, Korea

#### **Local Arrangement Chair**

Sung Woo Chung, Korea University, Korea

#### **Finance Chair**

Yunmook Nah, Dankook University, Korea

#### **Registration Chair**

Youngho Choi, Konkuk University, Korea

#### **Steering Committee**

| Name | Affiliation |
|------|-------------|
| Jesse Z. Fang | Intel, USA |
| James R. Goodman | University of Auckland, New Zealand |
| Gernot Heiser | National ICT, Australia |
| Kei Hiraki | Tokyo University, Japan |
| Chris Jesshope | University of Amsterdam, Netherlands |
| Feipei Lai | National Taiwan University, Taiwan |
| John Morris | University of Auckland, New Zealand |
| Amos Omondi | Yonsei University, Korea |
| Ronald Pose | Monash University, Australia |
| Stanislav Sedukhin | University of Aizu, Japan |
| Mateo Valero | Universitat Politecnica de Catalunya, Spain |
| Jingling Xue | University of New South Wales, Australia |
| Pen-Chung Yew | University of Minnesota, USA |

#### **Program Committee**

| Name | Affiliation |
|------|-------------|
| Jin Young Choi | Korea University, Korea |
| Bruce Christianson | University of Hertfordshire, UK |
| Sung Woo Chung | Korea University, Korea |
| Oliver Diessel | University of New South Wales, Australia |
| Colin Egan | University of Hertfordshire, UK |
| Skevos Evripidou | University of Cyprus, Cyprus |
| Wong Weng Fai | National University of Singapore, Singapore |
| Michael Freeman | University of York, UK |
| Guang G. Gao | University of Delaware, USA |
| Jean-Luc Gaudiot | University of California at Irvine, USA |
| Alex Gontmakher | Technion, Israel |
| Gernot Heiser | National ICT, Australia |
| Wei-Chung Hsu | University of Minnesota, USA |
| Suntae Hwang | Kookmin University, Korea |
| Chris Jesshope | University of Amsterdam, Netherlands |
| Jeremy Jones | Trinity College, Ireland |
| Norman P. Jouppi | Hewlett Packard, USA |
| Cheol Hong Kim | Chonnam University, Korea |
| Doohyun Kim | Konkuk University, Korea |
| Feipei Lai | National Taiwan University, Taiwan |
| Hock Beng Lim | Nanyang Technological University, Singapore |
| Philip Machanick | University of Queensland, Australia |
| Worawan Marurngsith | Thammasat University, Thailand |
| Henk Muller | University of Bristol, UK |
| Sukumar Nandi | Indian Institute of Technology Guwahati, India |
| Tin-Fook Ngai | Intel China Research Center, China |
| Amos Omondi | Yonsei University, Korea |
| L M Patnaik | Indian Institute of Science Bangalore, India |
| Andy Pimentel | University of Amsterdam, Netherlands |
| Ronald Pose | Monash University, Australia |
| Stanislav G. Sedukhin | University of Aizu, Japan |
| Won Shim | Seoul National University of Technology, Korea |
| Mark Smotherman | Clemson University, USA |
| K. Sridharan | Indian Institute of Technology Madras, India |
| Rajeev Thakur | Argonne National Laboratory, USA |
| Mateo Valero | Universitat Politecnica de Catalunya, Spain |
| Lucian N. Vintan | University of Sibiu, Romania |
| Chengyong Wu | ICT, Chinese Academy of Sciences, China |
| Zhi-Wei Xu | ICT, Chinese Academy of Sciences, China |
|  | ICT, Chinese Academy of Sciences, China |
|  | University of New South Wales, Australia |
| Pen-Chung Yew | University of Minnesota, USA |

#### **External Reviewers**

Nidhi Aggarwal, Nadeem Ahmed, Christopher Ang, Elizabeth M. Belding-Royer, Darius Buntinas, Francisco Cazorla, José M. Cela, Yang Chen, Doosan Cho, Peter Chubb, Ian Clough, Toni Cortés, Kyriacou Costas, Adrián Cristal, Abhinay Das, Amitabha Das, Michel Dubois, Bin Fan, Jinyun Fang, Yu-Chiann Foo, John Glossner, Sandeep K. Gupta, Rubén González, Rogeli Grima, Jizhong Han, Paul Havinga, Michael Hicks, Houman Homayoun, Kai Hwang, Lei Jin, Jonghee Kang, Kamil Kedzierski, Daeho Kim, Jinpyo Kim, John Kim, Chung-Ta King, Tei-Wei Kuo, Ihor Kuz, Koen Langendoen, Robert Latham, Sanghwan Lee, Heung-No Lee, Hyunjin Lee, Graham Leedham, Binghao Li, Huiyun Li, Kuan-Ching Li, Wei Li, Adam Postula, Chen Liu, Shaoshan Liu, Jie Ma, Luke Macpherson, Pramod K. Meher, Neill Miller, Miquel Moreto, Naveen Muralimanohar, Sudha Natarajan, Venkatesan Packirisamy, Chanik Park, Jagdish Patra, Vladimir Pervouchine, Vinod Prasad, Ken Robinson, Esther Salamí, Oliverio J. Santana, Michael Schelansker, Bill Scherer, Bertil Schmidt, Ahmed Sherif, Todor P. Stefanov, Mark Thompson, Jordi Torres, Nian-Feng Tzeng, Lei Wang, Yulu Yang, Jia Yu, Patryk Zadarnowski, Ahmed Zekri, Ge Zhang, Jony Zhang, Longbing Zhang, Youtao Zhang

#### **Student Volunteers**

Yong-Soo Bae, Jae Kyun Jung, Daeho Kim, Hyun-Joon Lee, Kiyeon Lee, Sang-Hoon Lee, Keunhee Yeo, Jonghee Youn

# Table of Contents

| Title | Authors |
|-------|---------|
| A Compiler Framework for Supporting Speculative Multicore Processors (Keynote) |  |
| Power-Efficient Heterogeneous Multicore Technology for Digital Convergence (Keynote) |  |
| StarDBT: An Efficient Multi-platform Dynamic Binary Translation System |  |
| Unbiased Branches: An Open Problem |  |
| An Online Profile Guided Optimization Approach for Speculative Parallel Threading |  |
| Entropy-Based Profile Characterization and Classification for Automatic Profile Management |  |
| Laplace Transformation on the FT64 Stream Processor |  |
| Towards Data Tiling for Whole Programs in Scratchpad Memory Allocation |  |
| Evolution of NAND Flash Memory Interface |  |
| FCC-SDP: A Fast Close-Coupled Shared Data Pool for Multi-core DSPs |  |
| Exploiting Single-Usage for Effective Memory Management | Thomas Piquet, Olivier Rochecouste, and André Seznec |
| An Alternative Organization of Defect Map for Defect-Resilient Embedded On-Chip Memories | Kang Yi, Shih-Yang Cheng, Young-Hwan Park, Fadi Kurdahi, and Ahmed Eltawil |
| An Effective Design of Master-Slave Operating System Architecture for Multiprocessor Embedded Systems |  |
| Optimal Placement of Frequently Accessed IPs in Mesh NoCs | Reza Moraveji, Hamid Sarbazi-Azad, and Maghsoud Abbaspour |
| An Efficient Link Controller for Test Access to IP Core-Based Embedded System Chips |  |
| Performance of Keyword Connection Algorithm in Nested Mobility Networks |  |
| Leakage Energy Reduction in Cache Memory by Software Self-invalidation |  |
| Exploiting Task Temperature Profiling in Temperature-Aware Task Scheduling for Computational Clusters |  |
| Runtime Performance Projection Model for Dynamic Power Management | Sang-Jeong Lee, Hae-Kag Lee, and Pen-Chung Yew |
| A Power-Aware Alternative for the Perceptron Branch Predictor | Kaveh Aasaraai and Amirali Baniasadi |
| Power Consumption and Performance Analysis of 3D NoCs |  |
| A Design Methodology for Performance-Resource Optimization of a Generalized 2D Convolution Architecture with Quadrant Symmetric Kernels |  |
| Bipartition Architecture for Low Power JPEG Huffman Decoder | Shanq-Jang Ruan and Wei-Te Lin |
| A SWP Specification for Sequential Image Processing Algorithms |  |
| A Stream System-on-Chip Architecture for High Speed Target Recognition Based on Biologic Vision |  |
| FPGA-Accelerated Active Shape Model for Real-Time People Tracking |  |
| Performance Evaluation of Evolutionary Multi-core and Aggressively Multi-threaded Processor Architectures |  |
| Synchronization Mechanisms on Modern Multi-core Architectures | Shaoshan Liu and Jean-Luc Gaudiot |
| Concerning with On-Chip Network Features to Improve Cache Coherence Protocols for CMPs | Hongbo Zeng, Kun Huang, Ming Wu, and Weiwu Hu |
| Generalized Wormhole Switching: A New Fault-Tolerant Mathematical Model for Adaptively Wormhole-Routed Interconnect Networks | F. Safaei, A. Khonsari, M. Fathy, N. Talebanfard, and M. Ould-Khaoua |
| Open Issues in MPI Implementation |  |
| Implicit Transactional Memory in Kilo-Instruction Multiprocessors | Marco Galluzzi, Enrique Vallejo, Adrián Cristal, Fernando Vallejo, Ramón Beivide, Per Stenström, James E. Smith, and Mateo Valero |
| Design of a Low-Power Embedded Processor Architecture Using Asynchronous Function Units |  |
| A Bypass Mechanism to Enhance Branch Predictor for SMT Processors |  |
| Thread Priority-Aware Random Replacement in TLBs for a High-Performance Real-Time SMT Processor |  |
| Architectural Solution to Object-Oriented Programming |  |
| Author Index |  |