# Lecture Notes in Computer Science 3740

Commenced Publication in 1973

Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

#### **Editorial Board**

| Member | Affiliation |
|---|---|
| David Hutchison | Lancaster University, UK |
| Takeo Kanade | Carnegie Mellon University, Pittsburgh, PA, USA |
| Josef Kittler | University of Surrey, Guildford, UK |
| Jon M. Kleinberg | Cornell University, Ithaca, NY, USA |
| Friedemann Mattern | ETH Zurich, Switzerland |
| John C. Mitchell | Stanford University, CA, USA |
| Moni Naor | Weizmann Institute of Science, Rehovot, Israel |
| Oscar Nierstrasz | University of Bern, Switzerland |
| C. Pandu Rangan | Indian Institute of Technology, Madras, India |
| Bernhard Steffen | University of Dortmund, Germany |
| Madhu Sudan | Massachusetts Institute of Technology, MA, USA |
| Demetri Terzopoulos | New York University, NY, USA |
| Doug Tygar | University of California, Berkeley, CA, USA |
| Moshe Y. Vardi | Rice University, Houston, TX, USA |
| Gerhard Weikum | Max-Planck Institute of Computer Science, Saarbruecken, Germany |

Thambipillai Srikanthan, Jingling Xue, Chip-Hong Chang (Eds.)

# Advances in Computer Systems Architecture

10th Asia-Pacific Conference, ACSAC 2005

Singapore, October 24-26, 2005

Proceedings

#### Volume Editors

Thambipillai Srikanthan, Nanyang Technological University, School of Computer Engineering, Blk N4, Nanyang Avenue, Singapore, 639798. E-mail: astsrikan@ntu.edu.sg

Jingling Xue, University of New South Wales, School of Computer Science and Engineering, Sydney, NSW 2052, Australia. E-mail: jxue@cse.unsw.edu.au

Chip-Hong Chang, Nanyang Technological University, School of Electrical and Electronic Engineering, Blk S2, Nanyang Avenue, Singapore 639798. E-mail: echchang@ntu.edu.sg

Library of Congress Control Number: 2005934301

CR Subject Classification (1998): B.2, B.4, B.5, C.2, C.1, D.4

ISSN 0302-9743

ISBN-10 3-540-29643-3 Springer Berlin Heidelberg New York

ISBN-13 978-3-540-29643-0 Springer Berlin Heidelberg New York

This work is subject to copyright.
All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media

springeronline.com

© Springer-Verlag Berlin Heidelberg 2005

Printed in Germany

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India

Printed on acid-free paper

SPIN: 11572961 06/3142 5 4 3 2 1 0

## **Preface**

On behalf of the Program Committee, we are pleased to present the proceedings of the 2005 Asia-Pacific Computer Systems Architecture Conference (ACSAC 2005) held in the beautiful and dynamic country of Singapore. This conference was the tenth in its series, one of the leading forums for sharing the emerging research findings in this field.

In consultation with the ACSAC Steering Committee, we selected a 33-member Program Committee. This Program Committee represented a broad spectrum of research expertise to ensure a good balance of research areas, institutions and experience while maintaining the high quality of this conference series. This year's committee was of the same size as last year but had 19 new faces.

We received a total of 173 submissions, which is 14% more than last year. Each paper was assigned to at least three and in some cases four Program Committee members for review. Wherever necessary, the committee members called upon the expertise of their colleagues to ensure the highest possible quality in the reviewing process.
As a result, we received 415 reviews from the Program Committee members and their 105 co-reviewers whose names are acknowledged in the proceedings. The conference committee adopted a systematic blind review process to provide a fair assessment of all submissions. In the end, we accepted 65 papers on a broad range of topics, giving an acceptance rate of 37.5%. We are grateful to all the Program Committee members and the co-reviewers for their efforts in completing the reviews within a tight schedule.

In addition to the contributed papers, this year's program included two keynote speeches from authorities in academia and industry: Ruby B. Lee of Princeton University on *Processor Architecture for Trustworthy Computers*, and Jesse Z. Fang of Intel Corporation on *Challenges and Opportunities on Multicore Microprocessor*.

It was a rewarding experience to be the Program Chairs for this year's conference. We wish to take this opportunity to thank many people who contributed to making ACSAC 2005 a great success. Firstly, we thank the authors for submitting their work to this year's conference. We thank our efficient and energetic Organizing Committee. In particular, we would like to thank the Publicity Chairs, Vinod Prasad and Tulika Mitra, for having done a wonderful job in publicizing this conference and attracting a high number of submissions, the Web Chairs, Jiajia Chen and Xiaoyong Chen, for maintaining the online conference Web pages, and the Local Arrangements Chair, Douglas Maskell, for ensuring the smooth running of the conference in Singapore. We thank all the Program Committee members, who contributed considerable amounts of their valuable time. It was a great pleasure working with these esteemed members of our research community. We also thank all our sponsors for their support of this event.

Last, but not least, we would like to thank the General Chair, Graham Leedham, for his commitment and perseverance in this invaluable role.
We sincerely hope you will find these proceedings valuable and look forward to your participation in future ACSAC conferences.

August 2005

Thambipillai Srikanthan, Jingling Xue, Chip-Hong Chang

## Conference Organization

**General Chair**

Graham Leedham, Nanyang Technological University, Singapore

**Program Chairs**

Thambipillai Srikanthan, Nanyang Technological University, Singapore

Jingling Xue, University of New South Wales, Australia

**Publications Chair**

Chip-Hong Chang, Nanyang Technological University, Singapore

**Publicity Chairs**

Vinod Prasad, Nanyang Technological University, Singapore

Tulika Mitra, National University of Singapore, Singapore

**Local Arrangements Chair**

Douglas Maskell, Nanyang Technological University, Singapore

**Web Chairs**

Jiajia Chen, Nanyang Technological University, Singapore

Xiaoyong Chen, Nanyang Technological University, Singapore

## Program Committee

| Member | Affiliation |
|---|---|
| K. Vijayan Asari | Old Dominion University, USA |
| Eduard Ayguade | UPC, Spain |
| Sangyeun Paul Cho | University of Pittsburgh, USA |
| Lynn Choi | Korea University, Korea |
| Christopher T. Clarke | University of Bath, UK |
| Oliver Diessel | University of New South Wales, Australia |
| Jean-Luc Gaudiot | University of California, Irvine, USA |
| James Goodman | University of Auckland, New Zealand |
| Gernot Heiser | National ICT, Australia |
| Hock Beng Lim | National University of Singapore, Singapore |
| Wei-Chung Hsu | University of Minnesota, USA |
| Chris Jesshope | Universiteit van Amsterdam, Netherlands |
| Hong Jiang | University of Nebraska, Lincoln, USA |
| Sridharan K. | Indian Institute of Technology, Madras, India |
| Feipei Lai | National Taiwan University, Taiwan |
| Xiang Liu | Peking University, China |
| Balakrishnan M. | Indian Institute of Technology, Delhi, India |
| Philip Machanick | University of Queensland, Australia |
| John Morris | University of Auckland, New Zealand |
| Tadao Nakamura | Tohoku University, Japan |
| Sukumar Nandi | Indian Institute of Technology, Guwahati, India |
| Tin-Fook Ngai | Intel China Research Center, China |
| Andrew P. Paplinski | Monash University, Australia |
| Lalit M. Patnaik | Indian Institute of Science, India |
| Jih-Kwon Peir | University of Florida, USA |
| Damu Radhakrishnan | State University of New York, USA |
| Rajeev Thakur | Argonne National Laboratory, USA |
| Tanya Vladimirova | University of Surrey, Guildford, UK |
| Weng-Fai Wong | National University of Singapore, Singapore |
| Chengyong Wu | Institute of Computing Technology, CAS, China |
| Yuanyuan Yang | State University of New York at Stony Brook, USA |
| Pen-Chung Yew | University of Minnesota, USA |
| Weimin Zheng | Tsinghua University, China |

#### Co-reviewers

Pete Beckman, Dan Bonachea, Dmitry Brodsky, Darius Buntinas, Bin Cao, Francisco Cazorla, Ernie Chan, Yen Jen Chang, Howard Chen, William Chen, Kuen-Cheng Chiang, Archana Chidanandan, Young-Il Cho, Peter Chubb, Josep M. Codina, Xiaoru Dai, Abhinav Das, Ryusuke Egawa, Kevin Elphinstone, Rao Fu, Zhiguo Ge, Gabriel Ghinita, Qian-Ping Gu, Hui Guo, Yajuan He, Sangjin Hong, Shen Fu Hsiao, Sun-Yuan Hsien, Wei Hsu, Lei Huang, Wei Huo, Andhi Janapsatya, Yaocang Jia, Gui Jian, Priya T.K., Mahmut Kandemir, Jinpyo Kim, Lin Wen Koh, Shannon Koh, Anoop Kumar Krishna, Mong-Kai Ku, Hiroto Kukuchi, Edmund Lai Ming-Kit, Robert Latham, Jonghyun Lee, Sanghoon Lee, Jussipekka Leiwo, Simon Leung, Xiaobin Li, Xueming Li, Jen-Chiun Lin, Jin Lin, Lin Liu, Shaoshan Liu, Xuli Liu, Jiwei Lu, Yujun Lu, Ming Ma, Usama Malik, Verdi March, Xavier Martorell, Guillaume Mercier, Nader Mohamed, Enric Morancho, Arun Nair, Mrinal Nath, Hau T. Ngo, Deng Pan, Kaustubh S. Patkar, Kolin Paul, Jorgen Peddersen, Marius Portmann, Daniel Potts, Felix Rauch, Tom Robertazzi, Shang-Jang Ruan, Sergio Ruocco, Esther Salami, Chunlei Sang, Olivero J. Santana, Seng Lin Shee, Menon Shibu, Mon-Chau Shie, David Snowdon, Dan Sorin, Ken-ichi Suzuki, Brian Toonen, Patchrawat Uthaisombut, Venka Kugan Vivekanandarajah, Shengyue Wang, Yiran Wang, Hui Wu, Bin Xiao, Chia-Lin Yang, Hongbo Yang, Min Yang, Kiren Yellajyosu, Kyueun Yi, Antonia Zhai, Ming Z.
Zhang, Yifeng Zhu

## **Table of Contents**

| Paper | Page |
|---|---|
| **Keynote Address I** | |
| Processor Architecture for Trustworthy Computers<br>Ruby B. Lee | 1 |
| **Session 1A: Energy Efficient and Power Aware Techniques** | |
| Efficient Voltage Scheduling and Energy-Aware Co-synthesis for Real-Time Embedded Systems<br>Amjad Mohsen, Richard Hofmann | 3 |
| Energy-Effective Instruction Fetch Unit for Wide Issue Processors<br>Juan L. Aragón, Alexander V. Veidenbaum | 15 |
| Rule-Based Power-Balanced VLIW Instruction Scheduling with Uncertainty<br>Shu Xiao, Edmund M.-K. Lai, A.B. Premkumar | 28 |
| An Innovative Instruction Cache for Embedded Processors<br>Cheol Hong Kim, Sung Woo Chung, Chu Shik Jhon | 41 |
| Dynamic Voltage Scaling for Power Aware Fast Fourier Transform (FFT) Processor<br>David Fitrio, Jugdutt (Jack) Singh, Aleksandar (Alex) Stojcevski | 52 |
| **Session 1B: Methodologies and Architectures for Application-Specific Systems** | |
| Design of an Efficient Multiplier-Less Architecture for Multi-dimensional Convolution<br>Ming Z. Zhang, Hau T. Ngo, Vijayan K. Asari | 65 |
| A Pipelined Hardware Architecture for Motion Estimation of H.264/AVC<br>Su-Jin Lee, Cheong-Ghil Kim, Shin-Dug Kim | 79 |
| Embedded Intelligent Imaging On-Board Small Satellites<br>Siti Yuhaniz, Tanya Vladimirova, Martin Sweeting | 90 |
| Architectural Enhancements for Color Image and Video Processing on Embedded Systems<br>Jongmyon Kim, D. Scott Wills, Linda M. Wills | 104 |
| A Portable Doppler Device Based on a DSP with High-Performance Spectral Estimation and Output<br>Yufeng Zhang, Yi Zhou, Jianhua Chen, Xinling Shi, Zhenyu Guo | 118 |
| **Session 2A: Processor Architectures and Microarchitectures** | |
| A Power-Efficient Processor Core for Reactive Embedded Applications<br>Lei Yang, Morteza Biglari-Abhari, Zoran Salcic | 131 |
| A Stream Architecture Supporting Multiple Stream Execution Models<br>Nan Wu, Mei Wen, Haiyan Li, Li Li, Chunyuan Zhang | 143 |
| The Challenges of Massive On-Chip Concurrency<br>Kostas Bousias, Chris Jesshope | 157 |
| FMRPU: Design of Fine-Grain Multi-context Reconfigurable Processing Unit<br>Jih-Ching Chiu, Ren-Bang Lin | 171 |
| **Session 2B: High-Reliability and Fault-Tolerant Architectures** | |
| Modularized Redundant Parallel Virtual File System<br>Sheng-Kai Hung, Yarsun Hsu | 186 |
| Resource-Driven Optimizations for Transient-Fault Detecting SuperScalar Microarchitectures<br>Jie S. Hu, G.M. Link, Johnsy K. John, Shuai Wang, Sotirios G. Ziavras | 200 |
| A Fault-Tolerant Routing Strategy for Fibonacci-Class Cubes<br>Xinhua Zhang, Peter K.K. Loh | 215 |
| Embedding of Cycles in the Faulty Hypercube<br>Sun-Yuan Hsieh | 229 |
| **Session 3A: Compiler and OS for Emerging Architectures** | |
| … Features<br>Canqun Yang, Xuejun Yang, Jingling Xue | 236 |
| An Integrated Partitioning and Scheduling Based Branch Decoupling<br>Pramod Ramarao, Akhilesh Tyagi | 252 |
| A Register Allocation Framework for Banked Register Files with Access Constraints<br>Feng Zhou, Junchao Zhang, Chengyong Wu, Zhaoqing Zhang | 269 |
| Designing a Concurrent Hardware Garbage Collector for Small Embedded Systems<br>Flavius Gruian, Zoran Salcic | 281 |
| Irregular Redistribution Scheduling by Partitioning Messages<br>Chang Wu Yu, Ching-Hsien Hsu, Kun-Ming Yu, C.-K. Liang, Chun-I Chen | 295 |
| **Session 3B: Data Value Predictions** | |
| Making Power-Efficient Data Value Predictions<br>Yong Xiao, Xingming Zhou, Kun Deng | 310 |
| Speculative Issue Logic<br>You-Jan Tsai, Jong-Jiann Shieh | 323 |
| Using Decision Trees to Improve Program-Based and Profile-Based Static Branch Prediction<br>Veerle Desmet, Lieven Eeckhout, Koen De Bosschere | 336 |
| Arithmetic Data Value Speculation<br>Daniel R. Kelly, Braden J. Phillips | 353 |
| Exploiting Thread-Level Speculative Parallelism with Software Value Prediction<br>Xiao-Feng Li, Chen Yang, Zhao-Hui Du, Tin-Fook Ngai | 367 |
| **Keynote Address II** | |
| Challenges and Opportunities on Multi-core Microprocessor<br>Jesse Fang | 389 |
| **Session 4A: Reconfigurable Computing Systems and Polymorphic Architectures** | |
| Software-Oriented System-Level Simulation for Design Space Exploration of Reconfigurable Architectures<br>K.S. Tham, D.L. Maskell | 391 |
| A Switch Wrapper Design for SNA On-Chip-Network<br>Jiho Chang, Jongsu Yi, JunSeong Kim | 405 |
| A Configuration System Architecture Supporting Bit-Stream Compression for FPGAs<br>Marco Della Torre, Usama Malik, Oliver Diessel | 415 |
| Biological Sequence Analysis with Hidden Markov Models on an FPGA<br>Jacop Yanto, Timothy F. Oliver, Bertil Schmidt, Douglas L. Maskell | 429 |
| FPGAs for Improved Energy Efficiency in Processor Based Systems<br>P.C. Kwan, C.T. Clarke | 440 |
| Morphable Structures for Reconfigurable Instruction Set Processors<br>Siew-Kei Lam, Deng Yun, Thambipillai Srikanthan | 450 |
| **Session 4B: Interconnect Networks and Network Interfaces** | |
| Implementation of a Hybrid TCP/IP Offload Engine Prototype<br>Hankook Jang, Sang-Hwa Chung, Soo-Cheol Oh | 464 |
| Matrix-Star Graphs: A New Interconnection Network Based on Matrix Operations<br>Hyeong-Ok Lee, Jong-Seok Kim, Kyoung-Wook Park, Jeonghyun Seo, Eunseuk Oh | 478 |
| The Channel Assignment Algorithm on RP(k) Networks<br>Fang'ai Liu, Xinhua Wang, Liancheng Xu | 488 |
| Extending Address Space of IP Networks with Hierarchical Addressing<br>Tingrong Lu, Chengcheng Sui, Yushu Ma, Jinsong Zhao, Yongtian Yang | 499 |
| The Star-Pyramid Graph: An Attractive Alternative to the Pyramid<br>N. Imani, H. Sarbazi-Azad | 509 |
| Building a Terabit Router with XD Networks<br>Huaxi Gu, Zengji Liu, Jungang Yang, Zhiliang Qiu, Guochang Kang | 520 |
| **Session 5A: Parallel Architectures and Computation Models** | |
| A Real Coded Genetic Algorithm for Data Partitioning and Scheduling in Networks with Arbitrary Processor Release Time<br>S. Suresh, V. Mani, S.N. Omkar, H.J. Kim | 529 |
| D3DPR: A Direct3D-Based Large-Scale Display Parallel Rendering System Architecture for Clusters<br>Zhen Liu, Jiaoying Shi, Haoyu Peng, Hua Xiong | 540 |
| Determining Optimal Grain Size for Efficient Vector Processing on SIMD Image Processing Architectures<br>Jongmyon Kim, D. Scott Wills, Linda M. Wills | 551 |
| A Technique to Reduce Preemption Overhead in Real-Time Multiprocessor Task Scheduling<br>Kyong Jo Jung, Chanik Park | 566 |
| **Session 5B: Hardware-Software Partitioning, Verification, and Testing of Complex Architectures** | |
| Minimizing Power in Hardware/Software Partitioning<br>Jigang Wu, Thambipillai Srikanthan, Chengbin Yan | 580 |
| Exploring Design Space Using Transaction Level Models<br>Youhui Zhang, Dong Liu, Yu Gu, Dongsheng Wang | 589 |
| Increasing Embedding Probabilities of RPRPs in RIN Based BIST<br>Dong-Sup Song, Sungho Kang | 600 |
| A Practical Test Scheduling Using Network-Based TAM in Network on Chip Architecture<br>Jin-Ho Ahn, Byung In Moon, Sungho Kang | 614 |
| **Session 6A: Architectures for Secured Computing** | |
| DRIL – A Flexible Architecture for Blowfish Encryption Using Dynamic Reconfiguration, Replication, Inner-Loop Pipelining, Loop Folding Techniques<br>T.S.B. Sudarshan, Rahil Abbas Mir, S. Vijayalakshmi | 625 |
| Efficient Architectural Support for Secure Bus-Based Shared Memory Multiprocessor<br>Khaled Z. Ibrahim | 640 |
| Covert Channel Analysis of the Password-Capability System<br>Dan Mossop, Ronald Pose | 655 |
| **Session 6B: Simulation and Performance Evaluation** | |
| Comparing Low-Level Behavior of SPEC CPU and Java Workloads<br>Andy Georges, Lieven Eeckhout, Koen De Bosschere | 669 |
| Application of Real-Time Object-Oriented Modeling Technique for Real-Time Computer Control<br>Jong-Sun Kim, Ji-Yoon Yoo | 680 |
| VLSI Performance Evaluation and Analysis of Systolic and Semisystolic Finite Field Multipliers<br>Ravi Kumar Satzoda, Chip-Hong Chang | 693 |
| **Session 7: Architectures for Emerging Technologies and Applications I** | |
| Analysis of Real-Time Communication System with Queuing Priority<br>Yunbo Wu, Zhishu Li, Yunhai Wu, Zhihua Chen, Tun Lu, Li Wang, Jianjun Hu | 707 |
| FPGA Implementation and Analyses of Cluster Maintenance Algorithms in Mobile Ad-Hoc Networks<br>Sai Ganesh Gopalan, Venkataraman Gayathri, Sabu Emmanuel | 714 |
| A Study on the Performance Evaluation of Forward Link in CDMA Mobile Communication Systems<br>Sun-Kuk Noh | 728 |
| **Session 8: Memory Systems Hierarchy and Management** | |
| Cache Leakage Management for Multi-programming Workloads<br>Chun-Yang Chen, Chia-Lin Yang, Shih-Hao Hung | 736 |
| A Memory Bandwidth Effective Cache Store Miss Policy<br>Hou Rui, Fuxin Zhang, Weiwu Hu | 750 |
| Application-Specific Hardware-Driven Prefetching to Improve Data Cache Performance<br>Mehdi Modarressi, Maziar Goudarzi, Shaahin Hessabi | 761 |
| Targeted Data Prefetching<br>Weng-Fai Wong | 775 |
| **Session 9: Architectures for Emerging Technologies and Applications II** | |
| Area-Time Efficient Systolic Architecture for the DCT<br>Pramod Kumar Meher | 787 |
| Efficient VLSI Architectures for Convolution and Lifting Based 2-D Discrete Wavelet Transform<br>Gab Cheon Jung, Seong Mo Park, Jung Hyoun Kim | 795 |
| A Novel Reversible TSG Gate and Its Application for Designing Reversible Carry Look-Ahead and Other Adder Architectures<br>Himanshu Thapliyal, M.B. Srinivas | 805 |
| Implementation and Analysis of TCP/IP Offload Engine and RDMA Transfer Mechanisms on an Embedded System<br>In-Su Yoon, Sang-Hwa Chung | 818 |
| **Author Index** | 831 |