Abstract
High-performance, low-power VLIW DSP processors are increasingly deployed in mobile devices to process video and multimedia applications. The diversity of applications on such systems has led to recent research on their resource management and kernel scheduling. In this paper, we address enhancing the performance of a microkernel for the PAC architecture, a VLIW DSP processor. To reduce the number of read and write ports in the register files of VLIW architectures, and thereby both power consumption and implementation cost, the PAC architecture adopts distributed register files and a multibank register architecture. These design choices make it challenging for a microkernel to keep context switch overhead low. We propose a multiset descriptor mechanism with compiler support to reduce the context switch overhead associated with register use. The experiments were conducted with pCore, a microkernel whose compact, efficient design keeps its code size under 11 Kbytes. Experimental results show that our multiset context-switching mechanism can reduce context switch overhead by up to 30%.
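The abstract does not give the format of the multiset descriptor, so the following is only a minimal sketch of the general idea, under stated assumptions: the registers are partitioned into sets, a per-task bitmask (which a compiler could in principle emit) marks the sets the task actually uses, and the context switch copies only those sets. The names `task_ctx`, `live_sets`, `ctx_save`, and `ctx_restore`, the set count, and the simulated `regfile` array are all hypothetical and are not taken from pCore or the PAC toolchain.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_SETS     4   /* hypothetical number of register sets */
#define REGS_PER_SET 8   /* hypothetical registers per set */

/* Simulated register file; real hardware registers would be accessed
 * with load/store instructions instead of memcpy. */
static uint32_t regfile[NUM_SETS][REGS_PER_SET];

/* Per-task context. live_sets is the assumed descriptor:
 * bit s set means register set s is used by the task. */
typedef struct {
    uint8_t  live_sets;
    uint32_t saved[NUM_SETS][REGS_PER_SET];
} task_ctx;

/* Save only the register sets marked in the descriptor.
 * Returns the number of registers copied. */
static int ctx_save(task_ctx *t)
{
    assert(t != NULL);
    int copied = 0;
    for (int s = 0; s < NUM_SETS; s++) {
        if (t->live_sets & (1u << s)) {
            memcpy(t->saved[s], regfile[s], sizeof regfile[s]);
            copied += REGS_PER_SET;
        }
    }
    return copied;
}

/* Restore only the register sets marked in the descriptor.
 * Returns the number of registers copied. */
static int ctx_restore(const task_ctx *t)
{
    assert(t != NULL);
    int copied = 0;
    for (int s = 0; s < NUM_SETS; s++) {
        if (t->live_sets & (1u << s)) {
            memcpy(regfile[s], t->saved[s], sizeof regfile[s]);
            copied += REGS_PER_SET;
        }
    }
    return copied;
}
```

In this sketch, a task whose descriptor marks two of the four sets copies 16 of the 32 simulated registers on each save or restore. How closely this corresponds to the savings reported for pCore depends on details the abstract does not state.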
Cite this article
Hsieh, KY., Lin, YC., Huang, CC. et al. Enhancing Microkernel Performance on VLIW DSP Processors via Multiset Context Switch. J Sign Process Syst Sign Image 51, 257–268 (2008). https://doi.org/10.1007/s11265-007-0060-y
DOI: https://doi.org/10.1007/s11265-007-0060-y