Enhancing Microkernel Performance on VLIW DSP Processors via Multiset Context Switch

Journal of Signal Processing Systems

Abstract

High-performance, low-power VLIW DSP processors are increasingly deployed in mobile devices to process video and multimedia applications. The diverse applications of such systems have led to recent research efforts focusing on their resource management and kernel scheduling. In this paper, we address enhancing the performance of a microkernel for VLIW DSP processors based on the PAC architecture. To reduce the number of read and write ports in the register files of VLIW architectures, and thereby reduce both power consumption and implementation cost, the PAC architecture adopts distributed register files and a multibank register organization. These design choices make it harder for a microkernel to keep context switch overhead low. We propose a multiset descriptor mechanism with compiler support to reduce the context switch overhead associated with the use of registers. The experiments were carried out with a microkernel called pCore, whose compact and efficient design keeps its code size under 11 Kbytes. Experimental results show that our multiset context-switching mechanism can reduce context switch overhead by up to 30%.
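The abstract does not spell out how the multiset descriptor works, so the following is only a rough sketch of the general idea it suggests: if the compiler can tell the kernel which register sets a task actually uses, a context switch needs to save and restore only those sets. The register-set names and sizes, the `Task` class, and every function name here are hypothetical and are not taken from the paper, pCore, or the PAC architecture.

```python
from dataclasses import dataclass, field

# Hypothetical register-set layout: names and sizes are illustrative only,
# not the actual PAC register organization.
REGISTER_SETS = {"A": 8, "B": 8, "C": 8, "D": 8}


@dataclass
class Task:
    name: str
    used_sets: frozenset  # descriptor: sets a compiler reports as used (assumed)
    saved: dict = field(default_factory=dict)


def save_context(task, regfile):
    """Copy only the register sets named in the task's descriptor."""
    task.saved = {s: list(regfile[s]) for s in task.used_sets}
    return sum(REGISTER_SETS[s] for s in task.used_sets)  # words copied


def restore_context(task, regfile):
    """Write back only the sets previously saved for this task."""
    for s, values in task.saved.items():
        regfile[s] = list(values)
    return sum(len(v) for v in task.saved.values())  # words copied


def context_switch(outgoing, incoming, regfile):
    """Return the number of register words moved by this switch."""
    return save_context(outgoing, regfile) + restore_context(incoming, regfile)


regfile = {s: [0] * n for s, n in REGISTER_SETS.items()}
t1 = Task("filter", frozenset({"A", "B"}))
t2 = Task("control", frozenset({"A"}))

regfile["A"] = [1] * 8                      # t1 runs and writes set A
moved = context_switch(t1, t2, regfile)     # t2 has nothing saved yet
regfile["A"] = [2] * 8                      # t2 runs and overwrites set A
moved_back = context_switch(t2, t1, regfile)
full = 2 * sum(REGISTER_SETS.values())      # naive save-all plus restore-all
```

In this toy setup a switch moves at most 24 register words, against 64 for a save-everything, restore-everything switch. The numbers only illustrate why per-task descriptors can cut the register traffic of a context switch; they say nothing about the 30% figure reported in the abstract.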

Author information

Corresponding author

Correspondence to Jenq-Kuen Lee.

About this article

Cite this article

Hsieh, KY., Lin, YC., Huang, CC. et al. Enhancing Microkernel Performance on VLIW DSP Processors via Multiset Context Switch. J Sign Process Syst Sign Image 51, 257–268 (2008). https://doi.org/10.1007/s11265-007-0060-y

  • DOI: https://doi.org/10.1007/s11265-007-0060-y
