Partitioning the Conventional DBT System for Multiprocessors

  • Regular Paper
  • Journal of Computer Science and Technology

Abstract

Performance gains from ever-increasing transistor counts are becoming hard to realize, because software cannot readily and efficiently exploit hardware resources such as multiple cores. Dynamic binary translation (DBT) systems face the same problem. Special-purpose hardware, attached as an auxiliary tool through interfaces provided by the DBT system, can deliver higher performance, but it has a significant limitation: it is tied to one platform and cannot be widely reused on others. To overcome this drawback, we focus on building a compatible software architecture that achieves higher performance without platform dependence. In this paper, we propose a novel multithreaded architecture for DBT systems that partitions the system into distinct function modules in order to make full use of multiprocessor resources. The new architecture decouples the working routine of a common DBT system (DBTs) into dynamic translation, optimization, and translated-code execution phases, and assigns them to different threads so that they execute concurrently. Within this architecture, we present several efficient methods for problems that are difficult in such a design, such as the communication mechanism, the cache layout, and mutual exclusion between threads. Experimental results on SPECint 2000 indicate that the new architecture speeds up the traditional DBT system by about 10.75% on average, with better CPU utilization.
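
The partitioning described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy guest instruction format, the add-folding "optimization", and every name in it (GUEST, CodeCache, translator, optimizer, run) are assumptions made for illustration only. It shows translation, optimization, and execution as separate threads that communicate through queues and share a code cache guarded by a lock. CPython threads do not run bytecode in parallel, so the sketch shows the structure rather than the speedup.

```python
import queue
import threading

# Hypothetical guest program: block address -> list of (opcode, operand).
GUEST = {0: [("add", 1), ("add", 2)], 1: [("add", 3), ("mul", 2)]}
STOP = object()  # sentinel that shuts the worker threads down


class CodeCache:
    """Shared code cache; a condition variable provides mutual exclusion."""

    def __init__(self):
        self._cond = threading.Condition()
        self._blocks = {}  # block address -> translated ops

    def install(self, pc, ops, optimized):
        with self._cond:
            # An unoptimized block never overwrites an existing entry.
            if optimized or pc not in self._blocks:
                self._blocks[pc] = ops
            self._cond.notify_all()

    def lookup(self, pc):
        with self._cond:
            return self._blocks.get(pc)

    def wait_for(self, pc):
        with self._cond:
            self._cond.wait_for(lambda: pc in self._blocks)
            return self._blocks[pc]


def fold_adds(ops):
    """Toy optimization: merge consecutive add operations."""
    out = []
    for op, arg in ops:
        if op == "add" and out and out[-1][0] == "add":
            out[-1] = ("add", out[-1][1] + arg)
        else:
            out.append((op, arg))
    return out


def translator(guest, cache, translate_q, optimize_q):
    while True:
        pc = translate_q.get()
        if pc is STOP:
            optimize_q.put(STOP)  # forward shutdown to the optimizer
            return
        if cache.lookup(pc) is None:
            cache.install(pc, list(guest[pc]), optimized=False)
            optimize_q.put(pc)  # hand the new block to the optimizer


def optimizer(cache, optimize_q):
    while True:
        pc = optimize_q.get()
        if pc is STOP:
            return
        cache.install(pc, fold_adds(cache.lookup(pc)), optimized=True)


def run(guest, trace):
    """Execution thread: runs cached blocks, requesting translation on a miss."""
    cache = CodeCache()
    translate_q, optimize_q = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=translator,
                         args=(guest, cache, translate_q, optimize_q)),
        threading.Thread(target=optimizer, args=(cache, optimize_q)),
    ]
    for w in workers:
        w.start()
    acc = 0
    for pc in trace:
        ops = cache.lookup(pc)
        if ops is None:
            translate_q.put(pc)
            ops = cache.wait_for(pc)
        for op, arg in ops:
            acc = acc + arg if op == "add" else acc * arg
    translate_q.put(STOP)
    for w in workers:
        w.join()
    return acc, cache
```

Calling run(GUEST, [0, 1, 0]) executes block 0, then block 1, then block 0 again. The result is the same whether the execution thread picks up the unoptimized or the optimized version of a block, which is the property that lets the optimizer replace cache entries while execution proceeds.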

Author information

Corresponding author

Correspondence to Ru-Hui Ma.

Additional information

This work was supported by the National Natural Science Foundation of China under Grant Nos. 60970108 and 60970107, the Science and Technology Commission of Shanghai Municipality under Grant Nos. 09510701600, 10DZ1500200, and 10511500102, IBM SUR Funding, and IBM Research-China JP Funding.

About this article

Cite this article

Ma, RH., Guan, HB., Zhu, EZ. et al. Partitioning the Conventional DBT System for Multiprocessors. J. Comput. Sci. Technol. 26, 474–490 (2011). https://doi.org/10.1007/s11390-011-1148-1

  • DOI: https://doi.org/10.1007/s11390-011-1148-1
