An approach to minimizing the interpretation overhead in Dynamic Binary Translation

The Journal of Supercomputing

Abstract

Dynamic Binary Translation (DBT) is widely used to convert binary code from one Instruction Set Architecture (ISA) to another at run time, optimizing the code when necessary. DBT systems often adopt a two-stage strategy that handles hot code by translation and cold code by interpretation to keep execution efficient. However, the overhead of interpretation remains excessively high. It has been observed that interpretation usually involves a large number of redundant redecoding operations: the same instruction is decoded again each time it is interpreted. This paper introduces an approach, the Decoded Instruction Cache (DICache), which caches the decoded information of previously interpreted instructions and attempts to reuse it as much as possible. Performance benchmarks have been carried out with both software and hardware implementations of DICache. The experimental results indicate that DICache removes a significant share of the redundant redecoding operations, which results in a dramatic reduction in interpretation overhead.
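The abstract does not describe how DICache is organized, so the following is only a minimal sketch of the general idea, caching decoded instructions by guest program counter so that an interpreter loop avoids redecoding. The toy ISA, the direct-mapped layout, and every name here (`DecodedInstructionCache`, `decode`, `interpret`) are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decoded:
    """Decoded form of one toy instruction (hypothetical format)."""
    op: str
    a: int
    b: int


class DecodedInstructionCache:
    """Direct-mapped cache of decoded instructions, indexed by guest PC.

    Illustrative assumption only; the paper's actual design may differ.
    """

    def __init__(self, num_entries=64):
        self.num_entries = num_entries
        self.tags = [None] * num_entries
        self.entries = [None] * num_entries
        self.hits = 0
        self.misses = 0

    def lookup(self, pc, raw, decode):
        idx = pc % self.num_entries
        # The tag includes the raw encoding so that a stale entry
        # (e.g. after self-modifying code) is not reused.
        tag = (pc, raw)
        if self.tags[idx] == tag:
            self.hits += 1
            return self.entries[idx]
        self.misses += 1
        decoded = decode(raw)
        self.tags[idx] = tag
        self.entries[idx] = decoded
        return decoded


def decode(raw):
    """Toy decoder: 'raw' is a 3-tuple standing in for encoded bytes."""
    op, a, b = raw
    return Decoded(op, a, b)


def interpret(program, cache):
    """Interpret the toy program, decoding through the cache."""
    regs = [0] * 4
    pc = 0
    while pc < len(program):
        ins = cache.lookup(pc, program[pc], decode)
        if ins.op == "li":        # regs[a] = b
            regs[ins.a] = ins.b
        elif ins.op == "addi":    # regs[a] += b
            regs[ins.a] += ins.b
        elif ins.op == "bnez":    # branch to b if regs[a] != 0
            if regs[ins.a] != 0:
                pc = ins.b
                continue
        elif ins.op == "halt":
            break
        pc += 1
    return regs


program = [
    ("li", 0, 5),     # r0 = 5 (loop counter)
    ("li", 1, 0),     # r1 = 0
    ("addi", 1, 2),   # r1 += 2
    ("addi", 0, -1),  # r0 -= 1
    ("bnez", 0, 2),   # loop back to pc 2 while r0 != 0
    ("halt", 0, 0),
]
cache = DecodedInstructionCache()
regs = interpret(program, cache)
print(regs[1], cache.hits, cache.misses)
```

In this toy run each of the six instructions is decoded once, and the remaining four iterations of the three-instruction loop body are served from the cache. A real decoder for a variable-length ISA is far costlier than this tuple unpacking, which is where such reuse would be expected to pay off.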

Author information

Corresponding author

Correspondence to Dan Chen.

About this article

Cite this article

Chen, W., Chen, D. & Wang, Z. An approach to minimizing the interpretation overhead in Dynamic Binary Translation. J Supercomput 61, 804–825 (2012). https://doi.org/10.1007/s11227-011-0636-y

  • DOI: https://doi.org/10.1007/s11227-011-0636-y
