DOI: 10.1145/2611354.2611368
tutorial

SPTU: Improving Dynamic Binary Translation through Software Prediction with Target Updating

Published: 30 June 2014

ABSTRACT

In dynamic translation systems, handling indirect branches is a major source of performance overhead, because the system must perform an on-the-fly address translation every time an indirect branch executes. Translation systems usually adopt software prediction to reduce the overhead of address translation, but low prediction accuracy limits the performance improvement.
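To make the baseline concrete, here is a minimal sketch of software prediction at a single indirect-branch site. It is an illustration only, not the implementation evaluated in the paper: the class name `SoftwarePredictor`, the dictionary `address_map` standing in for the translator's address-mapping table, and the policy of filling the slot on first execution are all assumptions.

```python
class SoftwarePredictor:
    """One indirect-branch site with a single prediction slot that is never updated.

    Hypothetical model: `address_map` maps source addresses to translated
    addresses and stands in for the translator's slow lookup path.
    """

    def __init__(self, address_map):
        self.address_map = address_map
        self.predicted_source = None   # source address the inline compare checks
        self.predicted_target = None   # translated address jumped to on a hit
        self.hits = 0
        self.misses = 0

    def resolve(self, source_target):
        # Fast path: one compare against the predicted source address.
        if source_target == self.predicted_source:
            self.hits += 1
            return self.predicted_target
        # Slow path: full address translation through the mapping table.
        self.misses += 1
        translated = self.address_map[source_target]
        if self.predicted_source is None:
            # Assumed policy: the first observed target becomes the prediction.
            self.predicted_source = source_target
            self.predicted_target = translated
        return translated
```

With a trace of three executions targeting one address followed by three targeting another, this model records 2 hits and 4 misses: once the branch moves to the second target, every execution falls back to the slow path.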

This paper analyzes the performance bottleneck of software prediction and proposes a novel prediction mechanism, Software Prediction with Target Updating (SPTU), which significantly improves prediction accuracy at an acceptable overhead. Based on the observation that branch targets exhibit phase behavior, SPTU adopts a coarse-grained target updating mechanism that updates the prediction targets at an appropriate frequency. SPTU uses the software prediction miss count to detect the phase status and triggers target updating only when the branch phase changes.
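The idea of miss-count-triggered target updating can be sketched as follows. The abstract does not give the detection rule, so this sketch assumes one: a counter of misses since the last hit, with a re-target once it reaches `miss_threshold`. The class name, `address_map`, `miss_threshold`, and the single-slot layout are hypothetical and are not taken from the paper.

```python
class TargetUpdatingPredictor:
    """Single-slot software prediction that re-targets after repeated misses.

    Assumption: a phase change is declared when `miss_threshold` misses occur
    with no hit in between. The paper's actual rule may differ.
    """

    def __init__(self, address_map, miss_threshold=2):
        self.address_map = address_map      # source address -> translated address
        self.miss_threshold = miss_threshold
        self.predicted_source = None
        self.predicted_target = None
        self.miss_count = 0                 # misses since the last hit or update
        self.hits = 0
        self.misses = 0
        self.updates = 0                    # phase-triggered re-targets only

    def _retarget(self, source_target, translated):
        self.predicted_source = source_target
        self.predicted_target = translated
        self.miss_count = 0

    def resolve(self, source_target):
        if source_target == self.predicted_source:
            self.hits += 1
            self.miss_count = 0             # a hit means the phase is stable
            return self.predicted_target
        self.misses += 1
        translated = self.address_map[source_target]
        if self.predicted_source is None:
            self._retarget(source_target, translated)   # initial fill
            return translated
        self.miss_count += 1
        if self.miss_count >= self.miss_threshold:
            # Assumed phase change: point the slot at the current target.
            self._retarget(source_target, translated)
            self.updates += 1
        return translated
```

On a trace of three executions targeting one address followed by five targeting another, with `miss_threshold=2`, this model records 5 hits, 3 misses, and 1 update. An isolated stray target costs one miss and leaves the prediction in place, which is the point of updating at a coarse granularity rather than on every miss.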

Experiments show that, compared with software prediction, SPTU improves the average prediction accuracy from 48.0% to 77.5% and reduces the performance overhead by 21.6% on average. Furthermore, SPTU can be combined with other optimization techniques for handling indirect branches.
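As a quick reading aid, the reported accuracies can be restated as miss rates. The figures below are arithmetic derived from the two accuracy numbers above, not results reported in the abstract; the subscripts SP and SPTU are labels chosen here.

```latex
\[
m_{\text{SP}} = 1 - 0.480 = 0.520, \qquad
m_{\text{SPTU}} = 1 - 0.775 = 0.225, \qquad
\frac{m_{\text{SP}} - m_{\text{SPTU}}}{m_{\text{SP}}} = \frac{0.295}{0.520} \approx 0.567
\]
```

Under this reading, the share of indirect branch executions that miss the prediction falls by roughly 57% in relative terms.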


    • Published in

      SYSTOR 2014: Proceedings of International Conference on Systems and Storage
      June 2014
      168 pages
      ISBN: 9781450329200
      DOI: 10.1145/2611354

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • tutorial
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 94 of 285 submissions, 33%
