ABSTRACT
We present techniques for eliminating dispatch overhead in a virtual machine interpreter using lightweight just-in-time native-code compilation. In the context of the Tcl VM, we convert bytecodes to native SPARC code by concatenating the native instructions that the VM uses to implement each bytecode instruction, which eliminates the dispatch loop. Furthermore, immediate arguments of bytecode instructions are substituted into the native code using runtime specialization. Native code output from the C compiler is not amenable to relocation by copying; the copied code must be fixed up to execute correctly. The dynamic instruction count improvement from eliding dispatch depends on the length, in native instructions, of each bytecode opcode implementation. These implementations are relatively long in Tcl, but dispatch is still a significant overhead. However, their length also causes our technique to overflow the instruction cache. Furthermore, our native compilation itself consumes run time. Some benchmarks run up to three times faster, but roughly half either slow down or exhibit little change.
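The two ideas in the abstract, catenation and operand specialization, can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the paper's implementation: the three-opcode stack machine and the names `interpret`, `catenate`, and `program` are hypothetical, and Python closures stand in for copied native-code templates. In real catenated native code the bodies sit end to end in memory and control simply falls through; the `for` loop below only approximates that.

```python
# Toy model of catenation and operand specialization.
# Assumption: PUSH/ADD/MUL, interpret, and catenate are hypothetical names;
# closures stand in for the native-code bodies a real system would copy.

PUSH, ADD, MUL = range(3)

def interpret(code):
    """Baseline: a dispatch loop that decodes every instruction at run time."""
    stack, pc = [], 0
    while pc < len(code):
        op = code[pc]                      # fetch and decode: dispatch overhead
        if op == PUSH:
            stack.append(code[pc + 1])     # immediate operand read at run time
            pc += 2
        elif op == ADD:
            stack.append(stack.pop() + stack.pop())
            pc += 1
        elif op == MUL:
            stack.append(stack.pop() * stack.pop())
            pc += 1
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack.pop()

def catenate(code):
    """Translate once: emit one body per bytecode instruction, in program order."""
    bodies, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if op == PUSH:
            # Operand specialization: the immediate is baked into the body,
            # so it is never fetched from the bytecode stream again.
            bodies.append(lambda s, k=code[pc + 1]: s.append(k))
            pc += 2
        elif op == ADD:
            bodies.append(lambda s: s.append(s.pop() + s.pop()))
            pc += 1
        elif op == MUL:
            bodies.append(lambda s: s.append(s.pop() * s.pop()))
            pc += 1
        else:
            raise ValueError(f"unknown opcode {op}")

    def run():
        stack = []
        for body in bodies:                # no opcode decode, no operand fetch
            body(stack)
        return stack.pop()
    return run

# (2 + 3) * 4
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL]
compiled = catenate(program)
print(interpret(program), compiled())      # both evaluate to 20
```

The model also hints at the costs the abstract mentions: `catenate` does work up front before anything runs, and the translated form holds one body per instruction occurrence rather than one per opcode, so it grows with program length.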