Hardware Acceleration of Red-Black Tree Management and Application to Just-In-Time Compilation

Carbon, Alexandre; Lhuillier, Yves; Charles, Henri-Pierre

doi:10.1007/s11265-014-0902-3

Published: 06 June 2014

Volume 77, pages 95–115, (2014)

Journal of Signal Processing Systems

Alexandre Carbon¹, Yves Lhuillier¹ & Henri-Pierre Charles²,³

Abstract

Driven by sustained consumer demand for more complex applications, embedded systems have grown in both complexity and heterogeneity. Their architectures often combine several kinds of computing resources (DSPs, GPUs, etc.). As a consequence, software designers face significant performance and portability issues when targeting these devices. Software increasingly relies on virtualization technologies to maximize application portability. To balance portability and performance, most virtualization technologies use just-in-time (JIT) compilation to generate runtime-optimized code from portable code. Nevertheless, the efficiency of JIT compilation depends on the ability to offset its overhead with execution speedups of the generated code. While most research efforts focus on limiting the overhead of JIT compilation phases by reducing how often they occur, this paper investigates opportunities for speeding up JIT compilation itself. We first present a performance analysis of different JIT compilation technologies in order to identify hardware and software optimization opportunities. Second, we propose a solution based on a dedicated processor with specialized instructions for critical functions of JIT compilers. These specialized instructions provide an average 5× speedup on manipulations of associative arrays and dynamic memory allocation. Based on the LLVM framework, we show a 15% overall speedup on the code generator’s execution time. Because our specialized instructions are hidden behind standard libraries, we also argue that they may be transparently reused for a wider range of applications.
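The abstract reports a 5× local speedup and a 15% overall speedup but does not say what share of code-generation time the accelerated functions represent. A rough back-of-envelope estimate can be obtained by inverting Amdahl's law. The sketch below is illustrative only: the function names are hypothetical, it assumes the 5× figure applies uniformly to the accelerated portion and that nothing else changes, and it covers two possible readings of "15% speedup", since the abstract does not specify which is meant.

```python
def overall_speedup(fraction: float, local_speedup: float) -> float:
    """Amdahl's law: overall speedup when `fraction` of the original
    run time is accelerated by `local_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


def accelerated_fraction(local_speedup: float, overall: float) -> float:
    """Invert Amdahl's law to recover the fraction of original run time
    spent in the accelerated code.

    overall = 1 / ((1 - f) + f / local)
      =>  f = (1 - 1/overall) / (1 - 1/local)
    """
    if local_speedup <= 1.0 or overall < 1.0:
        raise ValueError("speedups must be >= 1 (local strictly > 1)")
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / local_speedup)


if __name__ == "__main__":
    local = 5.0  # average speedup on the accelerated functions (from the abstract)

    # Reading A (assumption): "15% speedup" means the code generator runs 1.15x faster.
    f_a = accelerated_fraction(local, 1.15)

    # Reading B (assumption): "15% speedup" means 15% less execution time,
    # i.e. an overall speedup of 1 / 0.85.
    f_b = accelerated_fraction(local, 1.0 / 0.85)

    print(f"Reading A: accelerated share of original time = {f_a:.1%}")
    print(f"Reading B: accelerated share of original time = {f_b:.1%}")
```

Under these assumptions, the accelerated functions would account for roughly 16% to 19% of the original code-generation time. This is an inferred estimate, not a figure taken from the paper.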

Author information

Authors and Affiliations

CEA, LIST, Embedded Computing Laboratory, F-91191, Gif-sur-Yvette, France
Alexandre Carbon & Yves Lhuillier
Univ. Grenoble Alpes, F-38000, Grenoble, France
Henri-Pierre Charles
CEA, LIST, MINATEC Campus, F-38054, Grenoble, France
Henri-Pierre Charles

Corresponding author

Correspondence to Alexandre Carbon.

About this article

Cite this article

Carbon, A., Lhuillier, Y. & Charles, HP. Hardware Acceleration of Red-Black Tree Management and Application to Just-In-Time Compilation. J Sign Process Syst 77, 95–115 (2014). https://doi.org/10.1007/s11265-014-0902-3

Received: 13 September 2013
Revised: 14 April 2014
Accepted: 28 April 2014
Published: 06 June 2014
Issue Date: October 2014
DOI: https://doi.org/10.1007/s11265-014-0902-3
