DOI: 10.1145/2555243.2555247
research-article

Eliminating global interpreter locks in Ruby through hardware transactional memory

Published: 06 February 2014

Abstract

Many scripting languages use a Global Interpreter Lock (GIL) to simplify the internal designs of their interpreters, but this kind of lock severely lowers multi-thread performance on multi-core machines. This paper presents our first results eliminating the GIL in Ruby using Hardware Transactional Memory (HTM) in the IBM zEnterprise EC12 and Intel 4th Generation Core processors. Though prior prototypes replaced a GIL with HTM, we tested realistic programs: the Ruby NAS Parallel Benchmarks (NPB), the WEBrick HTTP server, and Ruby on Rails. We devised a new technique to dynamically adjust the transaction lengths on a per-bytecode basis, so that we can balance the likelihood of transaction aborts against the relative overhead of the instructions to begin and end the transactions. Our results show that HTM achieved 1.9- to 4.4-fold speedups in the NPB programs over the GIL with 12 threads, and 1.6- and 1.2-fold speedups in WEBrick and Ruby on Rails, respectively. The dynamic transaction-length adjustment chose the best transaction lengths for any number of threads and applications with sufficiently long running times.
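
The abstract does not spell out the adjustment algorithm, so the following is only an illustrative sketch of the general idea: keep a transaction length per bytecode site and shorten it when transactions started there abort too often. It is a pure-Python simulation and uses no HTM instructions. Every name and constant in it (`YieldPoint`, the starting length of 255, the window of 100 transactions, the 5% abort threshold, the 3/4 shrink factor, and the per-bytecode abort model in `simulate`) is an assumption made for illustration, not a value taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class YieldPoint:
    """Hypothetical per-bytecode state: how many bytecodes one transaction spans."""
    length: int = 255   # assumed starting length, not from the paper
    started: int = 0    # transactions begun here in the current window
    aborted: int = 0    # how many of those aborted

    def record(self, did_abort: bool, window: int = 100,
               max_abort_ratio: float = 0.05) -> None:
        """Record one outcome; shrink the length if aborts are too frequent."""
        self.started += 1
        self.aborted += did_abort
        if self.started < window:
            return
        if self.aborted / self.started > max_abort_ratio and self.length > 1:
            # Shorter transactions touch less memory and abort less often,
            # at the cost of executing begin/end instructions more frequently.
            self.length = max(1, self.length * 3 // 4)
        # Start a fresh measurement window.
        self.started = 0
        self.aborted = 0


def simulate(abort_prob_per_bytecode: float, transactions: int, seed: int = 0) -> int:
    """Toy model: each bytecode inside a transaction aborts it independently."""
    rng = random.Random(seed)
    yp = YieldPoint()
    for _ in range(transactions):
        p_abort = 1 - (1 - abort_prob_per_bytecode) ** yp.length
        yp.record(rng.random() < p_abort)
    return yp.length
```

In this toy model, a site that never aborts keeps its long transactions and so pays the begin/end overhead rarely, while a site that aborts constantly is driven down to one bytecode per transaction.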


Published In

PPoPP '14: Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming
February 2014
412 pages
ISBN: 9781450326568
DOI: 10.1145/2555243
Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. global interpreter lock
  2. hardware transactional memory
  3. lock elision
  4. scripting language

Qualifiers

  • Research-article

Conference

PPoPP '14
Acceptance Rates

PPoPP '14 Paper Acceptance Rate 28 of 184 submissions, 15%;
Overall Acceptance Rate 230 of 1,014 submissions, 23%


Cited By

  • (2020) Design and Implementation of Network Monitoring System Back End for 3T Regions with Parallel Processing. 2020 14th International Conference on Telecommunication Systems, Services, and Applications (TSSA), pp. 1-5. DOI: 10.1109/TSSA51342.2020.9310825. Online publication date: 4-Nov-2020
  • (2019) Reflections on the compatibility, performance, and scalability of parallel Python. Proceedings of the 15th ACM SIGPLAN International Symposium on Dynamic Languages, pp. 91-103. DOI: 10.1145/3359619.3359747. Online publication date: 20-Oct-2019
  • (2018) Virtual machine design for parallel dynamic programming languages. Proceedings of the ACM on Programming Languages 2:OOPSLA, pp. 1-25. DOI: 10.1145/3276479. Online publication date: 24-Oct-2018
  • (2018) Parallelization of dynamic languages: synchronizing built-in collections. Proceedings of the ACM on Programming Languages 2:OOPSLA, pp. 1-30. DOI: 10.1145/3276478. Online publication date: 24-Oct-2018
  • (2016) Parallel virtual machines with RPython. ACM SIGPLAN Notices 52:2, pp. 48-59. DOI: 10.1145/3093334.2989233. Online publication date: 1-Nov-2016
  • (2016) Parallel virtual machines with RPython. Proceedings of the 12th Symposium on Dynamic Languages, pp. 48-59. DOI: 10.1145/2989225.2989233. Online publication date: 1-Nov-2016
  • (2016) Parallel Performance Problems on Shared-Memory Multicore Systems. IEEE Transactions on Software Engineering 42:8, pp. 764-785. DOI: 10.1109/TSE.2016.2519346. Online publication date: 1-Aug-2016
  • (2016) CodeOcean - A versatile platform for practical programming excercises in online environments. 2016 IEEE Global Engineering Education Conference (EDUCON), pp. 314-323. DOI: 10.1109/EDUCON.2016.7474573. Online publication date: Apr-2016
  • (2015) Quantitative comparison of hardware transactional memory for Blue Gene/Q, zEnterprise EC12, Intel Core, and POWER8. ACM SIGARCH Computer Architecture News 43:3S, pp. 144-157. DOI: 10.1145/2872887.2750403. Online publication date: 13-Jun-2015
  • (2015) Quantitative comparison of hardware transactional memory for Blue Gene/Q, zEnterprise EC12, Intel Core, and POWER8. Proceedings of the 42nd Annual International Symposium on Computer Architecture, pp. 144-157. DOI: 10.1145/2749469.2750403. Online publication date: 13-Jun-2015
