DOI: 10.1145/1449764.1449778
research-article

Contention-aware scheduler: unlocking execution parallelism in multithreaded Java programs

Published: 19 October 2008

Abstract

In multithreaded programming, locks are frequently used as a mechanism for synchronization. Because today's operating systems do not consider lock usage as a scheduling criterion, scheduling decisions can be unfavorable to multithreaded applications, leading to performance issues such as convoying and heavy lock contention in systems with multiple processors. Previous efforts to address these issues (e.g., transactional memory, lock-free data structures) often treat scheduling decisions as "a fact of life," and therefore try to cope with the consequences of undesirable scheduling instead of dealing with the problem directly.
In this paper, we introduce the Contention-Aware Scheduler (CA-Scheduler), which is designed to support efficient execution of large multithreaded Java applications in multiprocessor systems. Our proposed scheduler employs a scheduling policy that reduces lock contention. As will be shown in this paper, our prototype implementation of the CA-Scheduler in Linux and the Sun HotSpot virtual machine incurs only 3.5% runtime overhead, while the overall performance differences, when compared with a system with no contention awareness, range from a degradation of 3% in a small multithreaded benchmark to an improvement of 15% in a large Java application server benchmark.
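To make the term "lock contention" concrete, the following is a minimal Java sketch of the problem the abstract describes, not of the CA-Scheduler itself. Several threads increment a shared counter under a single lock, and a failed `tryLock()` is counted as a contended acquisition. The class name `ContentionDemo`, the `Result` holder, and the thread and iteration counts are hypothetical names chosen for illustration; none of them comes from the paper.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only: shows contention on one lock, not the paper's scheduler.
final class ContentionDemo {
    // Final counter value and number of acquisitions that found the lock held.
    static final class Result {
        final long counter;
        final long contended;
        Result(long counter, long contended) {
            this.counter = counter;
            this.contended = contended;
        }
    }

    static Result run(int threads, int itersPerThread) {
        final ReentrantLock lock = new ReentrantLock();
        final long[] counter = {0};              // guarded by lock
        final AtomicLong contended = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < itersPerThread; i++) {
                    // Try the fast path first; a failure means another
                    // thread currently holds the lock (contention).
                    if (!lock.tryLock()) {
                        contended.incrementAndGet();
                        lock.lock();             // block until available
                    }
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
        }
        for (Thread w : workers) {
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();                        // join makes counter[0] visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return new Result(counter[0], contended.get());
    }

    public static void main(String[] args) {
        Result r = run(4, 100_000);
        System.out.println("counter=" + r.counter + " contended=" + r.contended);
    }
}
```

The final counter value is always `threads * itersPerThread` because the lock serializes the increments. The contended count varies from run to run and depends on how the operating system happens to schedule the threads, which is the dependency the abstract argues a scheduler should take into account.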


Published In

OOPSLA '08: Proceedings of the 23rd ACM SIGPLAN conference on Object-oriented programming systems languages and applications
October 2008
654 pages
ISBN: 9781605582153
DOI: 10.1145/1449764
  • ACM SIGPLAN Notices, Volume 43, Issue 10
    September 2008
    613 pages
    ISSN: 0362-1340
    EISSN: 1558-1160
    DOI: 10.1145/1449955
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. java
  2. operating systems
  3. scheduling

Qualifiers

  • Research-article

Conference

OOPSLA08
Acceptance Rates

Overall Acceptance Rate 268 of 1,244 submissions, 22%


Cited By

  • (2018) Characterizing and optimizing hotspot parallel garbage collection on multicore systems. Proceedings of the Thirteenth EuroSys Conference, pages 1-15. DOI: 10.1145/3190508.3190512. Online publication date: 23-Apr-2018
  • (2017) One Process to Reap Them All. ACM SIGPLAN Notices, 52:7, pages 171-186. DOI: 10.1145/3140607.3050754. Online publication date: 8-Apr-2017
  • (2017) One Process to Reap Them All. Proceedings of the 13th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, pages 171-186. DOI: 10.1145/3050748.3050754. Online publication date: 8-Apr-2017
  • (2017) Jumbler: A lock-contention aware thread scheduler for multi-core parallel machines. 2017 International Conference on Recent Advances in Signal Processing, Telecommunications & Computing (SigTelCom), pages 77-81. DOI: 10.1109/SIGTELCOM.2017.7849799. Online publication date: Jan-2017
  • (2015) Tumbler. ACM Transactions on Architecture and Code Optimization, 12:4, pages 1-24. DOI: 10.1145/2827698. Online publication date: 16-Nov-2015
  • (2015) AIR. ACM Transactions on Modeling and Computer Simulation, 25:3, pages 1-25. DOI: 10.1145/2701420. Online publication date: 16-Apr-2015
  • (2015) Requester-Based Spin Lock: A Scalable and Energy Efficient Locking Scheme on Multicore Systems. IEEE Transactions on Computers, 64:1, pages 166-179. DOI: 10.1109/TC.2013.196. Online publication date: Jan-2015
  • (2014) Continuously measuring critical section pressure with the free-lunch profiler. ACM SIGPLAN Notices, 49:10, pages 291-307. DOI: 10.1145/2714064.2660210. Online publication date: 15-Oct-2014
  • (2014) Lock contention aware thread migrations. ACM SIGPLAN Notices, 49:8, pages 369-370. DOI: 10.1145/2692916.2555273. Online publication date: 6-Feb-2014
  • (2014) Continuously measuring critical section pressure with the free-lunch profiler. Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, pages 291-307. DOI: 10.1145/2660193.2660210. Online publication date: 15-Oct-2014
