DOI: 10.1145/3135974.3135986
research-article

POLM2: Automatic Profiling for Object Lifetime-Aware Memory Management for HotSpot Big Data Applications

Published: 11 December 2017

ABSTRACT

Big Data applications suffer from unpredictable and unacceptably high pause times caused by poor memory management (Garbage Collection, GC) decisions. The problem affects all applications, but it matters most for those with low pause time requirements, such as credit-card fraud detection or targeted website advertisement systems, which can easily violate Service Level Agreements because of long GC cycles (during which the application is stopped). The problem has been identified before and stems from Big Data applications keeping massive amounts of data objects in memory for a long time (from the GC's perspective).

Memory management approaches have been proposed to reduce the GC pause time by allocating objects with similar lifetimes close to each other. However, these approaches do not provide a general solution for all types of Big Data applications (solving the problem only for a specific set of applications), require programmer effort and knowledge to change or annotate the application code, or both.

This paper proposes POLM2, a profiler that automatically: i) estimates application allocation profiles based on execution records, and ii) instruments application bytecode to help the GC take advantage of the profiling information. Thus, no programmer effort is required to change the source code to allocate objects according to their lifetimes. POLM2 is implemented for the OpenJDK HotSpot Java Virtual Machine 8 and uses NG2C, a recently proposed GC that supports multi-generational pretenuring. Results show that POLM2 is able to: i) achieve pauses as low as NG2C (which requires manual source code modification), and ii) reduce application pauses by up to 80% when compared to G1 (the default collector in OpenJDK). POLM2 does not negatively impact either application throughput or memory utilization.
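To make the first step more concrete, the following is a minimal sketch of how an allocation profile might be estimated from execution records. It is not POLM2's actual algorithm or record format: the `(site, survived_collections)` record shape, the site labels, the median-based rule, and the `max_generation` cap are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import median


def estimate_generations(records, max_generation=3):
    """Map each allocation site to a target generation.

    records: iterable of (site, survived_collections) pairs, one per
    profiled object. The record shape and the median-based rule are
    assumptions for this sketch, not POLM2's real format or estimator.
    """
    survivals_by_site = defaultdict(list)
    for site, survived in records:
        survivals_by_site[site].append(survived)

    profile = {}
    for site, survivals in survivals_by_site.items():
        # Use the median so a few unusual objects do not dominate.
        typical = median(survivals)
        # Generation 0 stands for the young generation; longer-lived
        # sites get a higher generation, capped at max_generation.
        profile[site] = min(int(typical), max_generation)
    return profile


# Hypothetical execution records: (allocation site, GC cycles survived).
records = [
    ("Cache.put:42", 9), ("Cache.put:42", 12), ("Cache.put:42", 10),
    ("Request.parse:17", 0), ("Request.parse:17", 0), ("Request.parse:17", 1),
    ("Batch.load:88", 2), ("Batch.load:88", 2),
]
profile = estimate_generations(records)
```

In this toy profile, the cache-like site would be pretenured into the oldest generation, while the request-parsing site stays in the young generation. A bytecode instrumentation step, as described in the abstract, could then consume such a site-to-generation map instead of relying on manual source annotations.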


  • Published in

    Middleware '17: Proceedings of the 18th ACM/IFIP/USENIX Middleware Conference
    December 2017
    268 pages
    ISBN:9781450347204
    DOI:10.1145/3135974

    Copyright © 2017 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Acceptance Rates

    Middleware '17 paper acceptance rate: 20 of 85 submissions, 24%
    Overall acceptance rate: 203 of 948 submissions, 21%
