Accurate Fork-Join Profiling on the Java Virtual Machine

  • Conference paper
Euro-Par 2022: Parallel Processing (Euro-Par 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13440))

Abstract

The fork-join model for parallel computing has become very popular and has been included in the Java class library since Java 7. While understanding and optimizing the performance of fork-join computations is of paramount importance, accurately profiling them on the Java Virtual Machine (JVM) is challenging due to the complexity of the API. In this paper, we present a novel model for analyzing fork-join computations on the JVM, addressing the peculiarities of the Java fork-join framework, including features such as task unforking and task reuse. We implement our model in a profiler that detects every spawned fork-join task, capturing all task dependencies and aiming at collecting cycle-accurate task-granularity data. We evaluate our profiler against a dedicated fork-join profiler for the JVM, showing that our tool achieves higher profile accuracy and introduces less overhead.
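To make the two framework features named in the abstract concrete, the following is a minimal sketch using the standard `java.util.concurrent` API. The class `ForkJoinFeatures`, the task `RangeSum`, the helper `runTwice`, and the cut-off of 1,000 elements are hypothetical names and values chosen for illustration; they do not come from the paper or its profiler. The sketch only shows what unforking (`tryUnfork`) and reuse (`reinitialize`) look like in application code, which is the kind of behavior a fork-join profiler has to account for.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinFeatures {

    // Hypothetical task (not from the paper): sums the integers in [lo, hi).
    static final class RangeSum extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1_000; // illustrative cut-off
        private final int lo, hi;

        RangeSum(int lo, int hi) { this.lo = lo; this.hi = hi; }

        @Override
        protected Long compute() {
            if (hi - lo <= THRESHOLD) {
                long sum = 0;
                for (int i = lo; i < hi; i++) sum += i;
                return sum;
            }
            int mid = (lo + hi) >>> 1;
            RangeSum right = new RangeSum(mid, hi);
            right.fork();                                // spawn the right half
            long left = new RangeSum(lo, mid).compute(); // run the left half inline
            // Task unforking: if the forked task has not been stolen or started,
            // take it back from the local queue and run it directly;
            // otherwise wait for the thread that is executing it.
            long rightSum = right.tryUnfork() ? right.compute() : right.join();
            return left + rightSum;
        }
    }

    // Runs the same task object twice and returns both results.
    static long[] runTwice(int n) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            RangeSum task = new RangeSum(0, n);
            long first = pool.invoke(task);
            task.reinitialize(); // task reuse: reset state so the task can run again
            long second = pool.invoke(task);
            return new long[] { first, second };
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] r = runTwice(1_000_000);
        System.out.println(r[0] + " " + r[1]);
    }
}
```

In this sketch, an unforked task is executed by calling `compute()` directly, so it never passes through the pool's normal execution path, and a reused task object corresponds to more than one execution. Both cases suggest why counting task objects or intercepting only `fork()` and `join()` may not give an accurate picture.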

Notes

  1. Reference cycles are the clock cycles elapsed during an operation, collected at the nominal processor frequency (regardless of frequency scaling). The paper uses the term “cycles” to indicate “reference cycles” for short.

  2. \(\mathsf{FJProf}\) is a fork-join-specific version of the task-granularity profiler tgp [24].

  3. Average overheads and accuracies across multiple workloads are computed using the geometric mean.
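As an illustration of the averaging described in note 3, here is a minimal sketch of a geometric mean over per-workload factors. The class name `GeoMean` and the sample values in `main` are hypothetical and are not results from the paper; the computation goes through logarithms, a common way to avoid overflow when multiplying many factors.

```java
public class GeoMean {

    // Geometric mean of strictly positive values: exp(mean(log(x_i))).
    static double geometricMean(double[] xs) {
        if (xs.length == 0) {
            throw new IllegalArgumentException("at least one value is required");
        }
        double logSum = 0.0;
        for (double x : xs) {
            if (x <= 0.0) {
                throw new IllegalArgumentException("values must be positive: " + x);
            }
            logSum += Math.log(x);
        }
        return Math.exp(logSum / xs.length);
    }

    public static void main(String[] args) {
        // Hypothetical overhead factors for three workloads (made-up numbers).
        double[] overheads = { 1.10, 1.25, 1.05 };
        System.out.println(geometricMean(overheads));
    }
}
```

Compared with the arithmetic mean, the geometric mean is less dominated by a single workload with an unusually large factor, which is one reason it is often preferred for averaging ratios such as overheads.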

References

  1. Adhianto, L., et al.: HPCTOOLKIT: tools for performance analysis of optimized parallel programs. Concurrency Comput. Pract. Exp. 22(6), 685–701 (2010). https://doi.org/10.1002/cpe.1553

  2. Blumofe, R.D., Joerg, C.F., Kuszmaul, B.C., Leiserson, C.E., Randall, K.H., Zhou, Y.: Cilk: an efficient multithreaded runtime system. J. Parallel Distrib. Comput. 37(1), 55–69 (1996). https://doi.org/10.1145/209936.209958

  3. Blumofe, R.D., Leiserson, C.E.: Scheduling multithreaded computations by work stealing. J. ACM 46(5), 720–748 (1999). https://doi.org/10.1145/324133.324234

  4. Chen, S., et al.: Scheduling threads for constructive cache sharing on CMPs. In: SPAA, pp. 105–115 (2007). https://doi.org/10.1145/1248377.1248396

  5. Conway, M.E.: A multiprocessor system design. In: AFIPS, pp. 139–146 (1963). https://doi.org/10.1145/1463822.1463838

  6. Fonseca, A., Cabral, B.: Evaluation of runtime cut-off approaches for parallel programs. In: VECPAR, pp. 121–134 (2016). https://doi.org/10.1007/978-3-319-61982-8_13

  7. Fonseca, A., Stork, S.: AeminiumBenchmarks (2016). https://github.com/AEminium/AeminiumBenchmarks

  8. Frigo, M., Leiserson, C.E., Randall, K.H.: The implementation of the Cilk-5 multithreaded language. SIGPLAN Not. 33(5), 212–223 (1998). https://doi.org/10.1145/277650.277725

  9. Guo, Y., Barik, R., Raman, R., Sarkar, V.: Work-first and help-first scheduling policies for async-finish task parallelism. In: IPDPS, pp. 1–12 (2009). https://doi.org/10.1109/IPDPS.2009.5161079

  10. Haller, P., Tu, S.: The Scala Actors API (2022). https://docs.scala-lang.org/overviews/core/actors.html

  11. He, Y., Leiserson, C.E., Leiserson, W.M.: The Cilkview scalability analyzer. In: SPAA, pp. 145–156 (2010). https://doi.org/10.1145/1810479.1810509

  12. ICL: PAPI (2021). http://icl.utk.edu/papi

  13. Lea, D.: A Java Fork/Join framework. In: JAVA, pp. 36–43 (2000). https://doi.org/10.1145/337449.337465

  14. Lifflander, J., Krishnamoorthy, S., Kale, L.V.: Steal tree: low-overhead tracing of work stealing schedulers. In: PLDI, pp. 507–518 (2013). https://doi.org/10.1145/2499370.2462193

  15. Marek, L., et al.: ShadowVM: robust and comprehensive dynamic program analysis for the Java platform. ACM SIGPLAN Not. 49(3), 105–114 (2013). https://doi.org/10.1145/2517208.2517219

  16. Marek, L., Villazón, A., Zheng, Y., Ansaloni, D., Binder, W., Qi, Z.: DiSL: a domain-specific language for bytecode instrumentation. In: AOSD, pp. 239–250 (2012). https://doi.org/10.1145/2162049.2162077

  17. Mohr, B., Brown, D., Malony, A.: TAU: a portable parallel program analysis environment for pC++. In: CONPAR – VAPP VI, pp. 29–40 (1994). https://doi.org/10.1007/3-540-58430-7_4

  18. Nyman, L., Laakso, M.: Notes on the history of Fork and Join. IEEE Ann. Hist. Comput. 38(3), 84–87 (2016). https://doi.org/10.1109/MAHC.2016.34

  19. Oracle: Package java.util.stream (2021). https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/stream/Stream.html

  20. Oracle: ForkJoinTask (2022). https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/concurrent/ForkJoinTask.html

  21. Prokopec, A., et al.: Renaissance: benchmarking suite for parallel applications on the JVM. In: PLDI, pp. 31–47 (2019). https://doi.org/10.1145/3314221.3314637

  22. Renaissance Suite: Documentation Overview (2019). https://renaissance.dev/docs

  23. Rosà, A., Rosales, E., Binder, W.: Analysis and optimization of task granularity on the Java Virtual Machine. ACM Trans. Program. Lang. Syst. 41(3) (2019). https://doi.org/10.1145/3338497

  24. Rosà, A.: TGP (2022). https://github.com/fithos/tgp

  25. Rosales, E., Rosà, A., Binder, W.: FJProf: profiling Fork/Join applications on the Java Virtual Machine. In: VALUETOOLS, pp. 128–135 (2020). https://doi.org/10.1145/3388831.3388851

  26. Rosà, A., Binder, W.: Optimizing type-specific instrumentation on the JVM with reflective supertype information. J. Vis. Lang. Comput. 49, 29–45 (2018). https://doi.org/10.1016/j.jvlc.2018.10.007

  27. Schardl, T.B., Kuszmaul, B.C., Lee, I.T.A., Leiserson, W.M., Leiserson, C.E.: The Cilkprof scalability profiler. In: SPAA, pp. 89–100 (2015). https://doi.org/10.1145/2755573.2755603

  28. Tallent, N.R., Mellor-Crummey, J.M.: Identifying performance bottlenecks in work-stealing computations. Computer 42(12), 44–50 (2009). https://doi.org/10.1109/MC.2009.396

  29. Teng, Q.M., Wang, H.C., Xiao, Z., Sweeney, P.F., Duesterwald, E.: THOR: a performance analysis tool for Java applications running on multicore systems. IBM J. Res. Dev. 54(5), 4:1–4:17 (2010). https://doi.org/10.1147/JRD.2010.2058481

  30. The Clojure Team: Reducers (2019). https://clojure.org/reference/reducers

  31. The GPars Team: GPars - A Concurrency & Parallelism Framework for Groovy and Java (2016). http://www.gpars.org

Acknowledgments

This work has been supported by Oracle (ERO project 1332) and by the Swiss National Science Foundation (project 200020_188688).

Author information

Corresponding author

Correspondence to Matteo Basso.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Basso, M., Rosales, E., Schiavio, F., Rosà, A., Binder, W. (2022). Accurate Fork-Join Profiling on the Java Virtual Machine. In: Cano, J., Trinder, P. (eds) Euro-Par 2022: Parallel Processing. Euro-Par 2022. Lecture Notes in Computer Science, vol 13440. Springer, Cham. https://doi.org/10.1007/978-3-031-12597-3_3

  • DOI: https://doi.org/10.1007/978-3-031-12597-3_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-12596-6

  • Online ISBN: 978-3-031-12597-3

  • eBook Packages: Computer Science, Computer Science (R0)