An empirical examination of the prevalence of inhibitors to the parallelizability of open source software systems

Empirical Software Engineering

Abstract

An empirical study is presented that examines the potential to parallelize general-purpose software systems. The study is conducted on 13 open source systems comprising over 14 MLOC. Each for-loop is statically analyzed to determine whether it can be parallelized. A for-loop that can be parallelized is termed a free-loop. Free-loops can be easily parallelized using tools such as OpenMP. For the loops that cannot be parallelized, the various inhibitors to parallelization are determined and tabulated. The data show that the most prevalent inhibitor, by far, is a call within a for-loop to a function that has side effects. This single inhibitor poses the greatest challenge in adapting and re-engineering systems to better utilize modern multi-core architectures. This finding somewhat contradicts the literature, which focuses primarily on the removal of data dependencies within loops. The results also show that function calls via function pointers and virtual methods have very little impact on the for-loop parallelization process. Historical data on inhibitor counts over a 10-year period is also presented for the set of systems studied. It shows that the potential for parallelization of loops changes little over time.
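The distinction between a free-loop and an inhibited loop can be illustrated with a minimal C sketch. It is not taken from the paper: the names `scale`, `scale_and_count`, `scale_counted`, and `get_call_count` are hypothetical and chosen only for illustration. The first loop writes a distinct element on each iteration and calls nothing, so an OpenMP `parallel for` directive is safe. The second loop calls a function that updates a global counter, which is the kind of side effect the abstract identifies as the most prevalent inhibitor, so it is left sequential here.

```c
#include <stddef.h>

/* Free-loop: every iteration reads in[i] and writes out[i] only, so the
   iterations are independent and can be divided among threads. */
void scale(double *out, const double *in, size_t n, double k)
{
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        out[i] = k * in[i];
    }
}

/* Hypothetical shared state, used only to demonstrate a side effect. */
static long call_count = 0;

/* Function with a side effect: it modifies a global on every call. */
double scale_and_count(double x, double k)
{
    call_count++;
    return k * x;
}

/* Inhibited loop: the call inside the body updates call_count, so adding a
   parallel-for directive as-is would introduce a data race on that global.
   The loop is therefore kept sequential in this sketch. */
void scale_counted(double *out, const double *in, size_t n, double k)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = scale_and_count(in[i], k);
    }
}

/* Accessor for the counter, so callers can observe the side effect. */
long get_call_count(void)
{
    return call_count;
}
```

With GCC or Clang the directive takes effect when the file is compiled with `-fopenmp`; without that flag the pragma is ignored and both loops run sequentially with the same results.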

Acknowledgments

This work was supported in part by a grant from the US National Science Foundation CNS 13-05292/05217.

Author information

Corresponding author

Correspondence to Jonathan I. Maletic.

Additional information

Communicated by: Ahmed E. Hassan

About this article

Cite this article

Alnaeli, S.M., Maletic, J.I. & Collard, M.L. An empirical examination of the prevalence of inhibitors to the parallelizability of open source software systems. Empir Software Eng 21, 1272–1301 (2016). https://doi.org/10.1007/s10664-015-9385-5

  • DOI: https://doi.org/10.1007/s10664-015-9385-5
