An empirical examination of the prevalence of inhibitors to the parallelizability of open source software systems

Alnaeli, Saleh M.; Maletic, Jonathan I.; Collard, Michael L.

doi:10.1007/s10664-015-9385-5

Published: 28 May 2015

Volume 21, pages 1272–1301, (2016)

Empirical Software Engineering

Saleh M. Alnaeli¹,
Jonathan I. Maletic¹ &
Michael L. Collard²

Abstract

An empirical study is presented that examines the potential to parallelize general-purpose software systems. The study is conducted on 13 open source systems comprising over 14 MLOC. Each for-loop is statically analyzed to determine whether it can be parallelized. A for-loop that can be parallelized is termed a free-loop. Free-loops can be easily parallelized using tools such as OpenMP. For the loops that cannot be parallelized, the various inhibitors to parallelization are determined and tabulated. The data show that the most prevalent inhibitor, by far, is calls within for-loops to functions that have side effects. This single inhibitor poses the greatest challenge in adapting and re-engineering systems to better utilize modern multi-core architectures. This finding runs somewhat counter to the literature, which focuses primarily on the removal of data dependencies within loops. The results also show that function calls via function pointers and virtual methods have very little impact on the for-loop parallelization process. Historical data on inhibitor counts over a 10-year period are also presented for the set of systems studied. They show little change in the potential for parallelization of loops over time.
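
The distinction drawn in the abstract can be illustrated with a minimal sketch in C. It is not taken from the paper: the function names (`scale`, `scale_and_count`, `scale_counted`, `prefix_sum`) and the global `call_count` are hypothetical, and the paper's static analysis is not shown. The first loop is free in the abstract's sense, the second calls a function with a side effect, and the third carries a data dependency between iterations.

```c
/* Free-loop: every iteration reads in[i] and writes only out[i],
 * so the iterations are independent and the OpenMP pragma applies.
 * Assumes out and in do not overlap. */
void scale(double *out, const double *in, int n, double k) {
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        out[i] = k * in[i];
    }
}

/* Hypothetical function with a side effect: it updates a global. */
static long call_count = 0;

double scale_and_count(double x, double k) {
    ++call_count;              /* side effect visible outside the call */
    return k * x;
}

/* Inhibited loop: the call modifies shared state, so running the
 * iterations concurrently would race on call_count. Left serial. */
void scale_counted(double *out, const double *in, int n, double k) {
    for (int i = 0; i < n; ++i) {
        out[i] = scale_and_count(in[i], k);
    }
}

/* Inhibited loop: a[i] depends on a[i - 1] from the previous
 * iteration (a loop-carried data dependency). Left serial. */
void prefix_sum(double *a, int n) {
    for (int i = 1; i < n; ++i) {
        a[i] += a[i - 1];
    }
}
```

With GCC or Clang, the pragma takes effect when compiling with `-fopenmp`; without that flag it is ignored and `scale` simply runs serially, producing the same result.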

Acknowledgments

This work was supported in part by a grant from the US National Science Foundation CNS 13-05292/05217.

Author information

Authors and Affiliations

Kent State University, Kent, OH, 44240, USA
Saleh M. Alnaeli & Jonathan I. Maletic
The University of Akron, Akron, OH, 44325, USA
Michael L. Collard

Corresponding author

Correspondence to Jonathan I. Maletic.

Additional information

Communicated by: Ahmed E. Hassan

About this article

Cite this article

Alnaeli, S.M., Maletic, J.I. & Collard, M.L. An empirical examination of the prevalence of inhibitors to the parallelizability of open source software systems. Empir Software Eng 21, 1272–1301 (2016). https://doi.org/10.1007/s10664-015-9385-5

Published: 28 May 2015
Issue Date: June 2016
DOI: https://doi.org/10.1007/s10664-015-9385-5
