Abstract
Predicting the performance of distributed applications is a persistent challenge for researchers, and it becomes harder when heterogeneous systems are involved. Research conducted so far is limited by application type, programming language, or targeted system. The models employed become too complex, and prediction cost increases significantly. We propose dPerf, a new performance prediction tool. dPerf extends existing methods from the Rose and SimGrid frameworks and adds new methods, so that it performs (i) static code analysis and (ii) trace-based simulation. Based on these two phases, dPerf predicts the performance of C, C++ and Fortran applications that communicate using MPI or P2PSAP. Neither of these frameworks was developed explicitly for performance prediction, which makes dPerf a novel tool. The accuracy of dPerf is validated with a sequential Laplace code and a parallel NAS benchmark. dPerf yields accurate results at a low prediction cost and with a high gain.
References
Adve VS, Bagrodia R, Browne JC, Deelman E, Dube A, Houstis EN, Rice JR, Sakellariou R, Sundaram-Stukel DJ, Teller PJ, Vernon MK (2000) POEMS: End-to-end performance design of large parallel adaptive computational systems. IEEE Trans Softw Eng 26:1027–1048
ANR CIP project web page. http://spiderman-2.laas.fr/CIS-CIP
Badia RM, Escalé F, Gabriel E, Gimenez J, Keller R, Labarta J, Müller MS (2004) Performance prediction in a grid environment. In: Grid computing. Lecture notes in computer science, vol 2970. Springer, Berlin/Heidelberg, pp 257–264
Bailey DH, Barszcz E, Barton JT, Browning DS, Carter RL, Dagum L, Fatoohi RA, Frederickson PO, Lasinski TA, Schreiber RS, Simon HD, Venkatakrishnan V, Weeratunga SK (1991) The NAS parallel benchmarks—summary and preliminary results. In: SC’91: proceedings of the 1991 ACM/IEEE conference on supercomputing. ACM Press, New York, pp 158–165
Bourgeois J, Spies F (2000) Performance prediction of an NAS benchmark program with ChronosMix environment. In: Euro-Par’00: the 6th international Euro-Par conference on parallel processing. Springer, Berlin, pp 208–216
Casanova H, Legrand A, Quinson M (2008) SimGrid: a generic framework for large-scale distributed experiments. In: UKSIM’08: proceedings of the 10th int conference on computer modeling and simulation. IEEE Computer Society, Los Alamitos, pp 126–131
Cornea BF, Bourgeois J (2011) Performance prediction of distributed applications using block benchmarking methods. In: PDP’11, 19th int Euromicro conf on parallel, distributed and network-based processing. IEEE Computer Society, Los Alamitos
Cornea B, Bourgeois J (2012) http://lifc.univ-fcomte.fr/page_personnelle/recherche/136
Culler D, Karp R, Patterson D, Sahay A, Schauser KE, Santos E, Subramonian R, von Eicken T (1993) LogP: towards a realistic model of parallel computation. ACM Press, New York, pp 1–12
El Baz D, Nguyen TT (2010) A self-adaptive communication protocol with application to high performance peer to peer distributed computing. In: PDP’10: proceedings of the 18th Euromicro conference on parallel, distributed and network-based processing. IEEE Computer Society, Los Alamitos, pp 327–333
Ernst-Desmulier JB, Bourgeois J, Spies F, Verbeke J (2005) Adding new features in a peer-to-peer distributed computing framework. In: PDP’05: proceedings of the 13th Euromicro conference on parallel, distributed and network-based processing. IEEE Computer Society, Los Alamitos, pp 34–41
Ernst-Desmulier JB, Bourgeois J, Spies F (2008) P2pperf: a framework for simulating and optimizing peer-to-peer-distributed computing applications. Concurr Comput 20(6):693–712
Fahringer T (1996) On estimating the useful work distribution of parallel programs under the P3T: a static performance estimator. Concurr Pract Exp 8:28–32
Fahringer T, Zima HP (1993) A static parameter based performance prediction tool for parallel programs. In: ICS’93: proceedings of the 7th international conference on supercomputing. ACM Press, New York, pp 207–219
Finney SA (2001) Real-time data collection in Linux: a case study. Behav Res Methods Instrum Comput 33:167–173
Laplace transform instrumented with dPerf; simple block benchmarking method. http://bogdan.cornea.perso.neuf.fr/files/journal_files/laplace_dperf.c
Laplace transform. http://www.physics.ohio-state.edu/~ntg/780/c_progs/laplace.c
Li J, Shi F, Deng N, Zuo Q (2009) Performance prediction based on hierarchy parallel features captured in multi-processing system. In: HPDC’09: proc of the 18th ACM int symposium on high performance distributed computing. ACM Press, New York, pp 63–64
Livadas PE, Croll S (1994) System dependence graphs based on parse trees and their use in software maintenance. Inf Sci 76(3–4):197–232
Marin G (2007) Application insight through performance modeling. In: IPCCC’07: proceedings of the performance, computing, and comm. conf. IEEE Computer Society, Los Alamitos
Marin G, Mellor-Crummey J (2004) Cross-architecture performance predictions for scientific applications using parameterized models. In: SIGMETRICS’04/Performance’04: proceedings of the joint international conference on measurement and modeling of computer systems. ACM Press, New York, pp 2–13
NAS parallel benchmarks. http://www.nas.nasa.gov/Resources/Software/npb.html
Nguyen TT, El Baz D, Spiteri P, Jourjon G, Chau M (2010) High performance peer-to-peer distributed computing with application to obstacle problem. In: IPDPSW’10: IEEE international symposium on parallel distributed processing, workshops and PhD forum, pp 1–8
Noeth M, Marathe J, Mueller F, Schulz M, de Supinski B (2006) Scalable compression and replay of communication traces in massively parallel environments. In: SC’06: proceedings of the 2006 ACM/IEEE conference on supercomputing. ACM Press, New York, p 144
PAPI project website. http://icl.cs.utk.edu/papi/
PAPI SC2008 handout. http://icl.cs.utk.edu/graphics/posters/files/
Perfmon project webpage. http://perfmon2.sourceforge.net/
Pettersson M (2012) Perfctr project webpage. http://user.it.uu.se/~mikpe/linux/perfctr/
Prakash S, Bagrodia RL (1998) MPI-SIM: using parallel simulation to evaluate MPI programs. In: WSC’98: proceedings of the 30th conference on winter simulation. IEEE Computer Society Press, Los Alamitos, pp 467–474
Rose LD, Poxon H (2009) A paradigm change: from performance monitoring to performance analysis. In: SBAC-PAD, pp 119–126
Saavedra RH, Smith AJ (1996) Analysis of benchmark characteristics and benchmark performance prediction. ACM Trans Comput Syst 14(4):344–384
Schordan M, Quinlan D (2003) A source-to-source architecture for user-defined optimizations. In: Modular programming languages. Lecture notes in computer science, vol 2789. Springer, Berlin/Heidelberg, pp 214–223
Skinner D, Kramer W (2005) Understanding the causes of performance variability in HPC workloads. In: IEEE workload characterization symposium, pp 137–149
Snavely A, Wolter N, Carrington L (2001) Modeling application performance by convolving machine signatures with application profiles. In: WWC’01: IEEE international workshop on workload characterization. IEEE Computer Society, Los Alamitos, pp 149–156
Sundaram-Stukel D, Vernon MK (1999) Predictive analysis of a wavefront application using LogGP. In: 7th ACM SIGPLAN symposium on principles and practice of parallel programming, vol 34(8). ACM Press, New York, pp 141–150
The message passing interface standard. http://www-unix.mcs.anl.gov/mpi
van Gemund AJC (2003) Symbolic performance modeling of parallel systems. IEEE Trans Parallel Distrib Syst 14(2):154–165
Zaparanuks D, Jovic M, Hauswirth M (2009) Accuracy of performance counter measurements. In: ISPASS’09: IEEE international symposium on performance analysis of systems and software, pp 23–32
Zhai J, Chen W, Zheng W (2010) Phantom: predicting performance of parallel applications on large-scale parallel machines using a single node. In: PPoPP’10: proceedings of the 15th ACM SIGPLAN symposium on principles and practice of parallel programming. ACM Press, New York, pp 305–314
Acknowledgements
This work is funded by the French National Agency for Research under the ANR-07-CIS7-011-01 contract [2].
About this article
Cite this article
Cornea, B.F., Bourgeois, J. A framework for efficient performance prediction of distributed applications in heterogeneous systems. J Supercomput 62, 1609–1634 (2012). https://doi.org/10.1007/s11227-012-0823-5
DOI: https://doi.org/10.1007/s11227-012-0823-5