Abstract
The trend from single-processor to parallel computer architectures has increased the importance of parallel computing. To support parallel computing, parallel algorithms must be mapped to a computing platform that consists of multiple parallel processing nodes. In general, several alternative mappings can be defined, and these perform differently with respect to the quality requirements for power consumption, efficiency, and memory usage. The mapping process can be carried out manually for platforms with a limited number of processing nodes. However, for exascale computing, in which hundreds of thousands of processing nodes are applied, the mapping process soon becomes intractable. To assist the parallel computing engineer, we provide a model-driven approach to analyze, model, and select feasible mappings. We describe the developed toolset that implements this approach, together with the required metamodels and model transformations. We illustrate our approach for the well-known complete exchange algorithm in parallel computing.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Arkın, E., Tekinerdogan, B., İmre, K.M. (2013). Model-Driven Approach for Supporting the Mapping of Parallel Algorithms to Parallel Computing Platforms. In: Moreira, A., Schätz, B., Gray, J., Vallecillo, A., Clarke, P. (eds) Model-Driven Engineering Languages and Systems. MODELS 2013. Lecture Notes in Computer Science, vol 8107. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41533-3_46
DOI: https://doi.org/10.1007/978-3-642-41533-3_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41532-6
Online ISBN: 978-3-642-41533-3
eBook Packages: Computer Science, Computer Science (R0)