DOI: 10.1145/3195612.3195615
research-article

Optimized approach based on time prediction and space chunking for polyhedron programs parallelization on multicores

Published: 15 March 2018

Abstract

A key challenge in parallel computing is balancing the load across cores so as to minimize the overall execution time of the parallel program, called the makespan. Because the performance of some parallel architectures, such as multicores, may vary during program execution, an effective mapping must accommodate this unpredictable variation to avoid degrading the makespan. A mapping, or static load balancing method, may not remain effective when the state of the target machine changes during program execution. Thread affinity has emerged as an important technique for improving program performance and stability.
In this context, we propose a predictive approach that adapts parallel nested loops to processor performance by chunking iterations at runtime. Our approach is based on thread pinning, space chunking, and runtime performance detection. Starting from a parallel program, we define a first set of loop nest iterations, called a chunk. This first chunk is run using an initial mapping that assumes homogeneous cores. A performance assessment then corrects the mapping by predicting the future state of each core. The new mapping is applied to the next chunk for further evaluation and prediction, and so on. The process stops when the program has run to completion or when chunking is judged to be no longer effective.
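
The chunk-by-chunk loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the prediction model, so exponential smoothing of observed per-core speed (parameter `alpha`) is an assumption here, as are the names `split_chunk`, `predict_speeds`, `run_chunked`, and the `measure` callback. Thread pinning and the "chunking is no longer effective" stopping test are left out.

```python
def split_chunk(n_iters, speeds):
    """Split n_iters iterations across cores in proportion to predicted speeds."""
    total = sum(speeds)
    shares = [int(n_iters * s / total) for s in speeds]
    # Hand out the iterations lost to rounding, fastest cores first.
    leftover = n_iters - sum(shares)
    for core in sorted(range(len(speeds)), key=lambda c: -speeds[c])[:leftover]:
        shares[core] += 1
    return shares


def predict_speeds(prev_speeds, shares, times, alpha=0.5):
    """Blend the previous prediction with the speed observed on the last chunk.

    Exponential smoothing is an assumed predictor, chosen only for illustration.
    """
    predicted = []
    for prev, n, t in zip(prev_speeds, shares, times):
        observed = n / t if n > 0 and t > 0 else prev
        predicted.append(alpha * observed + (1 - alpha) * prev)
    return predicted


def run_chunked(total_iters, chunk_size, n_cores, measure):
    """Run all iterations chunk by chunk, correcting the mapping after each chunk.

    `measure(shares)` is a hypothetical callback: it executes one chunk with the
    given per-core iteration counts and returns the per-core elapsed times.
    """
    speeds = [1.0] * n_cores  # initial mapping assumes homogeneous cores
    done, history = 0, []
    while done < total_iters:
        n = min(chunk_size, total_iters - done)
        shares = split_chunk(n, speeds)
        times = measure(shares)
        speeds = predict_speeds(speeds, shares, times)
        history.append(shares)
        done += n
    return history
```

With a simulated `measure` in which one core is twice as fast as the other, the successive mappings shift iterations toward the faster core, which is the corrective behaviour the abstract describes.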


    Published In

    HP3C: Proceedings of the 2nd International Conference on High Performance Compilation, Computing and Communications
    March 2018
    123 pages
    ISBN:9781450363372
    DOI:10.1145/3195612
    Conference Chair: Steven Guan

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. automatic parallelization
    2. chunk
    3. load balancing
    4. mapping
    5. multicores
    6. polyhedron program
    7. prediction
    8. space chunking
    9. thread level parallelism
    10. thread pinning
    11. time prediction

    Conference

    HP3C 2018
