Evolving Cut-Off Mechanisms and Other Work-Stealing Parameters for Parallel Programs

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10199)

Abstract

Optimizing parallel programs is a complex task because of the interference among many different parameters. Work-stealing runtimes, used to dynamically balance load among processor cores, are no exception. This work explores the automatic configuration of the following runtime parameters: dynamic granularity control algorithms, the granularity control cache, the work-stealing algorithm, the lazy binary splitting parameter, the maximum queue size and the unparking interval. Program performance is highly sensitive to the granularity control algorithm, which can itself be a combination of other granularity algorithms. In this work, we address two search-based problems: finding a globally efficient work-stealing configuration, and finding the best configuration for an individual program. For both problems, we propose the use of a Genetic Algorithm (GA). The genotype of the GA can represent combinations of up to three cut-off algorithms, as well as the other work-stealing parameters.
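To make the search space concrete, the following is a minimal sketch of how such a genotype might be encoded, assuming one gene per parameter listed above. The option names, value ranges and the single-field mutation operator are all hypothetical, since the abstract does not specify them; this is not the authors' encoding.

```python
import random
from dataclasses import dataclass, fields, replace

# Hypothetical option names: the abstract does not list the actual algorithms.
CUTOFF_ALGORITHMS = ["max_tasks", "max_level", "load_based", "surplus"]
STEALING_ALGORITHMS = ["random_victim", "sequential_victim"]


@dataclass(frozen=True)
class Genotype:
    cutoffs: tuple            # one to three cut-off algorithms, combined at run time
    use_cutoff_cache: bool    # granularity control cache on or off
    stealing: str             # work-stealing algorithm
    lbs_parameter: int        # lazy binary splitting parameter
    max_queue_size: int       # maximum queue size
    unparking_interval: int   # unparking interval (unit assumed, e.g. milliseconds)


def random_genotype(rng):
    """Sample a configuration uniformly from the assumed ranges."""
    n_cutoffs = rng.randint(1, 3)
    return Genotype(
        cutoffs=tuple(rng.choice(CUTOFF_ALGORITHMS) for _ in range(n_cutoffs)),
        use_cutoff_cache=rng.random() < 0.5,
        stealing=rng.choice(STEALING_ALGORITHMS),
        lbs_parameter=rng.randint(1, 64),
        max_queue_size=rng.randint(1, 1024),
        unparking_interval=rng.randint(1, 100),
    )


def mutate(parent, rng):
    """Resample exactly one gene, leaving the others untouched."""
    donor = random_genotype(rng)
    gene = rng.choice([f.name for f in fields(parent)])
    return replace(parent, **{gene: getattr(donor, gene)})


rng = random.Random(0)
parent = random_genotype(rng)
child = mutate(parent, rng)
```

Keeping the cut-off combination as a variable-length tuple is one simple way to let the search cover single algorithms as well as combinations of two or three.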

The proposed GA has been evaluated in its ability to obtain a more efficient solution across a set of programs, in its ability to generalize that solution to a larger set of programs, and in its ability to evolve a configuration for each program individually.

The GA was able to improve the performance of the programs in the training set, but the configurations obtained did not generalize to a larger benchmark set. It was, however, able to improve the performance of each program when evolved individually.
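The two search problems differ mainly in what the fitness function measures. The sketch below illustrates that difference under stated assumptions: fitness is taken to be speedup over a baseline configuration, and the global problem aggregates per-program speedups with a geometric mean. The abstract does not give the fitness function used in the paper, and the program names and timings are made up for illustration.

```python
from math import exp, log


def speedup(baseline_seconds, tuned_seconds):
    """Ratio above 1.0 means the tuned configuration is faster."""
    return baseline_seconds / tuned_seconds


def global_fitness(baseline, tuned):
    """Geometric mean of speedups over a whole training set (assumed aggregate)."""
    ratios = [speedup(baseline[name], tuned[name]) for name in baseline]
    return exp(sum(log(r) for r in ratios) / len(ratios))


def individual_fitness(baseline, tuned, program):
    """Speedup of a single program, for the per-program search."""
    return speedup(baseline[program], tuned[program])


# Made-up running times in seconds for two hypothetical benchmark programs.
baseline_times = {"fib": 2.0, "sort": 4.0}
tuned_times = {"fib": 1.0, "sort": 4.0}

overall = global_fitness(baseline_times, tuned_times)
fib_only = individual_fitness(baseline_times, tuned_times, "fib")
```

In this toy case one program doubles in speed while the other is unchanged, so the per-program score for the first is higher than the aggregate. That shows how a configuration can suit one program well without being a strong global choice.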

Acknowledgments

The first author was supported by the Portuguese National Foundation for Science and Technology (FCT) through a Doctoral Grant (SFRH/BD/84448/2012).

Author information

Corresponding author

Correspondence to Alcides Fonseca.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Fonseca, A., Lourenço, N., Cabral, B. (2017). Evolving Cut-Off Mechanisms and Other Work-Stealing Parameters for Parallel Programs. In: Squillero, G., Sim, K. (eds) Applications of Evolutionary Computation. EvoApplications 2017. Lecture Notes in Computer Science, vol 10199. Springer, Cham. https://doi.org/10.1007/978-3-319-55849-3_49

  • DOI: https://doi.org/10.1007/978-3-319-55849-3_49

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-55848-6

  • Online ISBN: 978-3-319-55849-3

  • eBook Packages: Computer Science, Computer Science (R0)
