
Using Machine Learning Techniques to Detect Parallel Patterns of Multi-threaded Applications

International Journal of Parallel Programming

Abstract

Multicore hardware and software are becoming increasingly complex. The programmability problem of multicore software has led to the use of parallel patterns. Parallel patterns reduce the effort and time required to develop multicore software by effectively capturing its thread communication and data sharing characteristics. Hence, detecting the parallel pattern used in a multi-threaded application is crucial for performance improvements and enables many architectural optimizations; however, this topic has not been widely studied. We apply machine learning techniques in a novel approach to automatically detect parallel patterns and compare these techniques in terms of accuracy and speed. We experimentally validate the detection ability of our techniques on benchmarks including PARSEC and Rodinia. Our experiments show that the k-nearest neighbor, decision tree, and naive Bayes classifiers are the most accurate techniques. Overall, decision trees are the fastest technique and have the lowest characterization overhead, producing the best combination of detection results. We also show the usefulness of the proposed techniques for synthetic benchmark generation.
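
To make the comparison concrete, the following is a minimal sketch (not the paper's actual pipeline) of how the three classifiers named above could be trained and compared on a per-application feature matrix; the feature values, pattern labels, and the use of Python with scikit-learn are illustrative assumptions only.

```python
# Hypothetical sketch: compare k-NN, decision tree, and naive Bayes classifiers
# for parallel-pattern detection. The data below are placeholders, not the
# paper's dataset; scikit-learn is an assumed tool, not the authors' framework.
import time
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

# Each row: characterization features of one multi-threaded application
# (e.g., thread-communication and data-sharing statistics); each label:
# the parallel pattern it implements (e.g., pipeline, task, data-parallel).
rng = np.random.default_rng(0)
X = rng.random((60, 8))              # placeholder feature matrix
y = rng.integers(0, 3, size=60)      # placeholder pattern labels

classifiers = {
    "k-nearest neighbor": KNeighborsClassifier(n_neighbors=3),
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "naive Bayes": GaussianNB(),
}

for name, clf in classifiers.items():
    start = time.perf_counter()
    scores = cross_val_score(clf, X, y, cv=5)  # cross-validated accuracy
    elapsed = time.perf_counter() - start
    print(f"{name}: accuracy={scores.mean():.2f}, time={elapsed * 1000:.1f} ms")
```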



Acknowledgments

We would like to thank Prof. Ethem Alpaydin for his very helpful comments on early versions of the paper. This work was supported in part by Semiconductor Research Corporation under task 2082.001, Bogazici University Research Fund 7223, and the Turkish Academy of Sciences.

Author information

Corresponding author: Etem Deniz


About this article


Cite this article

Deniz, E., Sen, A. Using Machine Learning Techniques to Detect Parallel Patterns of Multi-threaded Applications. Int J Parallel Prog 44, 867–900 (2016). https://doi.org/10.1007/s10766-015-0396-z
