Abstract
In this paper, we address the matrix chain multiplication problem, i.e., computing the product of several matrices. Although several studies have investigated the problem, our approach differs in two respects. First, we propose MapReduce algorithms that provide scalable computation for large matrices. Second, we transform the matrix chain multiplication problem from a sequence of multiplications of two matrices into a single multiplication of several matrices. Because matrix multiplication is associative, the chain can be evaluated in this way without changing the result, which helps to improve the performance of the algorithms. To implement the idea, we adopt multi-way join algorithms in MapReduce that have been studied in recent years. Our experiments show that the proposed algorithms are fast and scalable compared to several baseline algorithms.
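The contrast between sequential two-matrix multiplications and a single multiplication of several matrices can be illustrated with a small single-process sketch. This is not the MapReduce algorithm proposed in the paper: it only shows, for a chain of three sparse matrices stored as dictionaries keyed by (row, column), that one three-way join on the shared indices gives the same product as two successive binary joins. The function names `multiply_pairwise` and `multiply_three_way` and the dictionary representation are assumptions made for illustration.

```python
from collections import defaultdict


def multiply_pairwise(a, b):
    """Binary join: match a[i, j] with b[j, k] on the shared index j."""
    b_by_row = defaultdict(list)
    for (j, k), w in b.items():
        b_by_row[j].append((k, w))
    out = defaultdict(float)
    for (i, j), v in a.items():
        for k, w in b_by_row[j]:
            out[(i, k)] += v * w
    return dict(out)


def multiply_three_way(a, b, c):
    """Single three-way join: a[i, j], b[j, k], c[k, l] -> out[i, l].

    Each entry of the middle matrix b is matched with the entries of a
    that share its row index j and the entries of c that share its
    column index k, so no intermediate product matrix is materialized.
    """
    a_by_col = defaultdict(list)
    for (i, j), v in a.items():
        a_by_col[j].append((i, v))
    c_by_row = defaultdict(list)
    for (k, l), x in c.items():
        c_by_row[k].append((l, x))
    out = defaultdict(float)
    for (j, k), w in b.items():
        for i, v in a_by_col[j]:
            for l, x in c_by_row[k]:
                out[(i, l)] += v * w * x
    return dict(out)


if __name__ == "__main__":
    # Small hypothetical 2x2 sparse matrices, keyed by (row, column).
    A = {(0, 0): 1.0, (0, 1): 2.0, (1, 1): 3.0}
    B = {(0, 0): 4.0, (1, 0): 5.0, (1, 1): 6.0}
    C = {(0, 1): 7.0, (1, 0): 8.0}
    sequential = multiply_pairwise(multiply_pairwise(A, B), C)
    single_pass = multiply_three_way(A, B, C)
    print(sequential == single_pass)
```

In a MapReduce setting the grouping by shared index would be done by the shuffle phase rather than by in-memory dictionaries; how keys are partitioned across reducers is the subject of the multi-way join algorithms the abstract refers to and is not modeled here.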
Acknowledgements
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 20120005695).
Cite this article
Myung, J., Lee, Sg. Exploiting inter-operation parallelism for matrix chain multiplication using MapReduce. J Supercomput 66, 594–609 (2013). https://doi.org/10.1007/s11227-013-0936-5
DOI: https://doi.org/10.1007/s11227-013-0936-5