Abstract
Matrix operations are fundamental to a wide range of scientific applications, including graph theory, linear equation systems, image processing, geometric optics, and probability analysis. As the workloads of these applications have grown, so have the sizes of the matrices involved. Parallel execution of matrix operations on existing cluster-based systems performs well for relatively small matrices but degrades significantly as matrices grow larger, owing to limited resources. Cloud computing offers scalable resources to overcome this limitation; however, access to near-infinite, scalable resources in the Cloud brings its own challenge: keeping matrix operations time- and resource-efficient. To the best of our knowledge, no existing Cloud service optimizes the efficiency of matrix operations on Cloud infrastructure. To address this gap and offer matrix operations as a convenient service, this paper proposes a novel scalable service framework called Scalable Matrix Operation as a Service (SMOaaS). The framework uses dynamic matrix partitioning techniques, driven by the matrix operation and matrix sizes, to achieve efficient work distribution, and it scales on demand to deliver time- and resource-efficient operations. The framework also provides basic security, fault tolerance, and reliability. Experimental results show that the adopted dynamic partitioning technique delivers faster and better performance than the existing static partitioning technique.
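The paper's implementation is not reproduced here; as a minimal sketch of the idea behind operation- and size-aware (dynamic) partitioning, the following illustrates choosing row-block boundaries from the matrix shape, the worker count, and the requested operation, rather than from a fixed block size as in static partitioning. The function name, parameters, and cost heuristic are all illustrative assumptions, not the authors' algorithm.

```python
def dynamic_partition(shape, n_workers, operation):
    """Split an (m, n) matrix into contiguous row-blocks.

    Block sizes depend on the operation: multiplication-like operations
    carry a heavier per-row cost, so rows are balanced with a ceiling
    division across workers; element-wise operations are chunked evenly.
    This heuristic is a hypothetical stand-in for the paper's technique.
    """
    m, _ = shape
    if operation == "multiply":
        # ceil(m / n_workers) rows per block, balancing heavier work
        rows_per_block = -(-m // n_workers)
    else:
        # simple even chunking for cheap element-wise operations
        rows_per_block = max(1, m // n_workers)

    blocks, start = [], 0
    while start < m:
        end = min(start + rows_per_block, m)
        blocks.append((start, end))  # half-open row range [start, end)
        start = end
    return blocks

# A 10x4 matrix multiplied across 3 workers yields three row-blocks.
print(dynamic_partition((10, 4), 3, "multiply"))
```

A static scheme would use the same fixed block size regardless of shape or operation; making the split a function of both is what lets work distribution stay balanced as matrix sizes change.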

Cite this article
Ujjwal, K., Battula, S.K., Garg, S. et al. SMOaaS: a Scalable Matrix Operation as a Service model in Cloud. J Supercomput 77, 3381–3401 (2021). https://doi.org/10.1007/s11227-020-03400-0