Matrix factorization of large scale data using multistage matrix factorization

Bhavana, Prasad; Padmanabhan, Vineet

doi:10.1007/s10489-020-01957-0

Published: 25 November 2020

Volume 51, pages 4016–4028, (2021)

Applied Intelligence

Abstract

Matrix Factorization (MF) is a resource-intensive task that consumes significant memory and computational effort and does not scale well with the volume of data. When the combined size of the input matrix and the latent feature matrices exceeds the available memory, whether on a Central Processing Unit (CPU) or a Graphics Processing Unit (GPU), loading all the required matrices into CPU/GPU memory may not be possible. Such scenarios call for alternative techniques that not only allow parallelism but also address memory limitations; such techniques play a crucial role in industrial applications. In this paper we propose a divide-and-conquer technique based on a two-stage factorization process. In the first step, we divide the data set into different groups and factorize each group. In the second step, we use a factorization-based learning model to combine the latent features derived in the first step. Our motivation is to develop a method that can achieve both parallelism and scalability as well as address factorization of incrementally growing data. Our contribution is a novel multi-stage matrix factorization (MsMF) approach. The experimental results demonstrate improvements in RMSE as well as computational efficiency.
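The two steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' MsMF algorithm, whose details are not given in this preview. It assumes that the groups are contiguous blocks of rows (users), that each group is factorized with regularized alternating least squares, and that the second stage fits one shared item matrix to the stacked first-stage features before re-fitting each group against it. The function names (`ridge_rows`, `als`, `two_stage_mf`, `rmse`) and the synthetic data are hypothetical.

```python
import numpy as np


def ridge_rows(R, M, F, reg):
    """Solve one ridge regression per row of R against fixed factors F,
    using only the entries marked as observed in the mask M."""
    k = F.shape[1]
    out = np.zeros((R.shape[0], k))
    for i in range(R.shape[0]):
        obs = M[i] > 0
        Fi = F[obs]
        out[i] = np.linalg.solve(Fi.T @ Fi + reg * np.eye(k), Fi.T @ R[i, obs])
    return out


def als(R, M, k, reg=0.1, iters=20, seed=0):
    """Plain alternating least squares: R ~ U @ V.T on observed entries."""
    rng = np.random.default_rng(seed)
    V = rng.normal(scale=0.1, size=(R.shape[1], k))
    for _ in range(iters):
        U = ridge_rows(R, M, V, reg)
        V = ridge_rows(R.T, M.T, U, reg)
    return U, V


def two_stage_mf(R, M, k, n_groups, reg=0.1, iters=20):
    """Hypothetical two-stage factorization (an assumption, not the paper's MsMF)."""
    groups = np.array_split(np.arange(R.shape[0]), n_groups)

    # Stage 1: factorize each row group on its own. Only one group's block
    # is needed at a time, and the groups could be processed in parallel.
    U = np.zeros((R.shape[0], k))
    for g, rows in enumerate(groups):
        U[rows], _ = als(R[rows], M[rows], k, reg, iters, seed=g)

    # Stage 2 (assumed combiner): fit one shared item matrix V to the stacked
    # stage-1 user features, then re-fit each group's user features to V.
    V = ridge_rows(R.T, M.T, U, reg)
    for rows in groups:
        U[rows] = ridge_rows(R[rows], M[rows], V, reg)
    return U, V


def rmse(R, M, U, V):
    """Root mean squared error over the observed entries only."""
    err = (R - U @ V.T)[M > 0]
    return float(np.sqrt(np.mean(err ** 2)))


# Synthetic rank-3 data with roughly half of the entries observed.
rng = np.random.default_rng(42)
R_true = rng.normal(size=(60, 3)) @ rng.normal(size=(3, 40))
M = (rng.random(R_true.shape) < 0.5).astype(float)
R = R_true * M

U, V = two_stage_mf(R, M, k=3, n_groups=3)
print("train RMSE:", rmse(R, M, U, V))
```

In this sketch the first stage never touches more than one group's block at a time, which is the memory argument the abstract makes. The second stage is the simplest combiner that ends with a consistent pair of factors; a learned combination of the per-group features, as the abstract suggests, would replace it.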

Notes

http://grouplens.org/datasets/

Author information

Authors and Affiliations

School of Computer and Information Sciences, University of Hyderabad, Hyderabad, India
Prasad Bhavana & Vineet Padmanabhan

Corresponding author

Correspondence to Prasad Bhavana.

About this article

Cite this article

Bhavana, P., Padmanabhan, V. Matrix factorization of large scale data using multistage matrix factorization. Appl Intell 51, 4016–4028 (2021). https://doi.org/10.1007/s10489-020-01957-0

Accepted: 17 September 2020
Published: 25 November 2020
Issue Date: June 2021
DOI: https://doi.org/10.1007/s10489-020-01957-0
