Abstract
Matrix factorization of large-scale matrices is a memory-intensive task. Alternative convergence techniques are needed when the combined size of the input matrix and the latent feature matrices exceeds the available memory, on both a Central Processing Unit (CPU) and a Graphics Processing Unit (GPU). While alternating least squares (ALS) can take prohibitively long to converge on a CPU, loading all the required matrices into GPU memory may not be possible when the dimensions are very large. In this paper, we introduce a novel technique that divides the data into block matrices and applies Stochastic Gradient Descent (SGD) based factorization at the block level.
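To make the block-level scheme concrete, here is a minimal NumPy sketch that tiles the input matrix X into square blocks and applies per-block SGD updates to the factor matrices U and V, so only one tile of X and the matching row slices of U and V form the working set at any time. The function name `bmf_sgd` and the hyperparameters `block`, `alpha`, and `beta` are illustrative choices, not the authors' implementation.

```python
import numpy as np

def bmf_sgd(X, k=16, block=1024, alpha=0.002, beta=0.02, epochs=20):
    """Block-wise SGD factorization: X (m x n) ~= U (m x k) @ V (n x k).T.

    Each inner step touches one (block x block) tile of X and the matching
    row slices U_i, V_j, keeping the working set small; in an out-of-core
    or GPU setting, each tile would be loaded or copied on demand.
    """
    m, n = X.shape
    rng = np.random.default_rng(0)
    U = rng.normal(scale=0.1, size=(m, k))
    V = rng.normal(scale=0.1, size=(n, k))
    for _ in range(epochs):
        for i in range(0, m, block):            # row blocks -> U_i
            for j in range(0, n, block):        # column blocks -> V_j
                Ui, Vj = U[i:i + block], V[j:j + block]
                Eij = X[i:i + block, j:j + block] - Ui @ Vj.T  # block residual
                gU = 2 * Eij @ Vj - beta * Ui    # regularized gradient steps
                gV = 2 * Eij.T @ Ui - beta * Vj
                U[i:i + block] += alpha * gU
                V[j:j + block] += alpha * gV
    return U, V
```

In the large-scale setting the loop body stays the same; only the tile `X[i:i + block, j:j + block]` changes from an in-memory slice to a read from disk or a host-to-device copy.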
A Derivation of SGD Update Equations for BMF
Our objective is to find \(\underline{U_i}, \underline{V_j}\) for \(\underline{X_{ij}}\) such that \(\underline{X_{ij}} \approx \underline{U_i}\,{\underline{V_j}}^T\). Let \(\underline{E_{ij}} = \underline{X_{ij}} - \underline{X_{ij}}^\prime\) represent the deviation of the estimate \(\underline{X_{ij}}^\prime = \underline{U_i}\,{\underline{V_j}}^T\) from the actual \(\underline{X_{ij}}\). Hence, the sum of squared deviations (\(\mathcal{E}\)) can be written as:
\[
\mathcal{E} = \sum_{i}\sum_{j} \left\Vert \underline{E_{ij}} \right\Vert_F^2 = \sum_{i}\sum_{j} \left\Vert \underline{X_{ij}} - \underline{U_i}\,{\underline{V_j}}^T \right\Vert_F^2 \tag{4}
\]
The goal is to find \(\underline{U_i}\) and \(\underline{V_j}\) such that the sum of squared deviations is minimal. The optimum value of the \(g^{th}\) latent feature of the \(\underline{U_i}\) and \(\underline{V_j}\) blocks can be obtained by differentiating Eq. (4) with respect to \(\underline{U_i}^{(g)}\) and \(\underline{V_j}^{(g)}\) (the \(g^{th}\) columns of \(\underline{U_i}\) and \(\underline{V_j}\)) respectively:
\[
\frac{\partial \mathcal{E}}{\partial \underline{U_i}^{(g)}} = -2\,\underline{E_{ij}}\,\underline{V_j}^{(g)} \tag{5}
\]
\[
\frac{\partial \mathcal{E}}{\partial \underline{V_j}^{(g)}} = -2\,{\underline{E_{ij}}}^T\,\underline{U_i}^{(g)} \tag{6}
\]
Using these block-level gradients for a latent feature, the update equations for the \(g^{th}\) feature of \(\underline{U_i}\) and \(\underline{V_j}\), with learning rate \(\alpha\), follow by stepping against the gradient:
\[
\underline{U_i}^{(g)} \leftarrow \underline{U_i}^{(g)} + 2\alpha\,\underline{E_{ij}}\,\underline{V_j}^{(g)} \tag{7}
\]
\[
\underline{V_j}^{(g)} \leftarrow \underline{V_j}^{(g)} + 2\alpha\,{\underline{E_{ij}}}^T\,\underline{U_i}^{(g)} \tag{8}
\]
Factoring a regularization term with coefficient \(\beta\) into Eqs. (7) and (8) gives:
\[
\underline{U_i}^{(g)} \leftarrow \underline{U_i}^{(g)} + \alpha\left(2\,\underline{E_{ij}}\,\underline{V_j}^{(g)} - \beta\,\underline{U_i}^{(g)}\right) \tag{9}
\]
\[
\underline{V_j}^{(g)} \leftarrow \underline{V_j}^{(g)} + \alpha\left(2\,{\underline{E_{ij}}}^T\,\underline{U_i}^{(g)} - \beta\,\underline{V_j}^{(g)}\right) \tag{10}
\]
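As a quick numerical check of the block-level gradient in Eq. (5), the sketch below compares the analytic expression \(-2\,\underline{E_{ij}}\,\underline{V_j}^{(g)}\) against a finite-difference approximation on a small random block; the block size, rank, and feature index are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(1)
b, k, g = 8, 4, 2                 # block size, rank, latent feature index
X = rng.normal(size=(b, b))       # one block X_ij
U = rng.normal(size=(b, k))       # block U_i
V = rng.normal(size=(b, k))       # block V_j

def loss(U, V):
    E = X - U @ V.T               # block residual E_ij
    return np.sum(E * E)          # squared deviation for this block

# Analytic gradient w.r.t. the g-th column of U (Eq. 5): -2 * E_ij @ V_j^(g)
E = X - U @ V.T
analytic = -2 * E @ V[:, g]

# Forward finite differences over the same column.
eps = 1e-6
numeric = np.zeros(b)
base = loss(U, V)
for r in range(b):
    Up = U.copy()
    Up[r, g] += eps
    numeric[r] = (loss(Up, V) - base) / eps

assert np.allclose(analytic, numeric, atol=1e-3)
print("Eq. (5) gradient check passed")
```

The same check applies symmetrically to Eq. (6) by perturbing a column of V instead.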