
BMF: Matrix Factorization of Large Scale Data Using Block Based Approach

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11671)

Abstract

Matrix factorization of large-scale matrices is a memory-intensive task. Alternative convergence techniques are needed when the input matrix and the latent feature matrices are larger than the available memory, both on a Central Processing Unit (CPU) and on a Graphical Processing Unit (GPU). While alternating least squares (ALS) convergence on a CPU can take prohibitively long, loading all the required matrices into GPU memory may not be possible when the dimensions are significantly high. In this paper, we introduce a novel technique based on dividing the entire data into block matrices and applying Stochastic Gradient Descent (SGD) based factorization at the block level.
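As a rough illustration of the block-wise scheme described in the abstract (a minimal sketch, not the authors' exact algorithm), the NumPy snippet below partitions a large matrix into tiles and applies SGD-style factor updates one tile at a time, so that only one data block and its matching factor slices need to be resident in memory. The function name, block size, and hyperparameter values are illustrative assumptions.

```python
import numpy as np

def factorize_in_blocks(X, k=32, block=1024, epochs=10, alpha=0.002, beta=0.02):
    """Hypothetical block-wise SGD factorization sketch: X is processed one
    (block x block) tile at a time together with the matching slices of U and V."""
    m, n = X.shape
    U = 0.1 * np.random.rand(m, k)   # row-factor matrix
    V = 0.1 * np.random.rand(n, k)   # column-factor matrix
    for _ in range(epochs):
        for i in range(0, m, block):           # row blocks
            for j in range(0, n, block):       # column blocks
                Xij = X[i:i + block, j:j + block]   # data block (could live on GPU)
                Ui = U[i:i + block]                 # matching factor slices
                Vj = V[j:j + block]
                Eij = Xij - Ui @ Vj.T               # block residual
                # SGD-style block updates (cf. Eqs. (9) and (10) in the appendix)
                U[i:i + block] = Ui + alpha * (2 * Eij @ Vj - beta * Ui)
                V[j:j + block] = Vj + alpha * (2 * Eij.T @ Ui - beta * Vj)
    return U, V
```

On a GPU, the same loop structure would transfer one block and its factor slices to device memory, update them, and write the factors back before the next block is processed, which is what keeps the memory footprint bounded.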



Author information


Corresponding author

Correspondence to Prasad Bhavana.


A Derivation of SGD Update Equations for BMF

Our objective is to find \(\underline{U_i}, \underline{V_j}\) for \(\underline{X_{ij}}\) such that \(\underline{X_{ij}} \approx \underline{U_i}{\underline{V_j}}^T\). Let \(\underline{E_{ij}} = \underline{X_{ij}} - \underline{X_{ij}}^\prime\) denote the deviation of the estimate \(\underline{X_{ij}}^\prime = \underline{U_i}{\underline{V_j}}^T\) from the actual \(\underline{X_{ij}}\). Hence, the sum of squared deviations (\(\mathcal {E}\)) can be represented as:

$$\begin{aligned} \displaystyle {\mathcal {E}} = ||\underline{X_{ij}} -\underline{X_{ij}}^\prime ||^2 = ||\underline{X_{ij}} - \underline{U_{i}} {\underline{V_j}}^T||^2 \end{aligned}$$
(4)

The goal is to find \(\underline{U_i}\) and \(\underline{V_j}\) such that the sum of squared deviations is minimal. The optimum value for the \(g^{th}\) latent feature of the \(\underline{U_i}\) and \(\underline{V_j}\) blocks is obtained by minimizing Eq. (4); differentiating it with respect to \(\underline{U_i}_{*g}\) and \(\underline{V_j}_{*g}\) respectively gives:

$$\begin{aligned}&\displaystyle \frac{\partial \mathcal {E}}{\partial \underline{U_i}_{*g}} = -2\,( \underline{X_{ij}} - \underline{X_{ij}}^\prime )\,\underline{V_j}_{*g} \end{aligned}$$
(5)
$$\begin{aligned}&\displaystyle \frac{\partial \mathcal {E}}{\partial \underline{V_j}_{*g}} = -2\,( \underline{X_{ij}} - \underline{X_{ij}}^\prime )^T\,\underline{U_i}_{*g} \end{aligned}$$
(6)

Using the block-level gradients for a latent feature, the gradient descent update equations for the \(g^{th}\) feature of \(\underline{U_i}\) and \(\underline{V_j}\) are:

$$\begin{aligned}&\displaystyle \underline{U_i}_{*g}^\prime = \underline{U_i}_{*g} - \alpha \frac{\partial \mathcal {E}}{\partial \underline{U_i}_{*g}} = \underline{U_i}_{*g} + 2\alpha \,\underline{E_{ij}}\,\underline{V_j}_{*g} \end{aligned}$$
(7)
$$\begin{aligned}&\displaystyle \underline{V_j}_{*g}^\prime = \underline{V_j}_{*g} - \alpha \frac{\partial \mathcal {E}}{\partial \underline{V_j}_{*g}} = \underline{V_j}_{*g} + 2\alpha \,\underline{E_{ij}}^{T}\,\underline{U_i}_{*g} \end{aligned}$$
(8)

Factoring a regularization term (with coefficient \(\beta\)) into Eqs. (7) and (8) gives us:

$$\begin{aligned}&\displaystyle \underline{U_i}_{*g}^\prime = \underline{U_i}_{*g} + \alpha (2\,\underline{E_{ij}}\,\underline{V_j}_{*g} - \beta \,\underline{U_i}_{*g}) \end{aligned}$$
(9)
$$\begin{aligned}&\displaystyle \underline{V_j}_{*g}^\prime = \underline{V_j}_{*g} + \alpha (2\,\underline{E_{ij}}^{T}\,\underline{U_i}_{*g} - \beta \,\underline{V_j}_{*g}) \end{aligned}$$
(10)
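For concreteness, the following NumPy sketch implements the per-feature block updates of Eqs. (9) and (10) for a single block \(\underline{X_{ij}}\). It assumes dense blocks; the function name, looping order, and default step sizes are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def block_sgd_step(Xij, Ui, Vj, alpha=0.002, beta=0.02):
    """One SGD pass over a single block Xij, updating the factor blocks Ui and Vj
    in place, one latent feature g at a time as in the derivation above."""
    k = Ui.shape[1]
    for g in range(k):
        Eij = Xij - Ui @ Vj.T                       # block residual, Eq. (4)
        Ug, Vg = Ui[:, g].copy(), Vj[:, g].copy()   # g-th latent feature columns
        Ui[:, g] = Ug + alpha * (2 * Eij @ Vg - beta * Ug)    # Eq. (9)
        Vj[:, g] = Vg + alpha * (2 * Eij.T @ Ug - beta * Vg)  # Eq. (10)
    return Ui, Vj
```

Recomputing the residual \(\underline{E_{ij}}\) after each feature update is one possible scheduling choice; updating all features from a single residual is another.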


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Bhavana, P., Padmanabhan, V. (2019). BMF: Matrix Factorization of Large Scale Data Using Block Based Approach. In: Nayak, A., Sharma, A. (eds) PRICAI 2019: Trends in Artificial Intelligence. PRICAI 2019. Lecture Notes in Computer Science (LNAI), vol. 11671. Springer, Cham. https://doi.org/10.1007/978-3-030-29911-8_33

  • DOI: https://doi.org/10.1007/978-3-030-29911-8_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29910-1

  • Online ISBN: 978-3-030-29911-8

