
Large-scale Tucker tensor factorization for sparse and accurate decomposition

The Journal of Supercomputing

Abstract

How can we generate sparse tensor decomposition results for better interpretability? Typical tensor decomposition results are dense, and dense results require additional postprocessing for interpretation, especially when the data are large. We therefore present a large-scale Tucker factorization method for sparse and accurate tensor decomposition, called Very Sparse Tucker factorization (VeST). VeST produces highly sparse decomposition results from large-scale, partially observable tensor data. The method first decomposes the input tensor, then iteratively identifies unimportant elements, removes them, and updates the remaining elements until a terminal state is reached. To identify unimportant elements, we define the 'responsibility' of each element of the decomposition results for the reconstruction error. The remaining elements are updated iteratively and in parallel using carefully constructed coordinate descent rules for scalable computation. Furthermore, the method automatically searches for the optimal sparsity ratio, yielding a balanced sparsity-accuracy trade-off. Extensive experiments on real-world datasets show that our method produces more accurate results than competing methods, and that it scales well with the input dimensionality, the number of observable entries, and the number of threads.
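The pipeline the abstract describes — decompose, measure each element's 'responsibility' for the reconstruction error, then prune the least responsible elements — can be sketched in a few lines. The following is a minimal illustrative sketch, not the authors' VeST implementation: responsibility is approximated here by the error increase observed when an element is zeroed out, the coordinate-descent update step and sparsity-ratio search are omitted, and all function names are our own.

```python
import numpy as np

def reconstruct(core, factors):
    # Tucker reconstruction: X ~ core x1 A x2 B x3 C
    A, B, C = factors
    return np.einsum('abc,ia,jb,kc->ijk', core, A, B, C)

def rec_error(X, core, factors):
    # Frobenius-norm reconstruction error
    return np.linalg.norm(X - reconstruct(core, factors))

def responsibilities(X, core, factors, mode):
    # Responsibility proxy for each element of factors[mode]:
    # the increase in reconstruction error when that element is zeroed.
    base = rec_error(X, core, factors)
    F = factors[mode]
    resp = np.zeros_like(F)
    for idx in np.ndindex(F.shape):
        saved = F[idx]
        F[idx] = 0.0
        resp[idx] = rec_error(X, core, factors) - base
        F[idx] = saved  # restore the element
    return resp

def prune_lowest(factors, resps, frac):
    # Zero out the given fraction of factor entries with the
    # smallest responsibility, across all factor matrices.
    all_resp = np.concatenate([r.ravel() for r in resps])
    k = int(frac * all_resp.size)
    if k == 0:
        return
    thresh = np.partition(all_resp, k - 1)[k - 1]
    for F, r in zip(factors, resps):
        F[r <= thresh] = 0.0

# Toy demo: a small noiseless 3-way tensor built from known factors.
rng = np.random.default_rng(0)
I, J, K, R = 6, 5, 4, 2
factors = [rng.standard_normal((n, R)) for n in (I, J, K)]
core = rng.standard_normal((R, R, R))
X = reconstruct(core, factors)

resps = [responsibilities(X, core, factors, m) for m in range(3)]
prune_lowest(factors, resps, frac=0.3)

# Fraction of zeroed factor entries after pruning (~ the requested fraction)
sparsity = sum((F == 0).sum() for F in factors) / sum(F.size for F in factors)
```

In the actual method, pruning alternates with parallel coordinate-descent updates of the surviving elements, so accuracy is recovered after each pruning step rather than left degraded as in this sketch.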







Acknowledgements

The publication of this article has been funded by the Basic Science Research Program through the National Research Foundation of Korea (2018R1A1A3A0407953, 2018R1A5A1060031).

Author information


Corresponding author

Correspondence to Lee Sael.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Jang, JG., Park, M., Lee, J. et al. Large-scale Tucker tensor factorization for sparse and accurate decomposition. J Supercomput 78, 17992–18022 (2022). https://doi.org/10.1007/s11227-022-04559-4

