A Guide to the Tucker Tensor Decomposition for Data Mining: Exploratory Analysis, Clustering and Classification

Gillet, Annabelle; Leclercq, Éric; Sautot, Lucile

doi:10.1007/978-3-662-68014-8_3

Part of the book series: Lecture Notes in Computer Science (TLDKS, volume 14160)

Abstract

Tensors are powerful multi-dimensional mathematical objects that easily embed various data models, such as relational, graph or time series data. Furthermore, tensor decomposition operators are of great utility for revealing hidden patterns and complex relationships in data. Among these decompositions, the Tucker decomposition factorizes a tensor into a smaller core tensor and a set of factor matrices. In this article, we study the capabilities of the Tucker decomposition when it is used in data mining techniques such as exploratory analysis, clustering and classification of data. We apply these techniques to practical examples using several datasets that have a ground truth. This is preliminary work toward adding the Tucker decomposition to the Tensor Data Model, a model that aims to make tensors data-centric and to optimize operators in order to enable the manipulation of large tensors.
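To make the factorization concrete, the following is a minimal sketch of a truncated higher-order SVD (HOSVD), one standard way to compute a Tucker decomposition. It is an illustration under stated assumptions, not necessarily the algorithm used in the chapter: it assumes NumPy, and the helper names (`unfold`, `mode_product`, `hosvd`, `reconstruct`) and the synthetic test tensor are hypothetical.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize `tensor`: rows index `mode`, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """n-mode product: multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, -1)   # put the target mode last
    result = moved @ matrix.T               # contract it with the matrix
    return np.moveaxis(result, -1, mode)    # restore the axis order


def hosvd(tensor, ranks):
    """Truncated HOSVD: returns a core tensor and one factor matrix per mode."""
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of the mode unfolding span that mode.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, factor in enumerate(factors):
        # Project each mode onto its factor to obtain the smaller core.
        core = mode_product(core, factor.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Rebuild the (approximated) tensor from the core and factor matrices."""
    out = core
    for mode, factor in enumerate(factors):
        out = mode_product(out, factor, mode)
    return out


# Hypothetical example: a 5 x 6 x 7 tensor built to have multilinear rank (2, 2, 2).
rng = np.random.default_rng(0)
true_core = rng.standard_normal((2, 2, 2))
true_factors = [rng.standard_normal((n, 2)) for n in (5, 6, 7)]
tensor = reconstruct(true_core, true_factors)

core, factors = hosvd(tensor, ranks=(2, 2, 2))
print(core.shape, [f.shape for f in factors])
print(np.allclose(reconstruct(core, factors), tensor))
```

In this sketch the core has 8 entries and the factor matrices 36 in total, against 210 entries in the original tensor. For data without an exact low multilinear rank, the reconstruction is only an approximation, and the chosen ranks control the trade-off between compression and fidelity.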

Notes

1. https://github.com/AnnabelleGillet/Tucker-experiments
2. https://archive.ics.uci.edu/ml/datasets/iris
3. Most of the works presenting the classification technique do not perform a duplication, and directly compare the partial core tensor against each sample of each class [5, 11]. This is less efficient, as it implies at most \(s \times c\) comparisons, while duplicating the element reduces the number of comparisons to \(c\). In our experiments, we found that duplicating the element also works better, as it allows the sample to be compared with a unified pattern of each class rather than with an outlier that could negatively impact the result.

References

1. Al-Sharoa, E., Al-Khassaweneh, M., Aviyente, S.: A tensor based framework for community detection in dynamic networks. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2312–2316. IEEE (2017)
2. Angles, R., Arenas, M., Barceló, P., Hogan, A., Reutter, J., Vrgoč, D.: Foundations of modern query languages for graph databases. ACM Comput. Surv. (CSUR) 50(5), 1–40 (2017)
3. Araujo, M., et al.: Com2: fast automatic discovery of temporal (‘Comet’) communities. In: Tseng, V.S., Ho, T.B., Zhou, Z.-H., Chen, A.L.P., Kao, H.-Y. (eds.) PAKDD 2014. LNCS (LNAI), vol. 8444, pp. 271–283. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-06605-9_23
4. Atikoglu, B., Xu, Y., Frachtenberg, E., Jiang, S., Paleczny, M.: Workload analysis of a large-scale key-value store. In: ACM SIGMETRICS Performance Evaluation Review, vol. 40, pp. 53–64. ACM (2012)
5. Brandoni, D., Simoncini, V.: Tensor-train decomposition for image recognition. Calcolo 57, 1–24 (2020)
6. Chachlakis, D.G., Prater-Bennette, A., Markopoulos, P.P.: L1-norm tucker tensor decomposition. IEEE Access 7, 178454–178465 (2019)
7. Cichocki, A., Zdunek, R., Phan, A.H., Amari, S.: Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-way Data Analysis and Blind Source Separation. Wiley, Chichester (2009)
8. De Lathauwer, L., De Moor, B., Vandewalle, J.: A multilinear singular value decomposition. SIAM J. Matrix Anal. Appl. 21(4), 1253–1278 (2000)
9. Deng, L.: The MNIST database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Process. Mag. 29(6), 141–142 (2012)
10. Duan, L., Xiao, C., Li, M., Ding, M., Yang, C.: a-tucker: fast input-adaptive and matricization-free tucker decomposition of higher-order tensors on GPUs. CCF Trans. High Perform. Comput. 5(1), 12–25 (2023)
11. Eldén, L.: Matrix methods in data mining and pattern recognition. SIAM (2007)
12. Fernandes, S., Fanaee-T, H., Gama, J.: Tensor decomposition for analysing time-evolving social networks: an overview. Artif. Intell. Rev. 54, 2891–2916 (2021)
13. Gillet, A., Leclercq, É., Cullot, N.: MuLOT: multi-level optimization of the canonical polyadic tensor decomposition at large-scale. In: Bellatreche, L., Dumas, M., Karras, P., Matulevičius, R. (eds.) ADBIS 2021. LNCS, vol. 12843, pp. 198–212. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-82472-3_15
14. Gillet, A., Leclercq, É., Sautot, L.: The tucker tensor decomposition for data analysis: capabilities and advantages. In: 38ème Conférence sur la Gestion de Données (BDA) (2022)
15. Gillet, A., Leclercq, É., Savonnet, M., Cullot, N.: Empowering big data analytics with polystore and strongly typed functional queries. In: Symposium on International Database Engineering & Applications, pp. 1–10 (2020)
16. Gray, J., et al.: Data cube: a relational aggregation operator generalizing group-by, cross-tab, and sub-totals. Data Min. Knowl. Disc. 1(1), 29–53 (1997)
17. Hamilton, J.D.: Time Series Analysis. Princeton University Press, Princeton (2020)
18. Hore, V., et al.: Tensor decomposition for multiple-tissue gene expression experiments. Nat. Genet. 48(9), 1094–1100 (2016)
19. Hou, Z., Li, W., Tao, R., Du, Q.: Three-order tucker decomposition and reconstruction detector for unsupervised hyperspectral change detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 14, 6194–6205 (2021)
20. Huang, H., Ding, C., Luo, D., Li, T.: Simultaneous tensor subspace selection and clustering: the equivalence of high order SVD and k-means clustering. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 327–335 (2008)
21. Hubert, L., Arabie, P.: Comparing partitions. J. Classif. 2, 193–218 (1985)
22. Jang, J.G., Kang, U.: D-tucker: fast and memory-efficient tucker decomposition for dense tensors. In: 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 1850–1853. IEEE (2020)
23. Jang, J.G., Kang, U.: Static and streaming tucker decomposition for dense tensors. ACM Trans. Knowl. Discov. Data 17(5), 1–34 (2023)
24. Kanellakis, P.C.: Elements of relational database theory. In: Formal Models and Semantics, pp. 1073–1156. Elsevier (1990)
25. Kaufman, L., Rousseeuw, P.J.: Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, Hoboken (2009)
26. Kivelä, M., Arenas, A., Barthelemy, M., Gleeson, J.P., Moreno, Y., Porter, M.A.: Multilayer networks. J. Complex Netw. 2(3), 203–271 (2014)
27. Kolda, T.G., Bader, B.W.: Tensor decompositions and applications. SIAM Rev. 51(3), 455–500 (2009)
28. Leclercq, É., Gillet, A., Grison, T., Savonnet, M.: Polystore and tensor data model for logical data independence and impedance mismatch in big data analytics. In: Hameurlain, A., Wagner, R. (eds.) Transactions on Large-Scale Data- and Knowledge-Centered Systems XLII. LNCS, vol. 11860, pp. 51–90. Springer, Heidelberg (2019). https://doi.org/10.1007/978-3-662-60531-8_3
29. Lee, J., Chon, K.W., Kim, M.S.: A GPU-based tensor decomposition method for large-scale tensors. In: 2023 IEEE International Conference on Big Data and Smart Computing (BigComp), pp. 77–80. IEEE (2023)
30. Li, L., Lin, X., Liu, H., Lu, W., Zhou, B., Zhu, J.: Displacement data imputation in urban internet of things system based on tucker decomposition with l2 regularization. IEEE Internet Things J. 9(15), 13315–13326 (2022)
31. Nene, S.A., Nayar, S.K., Murase, H., et al.: Columbia object image library (coil-20) (1996)
32. Osman, A.S.: Data mining techniques. Int. J. Data Sci. Res. 2 (2019)
33. Pandey, S.K., Shekhawat, H.S., Prasanna, S.: Attention gated tensor neural network architectures for speech emotion recognition. Biomed. Signal Process. Control 71, 103173 (2022)
34. Papalexakis, E.E., Akoglu, L., Ienco, D.: Do more views of a graph help? Community detection and clustering in multi-graphs. In: International Conference on Information Fusion, pp. 899–905. IEEE (2013)
35. Papalexakis, E.E., Faloutsos, C., Sidiropoulos, N.D.: Tensors for data mining and data fusion: models, applications, and scalable algorithms. Trans. Intell. Syst. Technol. (TIST) 8(2), 16 (2016)
36. Petersohn, D., et al.: Towards scalable dataframe systems. arXiv preprint arXiv:2001.00888 (2020)
37. Phan, A.H., Cichocki, A.: Extended HALS algorithm for nonnegative tucker decomposition and its applications for multiway analysis and classification. Neurocomputing 74(11), 1956–1969 (2011)
38. Romeo, S., Tagarelli, A., Ienco, D.: Semantic-based multilingual document clustering via tensor modeling. In: EMNLP: Empirical Methods in Natural Language Processing, pp. 600–609 (2014)
39. Rush, A.: Tensor Considered Harmful. Technical report, Harvard NLP (2010). http://nlp.seas.harvard.edu/NamedTensor
40. Shao, P., Zhang, D., Yang, G., Tao, J., Che, F., Liu, T.: Tucker decomposition-based temporal knowledge graph completion. Knowl.-Based Syst. 238, 107841 (2022)
41. Sidiropoulos, N.D., De Lathauwer, L., Fu, X., Huang, K., Papalexakis, E.E., Faloutsos, C.: Tensor decomposition for signal processing and machine learning. Trans. Signal Process. 65(13), 3551–3582 (2017)
42. Stehlé, J., et al.: High-resolution measurements of face-to-face contact patterns in a primary school. PLoS ONE 6(8), e23176 (2011)
43. Sun, J., Papadimitriou, S., Lin, C.Y., Cao, N., Liu, S., Qian, W.: Multivis: content-based social network exploration through multi-way visual analysis. In: Proceedings of the 2009 SIAM International Conference on Data Mining, pp. 1064–1075. SIAM (2009)
44. Sun, J., Tao, D., Faloutsos, C.: Beyond streams and graphs: dynamic tensor analysis. In: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 374–383. ACM (2006)
45. Tucker, L.R.: Some mathematical notes on three-mode factor analysis. Psychometrika 31(3), 279–311 (1966)
46. Yang, K., et al.: Tagited: predictive task guided tensor decomposition for representation learning from electronic health records. In: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (2017)
47. Zhou, G., Cichocki, A., Zhao, Q., Xie, S.: Efficient nonnegative tucker decompositions: algorithms and uniqueness. IEEE Trans. Image Process. 24(12), 4990–5003 (2015)

Author information

Authors and Affiliations

LIB Univ. Bourgogne Franche Comté EA7534, Dijon, France
Annabelle Gillet & Éric Leclercq
UMR TETIS, AgroParisTech, Montpellier, France
Lucile Sautot

Corresponding author

Correspondence to Annabelle Gillet.

Editor information

Editors and Affiliations

Paul Sabatier University, IRIT, Toulouse, France
Abdelkader Hameurlain
Technical University of Vienna, IFS, Vienna, Austria
A Min Tjoa
Aix-Marseille University, LIS, Marseille, France
Omar Boucelma
Université Clermont Auvergne, CNRS, LIMOS, Aubiere, France
Farouk Toumani

About this chapter

Cite this chapter

Gillet, A., Leclercq, É., Sautot, L. (2023). A Guide to the Tucker Tensor Decomposition for Data Mining: Exploratory Analysis, Clustering and Classification. In: Hameurlain, A., Tjoa, A.M., Boucelma, O., Toumani, F. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems LIV. Lecture Notes in Computer Science, vol 14160. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-68014-8_3

DOI: https://doi.org/10.1007/978-3-662-68014-8_3
Published: 22 September 2023
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-68013-1
Online ISBN: 978-3-662-68014-8
eBook Packages: Computer Science, Computer Science (R0)
