Abstract
The demand for searching and querying multimedia data, such as images, video, and audio, is omnipresent, and effectively accessing such data for various applications is a critical task. These data are usually encoded as multi-dimensional arrays, or tensors, and traditional data mining techniques can be limited by the curse of dimensionality. Tensor decomposition has been proposed to alleviate this issue; commonly used tensor decomposition algorithms include CP decomposition (which seeks a diagonal core) and Tucker decomposition (which seeks a dense core). Naturally, Tucker maintains more information, but, due to the denseness of the core, it suffers from memory growth that is exponential in the number of tensor modes. Tensor train (TT) decomposition addresses this problem by seeking a sequence of three-mode cores; unfortunately, there are currently no guidelines for selecting the decomposition sequence. In this paper, we propose GTT, a method for guiding the tensor train in selecting the decomposition sequence. GTT leverages data characteristics (including the number of modes, the lengths of the individual modes, density, the distribution of mutual information, and the distribution of entropy) as well as the target decomposition rank to pick a decomposition order that preserves information. Experiments with various data sets demonstrate that GTT effectively guides the TT-decomposition process towards decomposition sequences that better preserve accuracy.
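To make the construction concrete, below is a minimal NumPy sketch of the standard TT-SVD procedure, which factors a d-mode tensor into a chain of three-mode cores through successive truncated SVDs. This is an illustrative sketch, not the authors' GTT implementation; the tensor shape, the target rank of 2, and the two mode orders compared in the closing loop are arbitrary choices for illustration. The loop demonstrates the problem GTT targets: permuting the modes before decomposition changes the approximation error achieved at the same target rank.

```python
# Minimal TT-SVD sketch (illustrative; not the authors' GTT code).
import numpy as np

def tt_svd(X, max_rank):
    """Factor X into a chain of three-mode TT cores of shape (r_{k-1}, n_k, r_k)."""
    dims = X.shape
    cores, r_prev = [], 1
    C = np.asarray(X)
    for k in range(len(dims) - 1):
        # Unfold the remainder so mode k (times the incoming rank) is on the rows.
        C = C.reshape(r_prev * dims[k], -1)
        U, S, Vt = np.linalg.svd(C, full_matrices=False)
        r_k = min(max_rank, len(S))            # truncate to the target TT rank
        cores.append(U[:, :r_k].reshape(r_prev, dims[k], r_k))
        C = S[:r_k, None] * Vt[:r_k]           # carry the residue to the next mode
        r_prev = r_k
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into a full tensor."""
    T = cores[0]
    for G in cores[1:]:
        T = np.tensordot(T, G, axes=([-1], [0]))
    return T.squeeze(axis=(0, -1))

# Permuting the modes before decomposition changes the error at a fixed rank;
# the shape, rank, and the two orders below are arbitrary illustrative picks.
rng = np.random.default_rng(0)
X = rng.random((4, 8, 6, 3))
for order in [(0, 1, 2, 3), (3, 0, 2, 1)]:
    Xp = np.transpose(X, order)
    err = np.linalg.norm(Xp - tt_reconstruct(tt_svd(Xp, 2)))
    print(order, round(err / np.linalg.norm(Xp), 4))
```

On unstructured random input the two orders may score similarly, but on structured real data the gap between the best and worst decomposition sequences at a fixed rank can be substantial; that gap is what GTT's data-driven ordering is designed to close.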
This work has been supported by NSF grants #1633381, #1909555, #1629888, #2026860, and #1827757, DOD grant W81XWH-19-1-0514, a DOE CYDRES grant, and European Commission grant #690817. Experiments for the paper were conducted using the NSF testbed “Chameleon: A Large-Scale Re-configurable Experimental Environment for Cloud Research”.
Notes
- 1. Alternative definitions of entropy may be used for dense tensors.
- 2.
- 3. Our implementation and data sets can be found at: https://shorturl.at/DMOSY.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, M.L., Candan, K.S., Sapino, M.L. (2020). GTT: Guiding the Tensor Train Decomposition. In: Satoh, S., et al. (eds.) Similarity Search and Applications. SISAP 2020. Lecture Notes in Computer Science, vol. 12440. Springer, Cham. https://doi.org/10.1007/978-3-030-60936-8_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60935-1
Online ISBN: 978-3-030-60936-8