Co-clustering from Tensor Data

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11439)

Abstract

With the exponential growth of data collected in fields such as recommender systems (users, items), text mining (documents, terms), and bioinformatics (individuals, genes), co-clustering, the simultaneous clustering of both dimensions of a data matrix, has become a popular technique. Co-clustering aims to obtain homogeneous blocks, which makes row clusters and column clusters easy to interpret jointly. Many approaches exist; in this paper we rely on the latent block model (LBM), which is flexible enough to model different types of data matrices. We extend its use to tensor (3D matrix) data by proposing a Tensor LBM (TLBM) that allows different relations between entities. To show the interest of TLBM, we consider continuous and binary datasets. To estimate the parameters, we develop a variational EM algorithm. Its performance is evaluated on synthetic and real datasets to highlight different possible applications.

Notes

  1. http://grouplens.org/datasets/movielens/

Author information

Corresponding author

Correspondence to Rafika Boutalbi.

A Appendix: Update \(\tilde{z}_{ik}\) and \(\tilde{w}_{j\ell }\) \(\forall i,k,j,\ell \)

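To obtain the expression of \(\tilde{z}_{ik}\), we maximize the above soft criterion \(F_C(\tilde{\mathbf {z}},\tilde{\mathbf {w}};\varOmega )\) with respect to \(\tilde{z}_{ik}\), subject to the constraint \(\sum _{k}\tilde{z}_{ik}=1\). For a fixed row \(i\), the corresponding Lagrangian, up to terms that do not depend on \(\tilde{z}_{ik}\), is given by:

\[ L(\tilde{\mathbf {z}},\beta ) = \sum _{k}\tilde{z}_{ik}\log \pi _{k} + \sum _{k,j,\ell }\tilde{z}_{ik}\tilde{w}_{j\ell }\log \varPhi (\mathbf {x}_{ij},\varvec{\lambda }_{k\ell }) - \sum _{k}\tilde{z}_{ik}\log \tilde{z}_{ik} + \beta \Big (1-\sum _{k}\tilde{z}_{ik}\Big ). \]

Taking the derivative with respect to \(\tilde{z}_{ik}\), we obtain:

\[ \frac{\partial L}{\partial \tilde{z}_{ik}} = \log \pi _{k} + \sum _{j,\ell }\tilde{w}_{j\ell }\log \varPhi (\mathbf {x}_{ij},\varvec{\lambda }_{k\ell }) - \log \tilde{z}_{ik} - 1 - \beta . \]

Setting this derivative to zero yields:

\[ \tilde{z}_{ik}\exp (\beta + 1) = \pi _{k}\exp \Big (\sum _{j,\ell }\tilde{w}_{j\ell }\log \varPhi (\mathbf {x}_{ij},\varvec{\lambda }_{k\ell })\Big ). \]

Summing both sides over all \(k'\) and using the constraint \(\sum _{k'}\tilde{z}_{ik'}=1\) yields \(\exp (\beta + 1)= \sum _{k'} \pi _{k'} \exp \big (\sum _{j,\ell }\tilde{w}_{j\ell }\log \varPhi (\mathbf {x}_{ij},\varvec{\lambda }_{k'\ell })\big ).\) Plugging \(\exp (\beta + 1)\) back in leads to:

\[ \tilde{z}_{ik} = \frac{\pi _{k}\exp \big (\sum _{j,\ell }\tilde{w}_{j\ell }\log \varPhi (\mathbf {x}_{ij},\varvec{\lambda }_{k\ell })\big )}{\sum _{k'}\pi _{k'}\exp \big (\sum _{j,\ell }\tilde{w}_{j\ell }\log \varPhi (\mathbf {x}_{ij},\varvec{\lambda }_{k'\ell })\big )}. \]

In the same way, we can estimate \(\tilde{w}_{j\ell }\) by maximizing \(F_C(\tilde{\mathbf {z}},\tilde{\mathbf {w}};\varOmega )\) with respect to \(\tilde{w}_{j\ell }\), subject to the constraint \(\sum _{\ell }\tilde{w}_{j\ell }=1\); writing \(\rho _{\ell }\) for the column-cluster proportions (the counterpart of \(\pi _{k}\)), we obtain

\[ \tilde{w}_{j\ell } = \frac{\rho _{\ell }\exp \big (\sum _{i,k}\tilde{z}_{ik}\log \varPhi (\mathbf {x}_{ij},\varvec{\lambda }_{k\ell })\big )}{\sum _{\ell '}\rho _{\ell '}\exp \big (\sum _{i,k}\tilde{z}_{ik}\log \varPhi (\mathbf {x}_{ij},\varvec{\lambda }_{k\ell '})\big )}. \]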
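The row update derived in this appendix is a softmax over row clusters: each row's posterior for cluster \(k\) is proportional to \(\pi _{k}\) times the exponential of the weighted sum of log-densities, and the normalizer plays the role of \(\exp (\beta + 1)\). The following is a minimal pure-Python sketch of that single step, not the authors' implementation. The function name `update_row_posteriors`, the nested-list layout `log_phi[i][j][k][l]`, and the toy example (scalar entries with a unit-variance Gaussian density in place of the tensor-valued \(\varPhi \)) are all assumptions made for illustration.

```python
import math


def update_row_posteriors(log_phi, w_tilde, pi):
    """Row-cluster update (illustrative sketch).

    log_phi[i][j][k][l] holds log Phi(x_ij, lambda_kl),
    w_tilde[j][l] holds the current column posteriors,
    pi[k] holds the row-cluster proportions.
    Returns z_tilde with z_tilde[i][k] summing to 1 over k.
    """
    z_tilde = []
    for i in range(len(log_phi)):
        # Unnormalized log-posterior of each row cluster k for row i.
        scores = []
        for k in range(len(pi)):
            s = math.log(pi[k])
            for j, w_j in enumerate(w_tilde):
                for l, w_jl in enumerate(w_j):
                    s += w_jl * log_phi[i][j][k][l]
            scores.append(s)
        # Subtract the maximum before exponentiating for numerical stability;
        # the ratio is unchanged, so this matches the normalized update.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        z_tilde.append([wt / total for wt in weights])
    return z_tilde


def gauss_logpdf(x, mean):
    # Assumed density for the toy example: unit-variance Gaussian.
    return -0.5 * (x - mean) ** 2 - 0.5 * math.log(2.0 * math.pi)


# Toy data: 2 rows, 2 columns, 2 row clusters, 2 column clusters.
x = [[0.0, 4.0], [4.0, 0.0]]            # hypothetical scalar observations
mu = [[0.0, 4.0], [4.0, 0.0]]           # hypothetical block means mu[k][l]
w_tilde = [[1.0, 0.0], [0.0, 1.0]]      # column j assigned to cluster l = j
pi = [0.5, 0.5]

log_phi = [[[[gauss_logpdf(x[i][j], mu[k][l]) for l in range(2)]
             for k in range(2)]
            for j in range(2)]
           for i in range(2)]

z_tilde = update_row_posteriors(log_phi, w_tilde, pi)
```

In this toy setup each row matches the block means of exactly one row cluster, so the posteriors concentrate on that cluster. The column update has the same shape with the roles of rows and columns exchanged.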

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Boutalbi, R., Labiod, L., Nadif, M. (2019). Co-clustering from Tensor Data. In: Yang, Q., Zhou, ZH., Gong, Z., Zhang, ML., Huang, SJ. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2019. Lecture Notes in Computer Science, vol 11439. Springer, Cham. https://doi.org/10.1007/978-3-030-16148-4_29

  • DOI: https://doi.org/10.1007/978-3-030-16148-4_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-16147-7

  • Online ISBN: 978-3-030-16148-4

  • eBook Packages: Computer Science (R0)
