Multi-Manifold Matrix Tri-Factorization for Text Data Clustering

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9489)

Abstract

We propose a novel algorithm called Multi-Manifold Co-clustering (MMC). The algorithm considers the geometric structures of the sample manifold and the feature manifold simultaneously. Specifically, multiple graph Laplacian regularization terms are constructed separately to take local invariance into account, and the optimal intrinsic manifold is constructed by linearly combining multiple manifolds. We employ multi-manifold learning to approximate the intrinsic manifold using a subset of candidate manifolds, which better reflects the local geometric structure captured by the graph Laplacian. The candidate manifolds are obtained using various representative manifold-based dimensionality reduction methods. The selected methods are based on different rationales and use different metrics for data distances. Experimental results on several real-world text data sets demonstrate the effectiveness of MMC.
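
The idea of linearly combining candidate manifolds into one graph Laplacian regularizer can be illustrated with a minimal sketch. This is not the authors' implementation: the candidate graphs here are plain k-nearest-neighbour graphs with different k (a stand-in for the dimensionality reduction methods mentioned in the abstract), the weights are assumed to be non-negative and to sum to one, and the names `knn_laplacian`, `combine_laplacians` and `smoothness` are hypothetical.

```python
import numpy as np

def knn_laplacian(X, k=3):
    """Unnormalized Laplacian L = D - W of a symmetrized binary kNN graph."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-neighbours
    idx = np.argsort(d2, axis=1)[:, :k]               # k nearest neighbours per row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                            # make the graph undirected
    return np.diag(W.sum(axis=1)) - W

def combine_laplacians(laplacians, weights):
    """Convex combination sum_i w_i * L_i (simplex constraint is an assumption)."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * L for w, L in zip(weights, laplacians))

def smoothness(G, L):
    """Graph regularization term Tr(G^T L G) for a cluster indicator matrix G."""
    return float(np.trace(G.T @ L @ G))

# Toy data: 8 samples with 5 features (illustrative values only).
rng = np.random.default_rng(0)
X = rng.random((8, 5))

# Candidate manifolds: kNN graphs built with different neighbourhood sizes.
candidates = [knn_laplacian(X, k) for k in (2, 3, 4)]
L_mix = combine_laplacians(candidates, [0.5, 0.3, 0.2])

# Hard assignment of the 8 samples to 2 clusters, encoded as an 8 x 2 indicator.
G = np.eye(2)[np.arange(8) % 2]
penalty = smoothness(G, L_mix)
```

A small penalty means that neighbouring samples on the combined graph tend to share a cluster. In a co-clustering setting the same construction would be repeated on the transposed data matrix to obtain a feature-side Laplacian.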

Corresponding author

Correspondence to Kais Allab.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Allab, K., Labiod, L., Nadif, M. (2015). Multi-Manifold Matrix Tri-Factorization for Text Data Clustering. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9489. Springer, Cham. https://doi.org/10.1007/978-3-319-26532-2_78

  • DOI: https://doi.org/10.1007/978-3-319-26532-2_78

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26531-5

  • Online ISBN: 978-3-319-26532-2

  • eBook Packages: Computer Science, Computer Science (R0)
