A Generative Dyadic Aspect Model for Evidence Accumulation Clustering

  • Conference paper
Similarity-Based Pattern Recognition (SIMBAD 2011)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7005)

Abstract

Evidence accumulation clustering (EAC) is a clustering combination method in which a pairwise similarity matrix (the so-called co-association matrix) is learnt from a clustering ensemble. This co-association matrix counts the co-occurrences (in the same cluster) of pairs of objects, thus avoiding the cluster correspondence problem faced by many other clustering combination approaches. Starting from the observation that co-occurrences are a special type of dyads, we propose to model co-association using a generative aspect model for dyadic data. Under the proposed model, the extraction of a consensus clustering corresponds to solving a maximum likelihood estimation problem, which we address using the expectation-maximization algorithm. We refer to the resulting method as the probabilistic ensemble clustering algorithm (PEnCA). Moreover, placing the problem in a probabilistic framework allows model selection criteria to be used to choose the number of clusters automatically. To compare our method with other combination techniques (also based on probabilistic modeling of the clustering ensemble problem), we performed experiments with synthetic and real benchmark datasets, showing that the proposed approach leads to competitive results.
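
The two ingredients named in the abstract can be illustrated with a minimal Python sketch. The first function builds the co-association matrix exactly as described: it counts how often each pair of objects falls in the same cluster across the ensemble. The second function is only a generic symmetric aspect model in the style of Hofmann's dyadic models, fitted by EM; it is an assumption about the general form and not necessarily the parametrization used in PEnCA. The names `co_association` and `aspect_em`, the tied conditional `cond`, and the decision to count self-pairs on the diagonal are all hypothetical choices made for this illustration.

```python
import numpy as np

def co_association(ensemble):
    """Count, for each pair of objects, the partitions that co-cluster them."""
    ensemble = np.asarray(ensemble)  # shape: (n_partitions, n_objects)
    n_objects = ensemble.shape[1]
    counts = np.zeros((n_objects, n_objects), dtype=int)
    for labels in ensemble:
        # True where two objects carry the same label in this partition.
        # Labels are only compared within a partition, so no cluster
        # correspondence across partitions is needed.
        counts += (labels[:, None] == labels[None, :]).astype(int)
    return counts

def aspect_em(counts, n_aspects, n_iter=100, seed=0):
    """EM for an assumed model P(i, j) = sum_z P(z) P(i|z) P(j|z)."""
    rng = np.random.default_rng(seed)
    n = counts.shape[0]
    tiny = np.finfo(float).tiny
    prior = np.full(n_aspects, 1.0 / n_aspects)   # P(z)
    cond = rng.random((n, n_aspects))             # P(i | z), one column per z
    cond /= cond.sum(axis=0, keepdims=True)
    total = counts.sum()
    observed = counts > 0
    log_liks = []
    for _ in range(n_iter):
        # joint[i, j, z] = P(z) P(i|z) P(j|z)
        joint = prior[None, None, :] * cond[:, None, :] * cond[None, :, :]
        mix = joint.sum(axis=2)
        # Log-likelihood of the current parameters over observed pairs only.
        log_liks.append(float((counts[observed] * np.log(mix[observed])).sum()))
        # E-step: responsibility of each aspect for each pair.
        resp = joint / np.maximum(mix, tiny)[:, :, None]
        weighted = counts[:, :, None] * resp
        # M-step: both slots of a pair update the same tied conditional.
        aspect_mass = weighted.sum(axis=(0, 1))
        prior = aspect_mass / total
        cond = (weighted.sum(axis=1) + weighted.sum(axis=0)) / (
            2.0 * np.maximum(aspect_mass, tiny)
        )
    return prior, cond, log_liks
```

Under these assumptions, a consensus label for object i could be read off as the aspect z that maximizes `prior[z] * cond[i, z]`, and the recorded `log_liks` could feed a model selection criterion when comparing different values of `n_aspects`.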

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lourenço, A., Fred, A., Figueiredo, M. (2011). A Generative Dyadic Aspect Model for Evidence Accumulation Clustering. In: Pelillo, M., Hancock, E.R. (eds) Similarity-Based Pattern Recognition. SIMBAD 2011. Lecture Notes in Computer Science, vol 7005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24471-1_8

  • DOI: https://doi.org/10.1007/978-3-642-24471-1_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24470-4

  • Online ISBN: 978-3-642-24471-1

  • eBook Packages: Computer Science, Computer Science (R0)
