Abstract
The SIMBAD project puts forward a unified theory of data analysis under a (dis)similarity-based object representation framework. Our work builds on the duality between probabilistic and similarity notions in pairwise object comparison. We address the Evidence Accumulation Clustering paradigm as a means of learning pairwise similarity between objects, summarized in a co-association matrix. We show the dual similarity/probabilistic interpretation of the co-association matrix and exploit both interpretations to derive coherent consensus clustering methods, either by exploring embeddings over the learned pairwise similarities, in an attempt to better highlight the clustering structure of the data, or by means of a unified probabilistic approach leading to soft assignments of objects to clusters.
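To make the notion of a co-association matrix concrete, the following is a minimal sketch, assuming the common definition in which entry (i, j) is the fraction of partitions in a clustering ensemble that place objects i and j in the same cluster. The function name `co_association` and the toy ensemble are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

def co_association(partitions):
    """Return the matrix whose entry (i, j) is the fraction of partitions
    in which objects i and j are assigned to the same cluster."""
    partitions = np.asarray(partitions)        # shape: (n_partitions, n_objects)
    n_partitions, n_objects = partitions.shape
    counts = np.zeros((n_objects, n_objects))
    for labels in partitions:
        # Boolean matrix: True where two objects share a label in this partition.
        counts += labels[:, None] == labels[None, :]
    return counts / n_partitions

# Hypothetical ensemble: three partitions of five objects (labels are arbitrary).
ensemble = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 2],
]
C = co_association(ensemble)
```

Under this definition each entry lies in [0, 1], so it can be read either as a learned pairwise similarity or as an empirical estimate of the probability that two objects co-occur in a cluster, which is the dual reading the abstract refers to.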
Notes
1. Technically, these distances are computed along a graph formed by connecting all k-nearest neighbors.
© 2013 Springer-Verlag London
About this chapter
Cite this chapter
Fred, A.L.N. et al. (2013). Learning Similarities from Examples Under the Evidence Accumulation Clustering Paradigm. In: Pelillo, M. (ed.) Similarity-Based Pattern Analysis and Recognition. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-5628-4_5
DOI: https://doi.org/10.1007/978-1-4471-5628-4_5
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5627-7
Online ISBN: 978-1-4471-5628-4
eBook Packages: Computer Science (R0)