Abstract
Clustering is a crucial step in scientific data analysis and in engineering systems, and designing an efficient cluster analysis method remains a key challenge. In this paper, we introduce a general-purpose exemplar-based clustering method called MEGA, which performs a novel message-passing strategy based on variational expectation–maximization and generalized arc-consistency techniques. Unlike existing message-passing clustering methods, MEGA formulates the message-passing schema as the E- and M-steps of variational expectation–maximization on a reparameterized factor graph. It also exploits an adaptive variant of the generalized arc-consistency technique to perform a variational mean-field approximation in the E-step, minimizing a Kullback–Leibler divergence on the model evidence. Unlike density-based clustering methods, MEGA is not sensitive to initial parameters, and in contrast to partition-based clustering methods, it does not require pre-specifying the number of clusters. We focus on a binary-variable factor graph to model the clustering problem, but MEGA is applicable to other graphical models in general. Our experiments on real-world problems demonstrate the efficiency of MEGA over prominent existing clustering algorithms such as affinity propagation, agglomerative clustering, DBSCAN, K-means, and EM.
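The mean-field E-step mentioned above can be illustrated with a generic coordinate-ascent sketch for binary variables. This is a textbook mean-field update under assumed pairwise log-potentials `W` and unary log-potentials `b`, not the MEGA update itself; the function name and parameters are ours for illustration.

```python
import numpy as np

def mean_field_binary(W, b, iters=100, tol=1e-6):
    """Coordinate-ascent mean-field for binary variables x_i in {0, 1}.

    W: symmetric (n, n) matrix of pairwise log-potentials (zero diagonal)
    b: length-n vector of unary log-potentials
    Returns q, where q[i] approximates the marginal p(x_i = 1).
    """
    n = len(b)
    q = np.full(n, 0.5)  # uniform initialization of q_i(x_i = 1)
    for _ in range(iters):
        q_old = q.copy()
        for i in range(n):
            # Expected log-potential of x_i = 1 under the other variables'
            # current variational marginals (excluding the self-term).
            m = b[i] + W[i] @ q - W[i, i] * q[i]
            q[i] = 1.0 / (1.0 + np.exp(-m))  # sigmoid normalization
        if np.max(np.abs(q - q_old)) < tol:  # mean-field fixed point reached
            break
    return q
```

Each sweep decreases the KL divergence between the factorized distribution and the true posterior, which is the role the E-step plays in the variational scheme described above.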
Notes
We say that a factor node has a deterministic dependency if at least one of its tuples has zero probability.
Note that the local factors have been summed since we use a log-domain formulation of the objective function.
Note that, by Jensen’s inequality, each update step that minimizes the Kullback–Leibler divergence also maximizes the lower bound on the model evidence (cf. Beal and Ghahramani 2003).
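The identity behind this note can be stated explicitly. Writing $\mathbf{z}$ for the latent assignments and $q$ for the variational distribution (generic symbols, not necessarily the paper's notation), the log evidence decomposes as

$$
\log p(\mathbf{x}) \;=\; \underbrace{\mathbb{E}_{q(\mathbf{z})}\!\left[\log \frac{p(\mathbf{x},\mathbf{z})}{q(\mathbf{z})}\right]}_{\mathcal{L}(q)} \;+\; \mathrm{KL}\!\left(q(\mathbf{z})\,\big\|\,p(\mathbf{z}\mid\mathbf{x})\right).
$$

Since $\log p(\mathbf{x})$ does not depend on $q$, any update that decreases the KL term necessarily increases the lower bound $\mathcal{L}(q)$ by the same amount.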
Publicly available: http://archive.ics.uci.edu/ml/datasets/diabetes+130-us+hospitals+for+years+1999-2008.
Publicly available at: http://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones.
Publicly available: http://archive.ics.uci.edu/ml/datasets/Wall-Following+Robot+Navigation+Data.
Publicly available: http://konect.uni-koblenz.de/networks/ucidata-zachary.
Note that the Jaccard distance satisfies all conditions of the distance measure, including the triangle inequality.
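For concreteness, the Jaccard distance between two finite sets is $d(A,B) = 1 - |A \cap B| / |A \cup B|$; a minimal sketch (the helper name and the empty-set convention are ours):

```python
def jaccard_distance(a, b):
    """Jaccard distance: 1 - |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    if not a and not b:  # convention: two empty sets are identical
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

# e.g. jaccard_distance({1, 2, 3}, {2, 3, 4}) gives 1 - 2/4 = 0.5,
# and d(A, C) <= d(A, B) + d(B, C) holds for any sets A, B, C.
```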
Publicly available at: http://scikit-learn.org/stable/modules/clustering.html.
Publicly available: https://github.com/bnpy/bnpy.
References
Ahmadi B, Kersting K, Mladenov M, Natarajan S (2013) Exploiting symmetries for scaling loopy belief propagation and relational training. Mach Learn 92(1):91–132
Anguita D, Ghio A, Oneto L, Parra X, Reyes-Ortiz JL (2013) A public domain dataset for human activity recognition using smartphones. In: 21th European symposium on artificial neural networks, computational intelligence and machine learning, ESANN
Arthur D, Vassilvitskii S (2007) k-means++: the advantages of careful seeding. In: Proceedings of the eighteenth annual ACM-SIAM symposium on discrete algorithms. Society for Industrial and Applied Mathematics, pp 1027–1035
Beal MJ, Ghahramani Z (2003) The variational Bayesian EM algorithm for incomplete data: with application to scoring graphical model structures. Bayesian Stat 7:453–464
Berkhin P (2006) A survey of clustering data mining techniques. In: Grouping multidimensional data. Springer, Berlin, pp 25–71
Cannistraci CV, Ravasi T, Montevecchi FM, Ideker T, Alessio M (2010) Nonlinear dimension reduction and clustering by minimum curvilinearity unfold neuropathic pain and tissue embryological classes. Bioinformatics 26(18):i531–i539
Cheeseman PC, Stutz JC (1996) Bayesian classification (autoclass): theory and results. In: Advances in knowledge discovery and data mining, CA, USA, pp 153–180
Comaniciu D, Meer P (2002) Mean shift: a robust approach toward feature space analysis. IEEE Trans Pattern Anal Mach Intell 24(5):603–619
Dalli A (2003) Adaptation of the f-measure to cluster based lexicon quality evaluation. In: Proceedings of the EACL 2003 Workshop on Evaluation Initiatives in Natural Language Processing: are evaluation methods, metrics and resources reusable? Association for Computational Linguistics, pp 51–56
Danon L, Diaz-Guilera A, Duch J, Arenas A (2005) Comparing community structure identification. J Stat Mech Theory Exp. https://doi.org/10.1088/1742-5468/2005/09/P09008
Day WH, Edelsbrunner H (1984) Efficient algorithms for agglomerative hierarchical clustering methods. J Classif 1(1):7–24
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodological) 39:1–38
Elidan G, McGraw I, Koller D (2006) Residual belief propagation: informed scheduling for asynchronous message passing. In: Proceedings of the twenty-second conference annual conference on uncertainty in artificial intelligence (UAI-06). AUAI Press, Arlington, Virginia, pp 165–173
Ester M, Kriegel HP, Sander J, Xu X et al (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: The second international conference on knowledge discovery and data mining, vol 96, pp 226–231
Fraley C, Raftery AE (1998) How many clusters? Which clustering method? Answers via model-based cluster analysis. Comput J 41(8):578–588
Frey BJ, Dueck D (2007) Clustering by passing messages between data points. Science 315(5814):972–976
Fujiwara Y, Irie G, Kitahara T et al (2011) Fast algorithm for affinity propagation. In: IJCAI proceedings-international joint conference on artificial intelligence, vol 22(3), p 2238
Givoni IE (2012) Beyond affinity propagation: message passing algorithms for clustering. PhD thesis, University of Toronto
Givoni I, Frey B (2009a) Semi-supervised affinity propagation with instance-level constraints. In: Artificial intelligence and statistics, pp 161–168
Givoni IE, Frey BJ (2009b) A binary variable model for affinity propagation. Neural Comput 21(6):1589–1600
Givoni IE, Chung C, Frey BJ (2011) Hierarchical affinity propagation. In: Proceedings of the twenty-seventh conference on uncertainty in artificial intelligence. AUAI Press, Cambridge, pp 238–246
Halkidi M, Batistakis Y, Vazirgiannis M (2001) On clustering validation techniques. J Intell Inf Syst 17(2–3):107–145
Hastie T, Tibshirani R (1996) Discriminant analysis by Gaussian mixtures. J R Stat Soc Ser B (Methodological) 58:155–176
Heskes T (2004) On the uniqueness of loopy belief propagation fixed points. Neural Comput 16(11):2379–2413
Horsch MC, Havens WS (2000) Probabilistic arc consistency: a connection between constraint reasoning and probabilistic reasoning. In: Proceedings of the sixteenth conference on uncertainty in artificial intelligence, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc, pp 282–290
Ibrahim MH, Pal C, Pesant G (2017) Improving probabilistic inference in graphical models with determinism and cycles. Mach Learn 106(1):1–54
Jamshidian M, Jennrich RI (1997) Acceleration of the EM algorithm by using quasi-Newton methods. J R Stat Soc Ser B (Stat Methodol) 59(3):569–587
Jiang B, Pei J, Tao Y, Lin X (2013) Clustering uncertain data based on probability distribution similarity. IEEE Trans Knowl Data Eng 25(4):751–763
Jiang Y, Liao Y, Yu G (2016) Affinity propagation clustering using path based similarity. Algorithms 9(3):46
Koller D, Friedman N (2009) Probabilistic graphical models: principles and techniques. MIT Press, Cambridge
Lam D, Wunsch DC (2014) Clustering. In: Academic Press library in signal processing, vol 1. Elsevier, Amsterdam, pp 1115–1149
Lashkari D, Golland P (2008) Convex clustering with exemplar-based models. In: Advances in neural information processing systems, pp 825–832
Leone M, Weigt M (2007) Clustering by soft-constraint affinity propagation: applications to gene-expression data. Bioinformatics 23(20):2708–2715
Lichman M (2013) UCI machine learning repository. http://archive.ics.uci.edu/ml
Mai ST, Assent I, Jacobsen J, Dieu MS (2018) Anytime parallel density-based clustering. Data Min Knowl Disc, pp 1–56
McLachlan G, Krishnan T (2007) The EM algorithm and extensions, vol 382. Wiley, New York
Mooij JM, Kappen HJ (2005) Sufficient conditions for convergence of loopy belief propagation. In: Proceedings of the twenty-first conference on uncertainty in artificial intelligence, UAI’05, pp. 396–403. AUAI Press, Arlington, Virginia, USA. http://dl.acm.org/citation.cfm?id=3020336.3020386
Murphy K, Weiss Y, Jordan M (1999) Loopy belief propagation for approximate inference: an empirical study. In: Proceedings of the fifteenth conference annual conference on uncertainty in artificial intelligence (UAI-99), Stockholm, Sweden. Morgan Kaufmann, pp 467–476
Neal RM, Hinton GE (1999) A view of the EM algorithm that justifies incremental, sparse, and other variants. In: Learning in graphical models. MIT Press, Cambridge, pp 355–368
Ng AY, Jordan MI, Weiss Y (2002) On spectral clustering: analysis and an algorithm. In: Advances in neural information processing systems. MIT Press, Cambridge, pp 849–856
Nguyen DT, Chen L, Chan CK (2012) Clustering with multiviewpoint-based similarity measure. IEEE Trans Knowl Data Eng 24(6):988–1001
Pearl J (1988) Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, Burlington
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
Petersen KB, Winther O, Hansen LK (2005) On the slow convergence of EM and VBEM in low-noise linear models. Neural Comput 17(9):1921–1926
Potetz B (2007) Efficient belief propagation for vision using linear constraint nodes. In: Proceeding of IEEE conference on computer vision and pattern recognition (CVPR’07), IEEE computer society, Minneapolis, MN, USA, pp 1–8
Rasmussen CE (2000) The infinite Gaussian mixture model. In: Advances in neural information processing systems, pp. 554–560
Rawashdeh A, Ralescu AL (2015) Similarity measure for social networks—A brief survey. In: Proceedings of the 26th modern AI and cognitive science conference 2015, Greensboro, NC, USA, 25–26 April 2015, pp 153–159
Roosta T, Wainwright MJ, Sastry SS (2008) Convergence analysis of reweighted sum-product algorithms. IEEE Trans Signal Process 56(9):4293–4305
Rossi F, Van Beek P, Walsh T (2006) Handbook of constraint programming. Elsevier, Amsterdam
Ruiz C, Spiliopoulou M, Menasalvas E (2010) Density-based semi-supervised clustering. Data Min Knowl Disc 21(3):345–370
Sander J, Ester M, Kriegel HP, Xu X (1998) Density-based clustering in spatial databases: the algorithm GDBSCAN and its applications. Data Min Knowl Disc 2(2):169–194
Saxena A, Prasad M, Gupta A, Bharill N, Patel OP, Tiwari A, Er MJ, Ding W, Lin CT (2017) A review of clustering techniques and developments. Neurocomputing 267:664–681
Shang F, Jiao L, Shi J, Wang F, Gong M (2012) Fast affinity propagation clustering: a multilevel approach. Pattern Recogn 45(1):474–486
Singla P, Nath A, Domingos P (2010) Approximate lifted belief propagation. In: Proceedings of the twenty-fourth AAAI conference on artificial intelligence, Atlanta, Georgia, USA, 11–15 July 2010. AAAI Press, pp 92–97
Strack B, DeShazo JP, Gennings C, Olmo JL, Ventura S, Cios KJ, Clore JN (2014) Impact of HbA1c measurement on hospital readmission rates: analysis of 70,000 clinical database patient records. BioMed Res Int 2014
Sun L, Guo C (2014) Incremental affinity propagation clustering based on message passing. IEEE Trans Knowl Data Eng 26(11):2731–2744
Tarlow D, Zemel RS, Frey BJ (2008) Flexible priors for exemplar-based clustering. In: Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence. AUAI Press, pp 537–545
Teh YW, Jordan MI, Beal MJ, Blei DM (2005) Sharing clusters among related groups: hierarchical Dirichlet processes. In: Saul LK, Weiss Y, Bottou L (eds) Advances in neural information processing systems, vol 17. MIT Press, Cambridge, pp 1385–1392
Wang CD, Lai JH, Suen CY, Zhu JY (2013) Multi-exemplar affinity propagation. IEEE Trans Pattern Anal Mach Intell 35(9):2223–2237
Weiss Y (1997) Belief propagation and revision in networks with loops. Technical Report
Winn JM, Bishop CM (2005) Variational message passing. J Mach Learn Res 6:661–694
Wu CJ (1983) On the convergence properties of the EM algorithm. Ann Stat 11:95–103
Xu X, Ester M, Kriegel HP, Sander J (1998) A distribution-based clustering algorithm for mining in large spatial databases. In: 14th international conference on data engineering, 1998. Proceedings IEEE, pp 324–331
Yang Y, Chu X, Liang F, Huang TS (2012) Pairwise exemplar clustering. In: Twenty-sixth AAAI conference on artificial intelligence
Yedidia J, Freeman W, Weiss Y (2005) Constructing free-energy approximations and generalized belief propagation algorithms. IEEE Trans Inf Theory 51(7):2282–2312
Yu J, Jia C (2009) Convergence analysis of affinity propagation. In: International conference on knowledge science, engineering and management. Springer, Berlin, pp 54–65
Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33(4):452–473
Zhang X, Furtlehner C, Germain-Renaud C, Sebag M (2014) Data stream clustering with affinity propagation. IEEE Trans Knowl Data Eng 26(7):1644–1656
Zopf M, Mencía EL, Fürnkranz J (2016) Sequential clustering and contextual importance measures for incremental update summarization. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers, pp 1071–1082
Acknowledgements
We acknowledge the Natural Sciences and Engineering Research Council of Canada (NSERC) for the financial support of this work.
Additional information
Responsible editor: Fei Wang
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Ibrahim, M.H., Missaoui, R. An exemplar-based clustering using efficient variational message passing. Data Min Knowl Disc 35, 248–289 (2021). https://doi.org/10.1007/s10618-020-00720-w