Condorcet’s Jury Theorem for Consensus Clustering

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11117)

Abstract

Condorcet’s Jury Theorem has been invoked for ensemble classifiers to indicate that the combination of many classifiers can have better predictive performance than a single classifier. Such a theoretical underpinning is unknown for consensus clustering. This article extends Condorcet’s Jury Theorem to the mean partition approach under the additional assumptions that a unique but unknown ground-truth partition exists and sample partitions are drawn from a sufficiently small ball containing the ground-truth.


Notes

  1.

    The support of Q is the smallest closed subset \({\mathcal {S}}_Q \subseteq {\mathcal {P}}\) such that \(Q({\mathcal {S}}_Q) = 1\).

  2.

    Recall that a mean partition is not unique in general.

References

  1. Berend, D., Paroush, J.: When is Condorcet's jury theorem valid? Soc. Choice Welf. 15(4), 481–488 (1998)
  2. Bhattacharya, A., Bhattacharya, R.: Nonparametric Inference on Manifolds with Applications to Shape Spaces. Cambridge University Press, Cambridge (2012)
  3. Bredon, G.E.: Introduction to Compact Transformation Groups. Elsevier, New York (1972)
  4. de Condorcet, N.C.: Essai sur l'application de l'analyse à la probabilité des décisions rendues à la pluralité des voix. Imprimerie Royale, Paris (1785)
  5. Dimitriadou, E., Weingessel, A., Hornik, K.: A combination scheme for fuzzy clustering. In: Advances in Soft Computing (2002)
  6. Dietterich, T.G.: Ensemble methods in machine learning. In: Kittler, J., Roli, F. (eds.) MCS 2000. LNCS, vol. 1857, pp. 1–15. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-45014-9_1
  7. Domeniconi, C., Al-Razgan, M.: Weighted cluster ensembles: methods and analysis. ACM Trans. Knowl. Discov. Data 2(4), 1–40 (2009)
  8. Dryden, I.L., Mardia, K.V.: Statistical Shape Analysis. Wiley, Hoboken (1998)
  9. Feragen, A., Lo, P., de Bruijne, M., Nielsen, M., Lauze, F.: Toward a theory of statistical tree-shape analysis. IEEE Trans. Pattern Anal. Mach. Intell. 35, 2008–2021 (2013)
  10. Filkov, V., Skiena, S.: Integrating microarray data by consensus clustering. Int. J. Artif. Intell. Tools 13(4), 863–880 (2004)
  11. Franek, L., Jiang, X.: Ensemble clustering by means of clustering embedding in vector spaces. Pattern Recognit. 47(2), 833–842 (2014)
  12. Fréchet, M.: Les éléments aléatoires de nature quelconque dans un espace distancié. Annales de l'Institut Henri Poincaré 10, 215–310 (1948)
  13. Ginestet, C.E.: Strong consistency of Fréchet sample mean sets for graph-valued random variables. arXiv:1204.3183 (2012)
  14. Ghaemi, R., Sulaiman, N., Ibrahim, H., Mustapha, N.: A survey: clustering ensembles techniques. Proc. World Acad. Sci. Eng. Technol. 38, 644–657 (2009)
  15. Gionis, A., Mannila, H., Tsaparas, P.: Clustering aggregation. ACM Trans. Knowl. Discov. Data 1(1), 341–352 (2007)
  16. Grofman, B., Owen, G., Feld, S.L.: Thirteen theorems in search of the truth. Theory Decis. 15(3), 261–278 (1983)
  17. Huckemann, S., Hotz, T., Munk, A.: Intrinsic shape analysis: geodesic PCA for Riemannian manifolds modulo isometric Lie group actions. Statistica Sinica 20, 1–100 (2010)
  18. Jain, B.J., Obermayer, K.: Structure spaces. J. Mach. Learn. Res. 10, 2667–2714 (2009)
  19. Jain, B.J.: Geometry of graph edit distance spaces. arXiv:1505.08071 (2015)
  20. Jain, B.J.: Asymptotic behavior of mean partitions in consensus clustering. arXiv:1512.06061 (2015)
  21. Jain, B.J.: Statistical analysis of graphs. Pattern Recognit. 60, 802–812 (2016)
  22. Jain, B.J.: Homogeneity of cluster ensembles. arXiv:1602.02543 (2016)
  23. Jain, B.J.: The mean partition theorem of consensus clustering. arXiv:1604.06626 (2016)
  24. Kendall, D.G.: Shape manifolds, procrustean metrics, and complex projective spaces. Bull. Lond. Math. Soc. 16, 81–121 (1984)
  25. Kuncheva, L.I.: Combining Pattern Classifiers: Methods and Algorithms. Wiley, Hoboken (2004)
  26. Lam, L., Suen, C.Y.: Application of majority voting to pattern recognition: an analysis of its behavior and performance. IEEE Trans. Syst. Man Cybern. Part A: Syst. Hum. 27(5), 553–568 (1997)
  27. Li, T., Ding, C., Jordan, M.I.: Solving consensus and semi-supervised clustering problems using nonnegative matrix factorization. In: IEEE International Conference on Data Mining (2007)
  28. Marron, J.S., Alonso, A.M.: Overview of object oriented data analysis. Biom. J. 56(5), 732–753 (2014)
  29. Polikar, R.: Ensemble learning. Scholarpedia 4(1), 2776 (2009)
  30. Ratcliffe, J.G.: Foundations of Hyperbolic Manifolds. Springer, New York (2006). https://doi.org/10.1007/978-0-387-47322-2
  31. Rokach, L.: Ensemble-based classifiers. Artif. Intell. Rev. 33(1–2), 1–39 (2010)
  32. Strehl, A., Ghosh, J.: Cluster ensembles - a knowledge reuse framework for combining multiple partitions. J. Mach. Learn. Res. 3, 583–617 (2002)
  33. Surowiecki, J.: The Wisdom of Crowds. Anchor, New York (2005)
  34. Topchy, A.P., Jain, A.K., Punch, W.: Clustering ensembles: models of consensus and weak partitions. IEEE Trans. Pattern Anal. Mach. Intell. 27(12), 1866–1881 (2005)
  35. Vega-Pons, S., Correa-Morris, J., Ruiz-Shulcloper, J.: Weighted partition consensus via kernels. Pattern Recognit. 43(8), 2712–2724 (2010)
  36. Vega-Pons, S., Ruiz-Shulcloper, J.: A survey of clustering ensemble algorithms. Int. J. Pattern Recognit. Artif. Intell. 25(3), 337–372 (2011)
  37. Waldron, J.: The wisdom of the multitude: some reflections on Book III, Chapter 11 of the Politics. Polit. Theory 23, 563–584 (1995)
  38. Wang, H., Marron, J.S.: Object oriented data analysis: sets of trees. Ann. Stat. 35, 1849–1873 (2007)
  39. Yang, F., Li, X., Li, Q., Li, T.: Exploring the diversity in cluster ensemble generation: random sampling and random projection. Expert Syst. Appl. 41(10), 4844–4866 (2014)
  40. Zhou, Z.: Ensemble Methods: Foundations and Algorithms. Taylor & Francis, Abingdon (2012)


Author information

Correspondence to Brijnesh Jain.

A Proof of Theorem 2

To prove Theorem 2, it is helpful to use a suitable representation of partitions. We suggest representing partitions as points of a geometric space, called an orbit space [20]. Orbit spaces are well explored, possess a rich geometric structure, and have a natural connection to Euclidean spaces [3, 19, 30].

1.1 A.1 Partition Spaces

We denote the natural projection that sends matrices to the partitions they represent by

$$ \pi : {\mathcal {X}} \rightarrow {\mathcal {P}}, \quad {\varvec{X}} \mapsto \pi ({\varvec{X}}) = X. $$

The group \(\varPi = \varPi ^\ell \) of all (\(\ell \times \ell \))-permutation matrices is a discontinuous group that acts on \({\mathcal {X}}\) by matrix multiplication, that is

$$ \cdot : \varPi \times {\mathcal {X}} \rightarrow {\mathcal {X}}, \quad ({\varvec{P}}, {\varvec{X}}) \mapsto {\varvec{PX}}. $$

The orbit of \({\varvec{X}} \in {\mathcal {X}}\) is the set \(\mathop {\left[ {\varvec{X}} \right] } = \mathop {\left\{ {\varvec{PX}} \,:\, {\varvec{P}} \in \varPi \right\} }\). The orbit space of partitions is the quotient space \({\mathcal {X}}/\varPi = \mathop {\left\{ \mathop {\left[ {\varvec{X}} \right] } \,:\, {\varvec{X}} \in {\mathcal {X}} \right\} }\) obtained by the action of the permutation group \(\varPi \) on the set \({\mathcal {X}}\). We write \({\mathcal {P}} = {\mathcal {X}}/\varPi \) to denote the partition space and \(X \in {\mathcal {P}}\) to denote an orbit \([{\varvec{X}}] \in {\mathcal {X}}/\varPi \), so that the natural projection takes the form \(\pi ({\varvec{X}}) = \mathop {\left[ {\varvec{X}} \right] }\). The partition space \({\mathcal {P}}\) is endowed with the intrinsic metric \(\delta \) defined by \(\delta (X, Y) = \min \mathop {\left\{ \Vert {{\varvec{X}} - {\varvec{Y}}}\Vert \,:\, {\varvec{X}} \in X, {\varvec{Y}} \in Y \right\} }\).
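To make the intrinsic metric concrete, here is a minimal brute-force sketch in Python. It assumes partitions are stored as one-hot cluster-membership matrices (rows index the \(\ell \) clusters, columns the data points), and the helper name `partition_metric` is hypothetical; enumerating all \(\ell !\) permutations is feasible only for small \(\ell \).

```python
import itertools
import numpy as np

def partition_metric(X, Y):
    """Intrinsic metric delta(X, Y) = min over representations of the
    two orbits, equivalently min over permutation matrices P of
    ||X - P Y|| (Frobenius norm). Brute force over all l! permutations."""
    l = X.shape[0]
    return min(np.linalg.norm(X - Y[list(p), :])   # Y[p, :] equals P Y
               for p in itertools.permutations(range(l)))

# Two matrices that differ only by a row permutation represent the same
# partition, so their intrinsic distance is zero.
X = np.array([[1., 1., 0., 0.],
              [0., 0., 1., 1.]])
Y = X[[1, 0], :]                 # same partition, clusters relabelled
Z = np.array([[1., 0., 1., 0.],  # a genuinely different partition
              [0., 1., 0., 1.]])
```

Minimizing over row permutations rather than over raw matrices is exactly what makes \(\delta \) well defined on orbits instead of on representations.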

1.2 A.2 Dirichlet Fundamental Domains

We use the following notation: by \(\overline{{\mathcal {U}}}\) we denote the closure of a subset \({\mathcal {U}} \subseteq {\mathcal {X}}\), by \(\partial {\mathcal {U}}\) the boundary of \({\mathcal {U}}\), and by \({\mathcal {U}}^\circ \) the open subset \(\overline{{\mathcal {U}}} \setminus \partial {\mathcal {U}}\). The action of a permutation \({\varvec{P}} \in \varPi \) on the subset \({\mathcal {U}}\subseteq {\mathcal {X}}\) is the set defined by \({\varvec{P}}\,{\mathcal {U}} = \mathop {\left\{ {\varvec{PX}} \, :\, {\varvec{X}} \in {\mathcal {U}} \right\} }\). By \(\varPi ^* = \varPi \setminus \mathop {\left\{ {\varvec{I}} \right\} }\) we denote the set of (\(\ell \times \ell \))-permutation matrices without the identity matrix \({\varvec{I}}\).

A subset \({\mathcal {F}}\) of \({\mathcal {X}}\) is a fundamental set for \(\varPi \) if and only if \({\mathcal {F}}\) contains exactly one representation \({\varvec{X}}\) from each orbit \(\mathop {\left[ {\varvec{X}} \right] } \in {\mathcal {X}}/\varPi \). A fundamental domain of \(\varPi \) in \({\mathcal {X}}\) is a closed connected set \({\mathcal {F}} \subseteq {\mathcal {X}}\) that satisfies

  1.

    \(\displaystyle {\mathcal {X}} = \bigcup _{{\varvec{P}} \in \varPi } {\varvec{P}}{\mathcal {F}}\)

  2.

    \({\varvec{P}} {\mathcal {F}}^\circ \cap {\mathcal {F}}^\circ = \emptyset \) for all \({\varvec{P}} \in \varPi ^*\).

Proposition 1

Let \({\varvec{Z}}\) be a representation of an asymmetric partition \(Z \in {\mathcal {P}}\). Then

$$ {\mathcal {D}}_{{\varvec{Z}}} = \mathop {\left\{ {\varvec{X}} \in {\mathcal {X}} \,:\, \Vert {{\varvec{X}} - {\varvec{Z}}}\Vert \le \Vert {{\varvec{X}} - {\varvec{PZ}}}\Vert \text { for all }{\varvec{P}} \in \varPi \right\} } $$

is a fundamental domain, called Dirichlet fundamental domain of \({\varvec{Z}}\).

Proof

[30], Theorem 6.6.13.    \(\square \)
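The defining inequality of \({\mathcal {D}}_{{\varvec{Z}}}\) translates directly into a membership test. The sketch below is a hypothetical helper, again assuming one-hot cluster-membership matrices and brute force over all row permutations:

```python
import itertools
import numpy as np

def in_dirichlet_domain(X, Z):
    """Test X in D_Z, i.e. ||X - Z|| <= ||X - P Z|| for every
    permutation matrix P (small tolerance for float round-off)."""
    l = Z.shape[0]
    d0 = np.linalg.norm(X - Z)
    return all(d0 <= np.linalg.norm(X - Z[list(p), :]) + 1e-12
               for p in itertools.permutations(range(l)))

# Z represents an asymmetric partition: its clusters have unequal sizes,
# so no non-trivial permutation maps the matrix onto itself.
Z = np.array([[1., 1., 1., 0.],
              [0., 0., 0., 1.]])
```

For instance, `Z` itself lies in its own Dirichlet domain, while the relabelled copy `Z[[1, 0], :]` does not, since the swap permutation brings it strictly closer to `Z`.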

Lemma 1

Let \({\mathcal {D}}_{{\varvec{Z}}}\) be a Dirichlet fundamental domain of representation \({\varvec{Z}}\) of an asymmetric partition \(Z \in {\mathcal {P}}\). Suppose that \({\varvec{X}}\) and \({\varvec{X}}'\) are two different representations of a partition X such that \({\varvec{X}}, {\varvec{X}}' \in {\mathcal {D}}_{{\varvec{Z}}}\). Then \({\varvec{X}}, {\varvec{X}}' \in \partial {\mathcal {D}}_{{\varvec{Z}}}\).

Proof

[19], Prop. 3.13 and [22], Prop. A.2.    \(\square \)

1.3 A.3 Multiple Alignments

Let \({\mathcal {S}}_n = \mathop {\left( X_1, \ldots , X_n \right) }\) be a sample of n partitions \(X_i \in {\mathcal {P}}\). A multiple alignment of \({\mathcal {S}}_n\) is an n-tuple \(\mathfrak {X} = \mathop {\left( {\varvec{X}}_{1}, \ldots , {\varvec{X}}_{n} \right) }\) consisting of representations \({\varvec{X}}_{i}\in X_{i}\). By

$$ X_1 \times \cdots \times X_n $$

we denote the set of all multiple alignments of \({\mathcal {S}}_n\). A multiple alignment \(\mathfrak {X}\) is said to be in optimal position with representation \({\varvec{Z}}\) of a partition Z if all representations \({\varvec{X}}_{i}\) of \(\mathfrak {X}\) are in optimal position with \({\varvec{Z}}\). The mean of a multiple alignment \(\mathfrak {X}\) is denoted by

$$\begin{aligned} {\varvec{M}}_{\mathfrak {X}} = \frac{1}{n} \sum _{i=1}^n {\varvec{X}}_{i}. \end{aligned}$$

An optimal multiple alignment is a multiple alignment that minimizes the function

$$\begin{aligned} f_n\!\mathop {\left( \mathfrak {X} \right) } = \frac{1}{n^2}\sum _{i=1}^n \sum _{j=1}^n \Vert {{\varvec{X}}_{i} - {\varvec{X}}_{j}}\Vert ^2. \end{aligned}$$

The problem of finding an optimal multiple alignment is that of finding a multiple alignment with smallest average pairwise squared distance in \({\mathcal {X}}\). To show the equivalence between mean partitions and optimal multiple alignments, we introduce the sets of minimizers of the respective functions \(F_n\) and \(f_n\):

$$ {\mathcal {M}}\mathop {\left( F_n \right) } = \mathop {\arg \min }_{X \in {\mathcal {P}}} \, F_n(X), \qquad {\mathcal {M}}\mathop {\left( f_n \right) } = \mathop {\arg \min }_{\mathfrak {X}} \, f_n\!\mathop {\left( \mathfrak {X} \right) }. $$

For a given sample \({\mathcal {S}}_n\), the set \({\mathcal {M}}(F_n)\) is the mean partition set and \({\mathcal {M}}(f_n)\) is the set of all optimal multiple alignments. The next result shows that any solution of \(F_n\) is also a solution of \(f_n\) and vice versa.

Theorem 3

For any sample \({\mathcal {S}}_n \in {\mathcal {P}}^n\), the map

$$ {\mathcal {M}}\mathop {\left( f_n \right) } \rightarrow {\mathcal {M}}\mathop {\left( F_n \right) }, \quad \mathfrak {X} \mapsto \pi \mathop {\left( {\varvec{M}}_{\mathfrak {X}} \right) } $$

is surjective.

Proof

[23], Theorem 4.1.    \(\square \)
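The constructive idea behind Theorem 3 can be sketched numerically: put every sample representation in optimal position with a common reference \({\varvec{Z}}\) and average. The helper names below are hypothetical, partitions are assumed to be one-hot cluster-membership matrices, and note that aligning to a common reference yields an optimal multiple alignment only under small-ball assumptions like those of Theorem 2, not in general.

```python
import itertools
import numpy as np

def align_to(X, Z):
    """Representation P X of the partition [X] in optimal position with
    Z, i.e. the row permutation of X minimizing ||P X - Z||."""
    l = X.shape[0]
    return min((X[list(p), :] for p in itertools.permutations(range(l))),
               key=lambda A: np.linalg.norm(A - Z))

def mean_of_alignment(sample, Z):
    """Mean M = (1/n) sum_i X_i of the multiple alignment obtained by
    putting each sample partition in optimal position with Z."""
    aligned = [align_to(X, Z) for X in sample]
    return sum(aligned) / len(aligned)

Z = np.array([[1., 1., 0., 0.],
              [0., 0., 1., 1.]])
# Three copies of the same partition under different cluster labelings
# collapse back onto the reference representation after alignment.
sample = [Z, Z[[1, 0], :], Z[[1, 0], :]]
M = mean_of_alignment(sample, Z)
```

Without the alignment step, naively averaging the raw matrices would mix incompatible labelings and the mean would no longer represent the consensus partition.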

1.4 A.4 Proof of Theorem 2

Parts 1–8 show the assertion of Eq. (2) and Part 9 shows the assertion of Eq. (3).

1 Without loss of generality, we pick a representation \({\varvec{X}}_{*}\) of the ground-truth partition \(X_*\). Let \({\varvec{Z}}\) be a representation of Z in optimal position with \({\varvec{X}}_{*}\). By

$$ {\mathcal {A}}_{{\varvec{Z}}} = \mathop {\left\{ {\varvec{X}} \in {\mathcal {X}} \,:\, \Vert {{\varvec{X}}-{\varvec{Z}}}\Vert \le \alpha _Z/4 \right\} } $$

we denote the asymmetry ball of representation \({\varvec{Z}}\). By construction, we have \({\varvec{X}}_{*} \in {\mathcal {A}}_{{\varvec{Z}}}\).

2 Since \(\varPi \) acts discontinuously on \({\mathcal {X}}\), there is a bijective isometry

$$ \phi :{\mathcal {A}}_{{\varvec{Z}}} \rightarrow {\mathcal {A}}_Z, \quad {\varvec{X}} \mapsto \pi ({\varvec{X}}) $$

according to [30], Theorem 13.1.1.

3 From [22], Theorem 3.1 it follows that the mean partition M of \({\mathcal {S}}_n\) is unique. We show that \(M \in {\mathcal {A}}_Z\). Suppose that \(\mathfrak {X} = \mathop {\left( {\varvec{X}}_{1}, \ldots , {\varvec{X}}_{n} \right) }\) is a multiple alignment in optimal position with \({\varvec{Z}}\). Since \(\phi :{\mathcal {A}}_{{\varvec{Z}}} \rightarrow {\mathcal {A}}_Z\) is a bijective isometry, we have

$$ f_n\!\mathop {\left( \mathfrak {X} \right) } = \frac{1}{n^2}\sum _{i=1}^n \sum _{j=1}^n \Vert {{\varvec{X}}_{i} - {\varvec{X}}_{j}}\Vert ^2 = \frac{1}{n^2}\sum _{i=1}^n \sum _{j=1}^n \delta \mathop {\left( X_i, X_j \right) }^2, $$

showing that the multiple alignment \(\mathfrak {X}\) is optimal. From Theorem 3 it follows that

$$ {\varvec{M}} = {\varvec{M}}_{\mathfrak {X}} = \frac{1}{n} \sum _{i=1}^n {\varvec{X}}_{i} $$

is a representation of a mean partition M of \({\mathcal {S}}_n\). Since \({\mathcal {A}}_{{\varvec{Z}}}\) is convex, we find that \({\varvec{M}} \in {\mathcal {A}}_{{\varvec{Z}}}\) and therefore \(M \in {\mathcal {A}}_Z\).

4 From Parts 1–3 of this proof it follows that the multiple alignment \(\mathfrak {X}\) is in optimal position with \({\varvec{X}}_{*}\). We show that there is no other multiple alignment of \({\mathcal {S}}_n\) with this property. Observe that \({\mathcal {A}}_{{\varvec{Z}}}\) is contained in the Dirichlet fundamental domain \({\mathcal {D}}_{{\varvec{Z}}}\) of representation \({\varvec{Z}}\). Let \({\mathcal {S}}_{{\varvec{Z}}} = \phi ^{-1}({\mathcal {S}}_Q)\) be the representation of the support \({\mathcal {S}}_Q\) in \({\mathcal {A}}_{{\varvec{Z}}}\). Then by assumption, we have \({\mathcal {S}}_{{\varvec{Z}}} \subseteq {\mathcal {A}}_{{\varvec{Z}}}^\circ \subset {\mathcal {D}}_{{\varvec{Z}}}\), showing that \({\mathcal {S}}_{{\varvec{Z}}}\) lies in the interior of \({\mathcal {D}}_{{\varvec{Z}}}\). From the definition of a fundamental domain together with Lemma 1, it follows that \(\mathfrak {X}\) is the unique multiple alignment in optimal position with \({\varvec{X}}_{*}\).

5 With the same argumentation as in the previous part of this proof, we find that \({\varvec{M}}\) is the unique representation of M in optimal position with \({\varvec{X}}_{*}\).

6 Let \(z \in {\mathcal {Z}}\) be a data point. Since \({\varvec{X}}_{i} \in X_i\) is the unique representation in optimal position with \({\varvec{X}}_{*}\), the vote of \(X_i\) on data point z is of the form \(V_{X_i}(z) = V_{{\varvec{X}}_{i}}(z)\) for all \(i \in \mathop {\left\{ 1, \ldots , n \right\} }\). With the same argument, we have \(V_n(z) = V_{M}(z) = V_{{\varvec{M}}}(z)\).

7 By \({\varvec{x}}^{(i)}(z)\) we denote the column of \({\varvec{X}}_{i}\) that represents z. By definition, we have

$$ p_z = \mathbb {P}\mathop {\left( V_{X_i}(z) = 1 \right) } = \mathbb {P}\mathop {\left( \mathop {\left\langle {\varvec{x}}^{(i)}(z), {\varvec{x}}^*(z) \right\rangle } > 0.5 \right) } $$

for all \(i \in \mathop {\left\{ 1, \ldots , n \right\} }\). Since \(X_i\) and \(X_*\) are both hard partitions, we find that

$$ \mathop {\left\langle {\varvec{x}}^{(i)}(z), {\varvec{x}}^*(z) \right\rangle } = \mathbb {I}\mathop {\left\{ {\varvec{x}}^{(i)}(z) = {\varvec{x}}^*(z) \right\} }, $$

where \(\mathbb {I}\) denotes the indicator function.
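The identity above rests only on the columns being one-hot vectors; a tiny numeric check (variable names hypothetical):

```python
import numpy as np

# Columns of hard-partition matrices are one-hot cluster-membership
# vectors, so the inner product of two aligned columns is 1 exactly when
# both partitions assign the data point to the same cluster, else 0.
same = np.array([0., 1., 0.]) @ np.array([0., 1., 0.])   # same cluster
diff = np.array([0., 1., 0.]) @ np.array([1., 0., 0.])   # different cluster
```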

8 From the Mean Partition Theorem it follows that

$$ {\varvec{m}}(z) = \frac{1}{n} \sum _{i=1}^n {\varvec{x}}^{(i)}(z) $$

is the column of \({\varvec{M}}\) that represents z. Then the agreement of \({\varvec{M}}\) on z is given by

$$\begin{aligned} k_{{\varvec{M}}}(z)&= \mathop {\left\langle {\varvec{m}}(z), {\varvec{x}}^*(z) \right\rangle }\\&= \frac{1}{n}\sum _{i=1}^n \mathop {\left\langle {\varvec{x}}^{(i)}(z), {\varvec{x}}^*(z) \right\rangle }\\&= \frac{1}{n}\sum _{i=1}^n \mathbb {I}\mathop {\left\{ {\varvec{x}}^{(i)}(z) = {\varvec{x}}^*(z) \right\} }. \end{aligned}$$

Thus, the agreement \(k_{{\varvec{M}}}(z)\) counts the fraction of sample partitions \(X_i\) that correctly classify z. Let

$$ p_n = \mathbb {P}\mathop {\left( h_n(z) = 1 \right) } = \mathbb {P}\mathop {\left( k_{{\varvec{M}}}(z) > 0.5 \right) } $$

denote the probability that the majority of the sample partitions \(X_i\) correctly classifies z. Since the votes of the sample partitions are assumed to be independent, we can compute \(p_n\) using the binomial distribution

$$ p_n = \sum _{i=r}^n \left( {\begin{array}{c}n\\ i\end{array}}\right) p_z^i (1-p_z)^{n-i}, $$

where \(r = \lfloor n/2 \rfloor + 1\) and \(\lfloor a \rfloor \) denotes the largest integer b with \(b \le a\). Then the assertion of Eq. (2) follows from [16], Theorem 1.
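The binomial expression for \(p_n\) can be evaluated directly. The sketch below (function name hypothetical) also illustrates the jury-theorem effect that the majority probability grows with n whenever the per-vote accuracy exceeds 0.5:

```python
from math import comb

def majority_prob(n, p):
    """p_n = probability that a strict majority of n independent votes
    is correct, each vote correct with probability p; the majority
    threshold is r = floor(n/2) + 1."""
    r = n // 2 + 1
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(r, n + 1))
```

For even n the strict-majority threshold \(r = n/2 + 1\) counts ties as incorrect, matching the definition of r above.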

9 We show the assertion of Eq. (3). By assumption, the support \({\mathcal {S}}_Q\) is contained in an open subset of the asymmetry ball \({\mathcal {A}}_Z\). From [22], Theorem 3.1 it follows that the expected partition \(M_Q\) of Q is unique. The sequence \((M_n)_{n \in \mathbb {N}}\) then converges almost surely to the expected partition \(M_Q\) according to [20], Theorems 3.1 and 3.3. From the first eight parts of the proof it follows that the limit partition \(M_Q\) almost surely agrees with the ground-truth partition \(X_*\) on every data point z. This shows the assertion.


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Jain, B. (2018). Condorcet's Jury Theorem for Consensus Clustering. In: Trollmann, F., Turhan, AY. (eds) KI 2018: Advances in Artificial Intelligence. Lecture Notes in Computer Science, vol. 11117. Springer, Cham. https://doi.org/10.1007/978-3-030-00111-7_14


  • DOI: https://doi.org/10.1007/978-3-030-00111-7_14


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00110-0

  • Online ISBN: 978-3-030-00111-7

