ABSTRACT
BCubed is a mathematically clean, elegant, and intuitively well-behaved external performance metric for clustering tasks. BCubed compares a predicted clustering to a known ground truth through elementwise precision and recall scores. For each element, the predicted and ground-truth clusters containing that element are compared, and the mean over all elements is taken. We argue that BCubed overestimates performance, for the intuitive reason that the clustering gets credit for putting an element in its own cluster. We repair this flaw and investigate the repaired metric, called "Elements Like Me" (ELM). We extensively evaluate ELM and conclude that it retains all positive properties of BCubed and yields the minimum score of zero when it should.
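The elementwise comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `elementwise_scores`, the `exclude_self` flag, and the convention of scoring 1.0 when a cluster contains no other elements are all assumptions made for illustration. The abstract only says that the credit an element earns for sharing a cluster with itself is removed; the sketch models that by subtracting the element from every count.

```python
from collections import defaultdict


def _cluster_of(labels):
    """Map each element to the set of elements sharing its label."""
    groups = defaultdict(set)
    for element, label in labels.items():
        groups[label].add(element)
    return {element: groups[label] for element, label in labels.items()}


def elementwise_scores(predicted, truth, exclude_self=False):
    """Return (mean precision, mean recall) over all elements.

    predicted, truth: dicts mapping the same elements to cluster labels.
    exclude_self=False gives standard BCubed precision and recall.
    exclude_self=True is an ASSUMED reading of the ELM repair: the element
    itself is left out of every count, so it earns no credit for sharing
    a cluster with itself.
    """
    pred_of = _cluster_of(predicted)
    true_of = _cluster_of(truth)
    k = 1 if exclude_self else 0
    precisions, recalls = [], []
    for e in predicted:
        overlap = len(pred_of[e] & true_of[e]) - k
        p_den = len(pred_of[e]) - k
        r_den = len(true_of[e]) - k
        # ASSUMPTION: a cluster with no other elements scores 1.0 (vacuous).
        precisions.append(overlap / p_den if p_den else 1.0)
        recalls.append(overlap / r_den if r_den else 1.0)
    n = len(predicted)
    return sum(precisions) / n, sum(recalls) / n


# Ground truth has two clusters of size two; the prediction isolates everything.
truth = {"a": 0, "b": 0, "c": 1, "d": 1}
singletons = {"a": 0, "b": 1, "c": 2, "d": 3}

print(elementwise_scores(singletons, truth))                     # (1.0, 0.5)
print(elementwise_scores(singletons, truth, exclude_self=True))  # (1.0, 0.0)
```

In this toy case, standard BCubed still awards a recall of 0.5 to a prediction that groups nothing together, because every element finds itself in its own cluster. Under the assumed self-excluding variant the recall drops to 0, which is the kind of zero score the abstract refers to.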
Index Terms
- BCubed Revisited: Elements Like Me