research-article

A methodology for selecting the most suitable cluster validation internal indices

Authors:
Caroline Tomasini, Universidade Federal do Rio Grande, Rio Grande, Brazil
Leonardo Emmendorfer, Universidade Federal do Rio Grande, Rio Grande, Brazil
Eduardo N. Borges, Federal University of Rio Grande (FURG)
Karina Machado, Universidade Federal do Rio Grande, Rio Grande, Brazil

SAC '16: Proceedings of the 31st Annual ACM Symposium on Applied Computing, April 2016, Pages 901–903, https://doi.org/10.1145/2851613.2851885

Published: 04 April 2016

ABSTRACT

Validating clustering results is an important issue in machine learning research and is essential to the success of clustering applications. Choosing an appropriate validation index for evaluating the results of a particular clustering algorithm remains a challenge. The quality of partitions generated by different clustering algorithms can be evaluated using indices based on external or internal criteria. In this paper, we propose a methodology for selecting the most suitable internal cluster validation index, relating external and internal criteria through a regression model applied to the results of partitioning clustering algorithms.
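To make the idea of relating external and internal criteria through regression concrete, here is a minimal sketch using only the Python standard library. It is not the authors' procedure: the choice of the Rand index as the external criterion, the Davies-Bouldin index as the internal criterion, a bare-bones k-means with a deterministic start, ordinary least squares as the regression model, and the synthetic three-group data set are all assumptions made for illustration.

```python
import math
import random
from itertools import combinations

def rand_index(truth, pred):
    """External criterion: fraction of point pairs on which two labelings agree."""
    pairs = list(combinations(range(len(truth)), 2))
    agree = sum((truth[i] == truth[j]) == (pred[i] == pred[j]) for i, j in pairs)
    return agree / len(pairs)

def davies_bouldin(points, labels):
    """Internal criterion (Davies-Bouldin index); lower values mean better partitions."""
    groups = {}
    for p, l in zip(points, labels):
        groups.setdefault(l, []).append(p)
    cents = {l: tuple(sum(c) / len(g) for c in zip(*g)) for l, g in groups.items()}
    scatter = {l: sum(math.dist(p, cents[l]) for p in g) / len(g)
               for l, g in groups.items()}
    # For each cluster, take its worst (largest) similarity ratio to any other cluster.
    worst = [max((scatter[a] + scatter[b]) / math.dist(cents[a], cents[b])
                 for b in cents if b != a) for a in cents]
    return sum(worst) / len(worst)

def kmeans(points, k, iters=20):
    """Plain Lloyd iterations; the first k points serve as initial centroids."""
    cents = list(points[:k])
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, cents[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster becomes empty
                cents[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def linear_fit(x, y):
    """Ordinary least squares for y = a*x + b; returns (a, b, R^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = my - a * mx
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return a, b, 1 - ss_res / ss_tot

# Hypothetical labeled data: three Gaussian groups in the plane.
rng = random.Random(0)
centers = [(0, 0), (6, 0), (3, 6)]
data = [((cx + rng.gauss(0, 1), cy + rng.gauss(0, 1)), g)
        for g, (cx, cy) in enumerate(centers) for _ in range(20)]
rng.shuffle(data)
points = [p for p, _ in data]
truth = [g for _, g in data]

# One (internal, external) observation per partition produced by k-means.
internal, external = [], []
for k in range(2, 7):
    labels = kmeans(points, k)
    internal.append(davies_bouldin(points, labels))
    external.append(rand_index(truth, labels))

slope, intercept, r2 = linear_fit(internal, external)
print(f"Rand ~ {slope:.3f} * DB + {intercept:.3f}  (R^2 = {r2:.3f})")
```

Under these assumptions, repeating the fit for several candidate internal indices and comparing how well each one explains the external criterion (for example by R^2) would give one possible way to rank them; the paper's actual regression model and selection rule may differ.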

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Classification and regression trees
2. Information systems
  1. Information systems applications
    1. Data mining
      1. Clustering

Published in
SAC '16: Proceedings of the 31st Annual ACM Symposium on Applied Computing
April 2016
2360 pages
ISBN: 9781450337397
DOI: 10.1145/2851613
Conference Chair: Sascha Ossowski, University Rey Juan Carlos, Spain
Copyright © 2016 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
cluster evaluation
linear regression
validation criteria
Conference

Acceptance Rates
SAC '16 Paper Acceptance Rate: 252 of 1,047 submissions, 24%. Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%.