DOI: 10.1145/2320765.2320833
research-article

Validating cluster structures in data mining tasks

Published: 30 March 2012

ABSTRACT

Clustering has been widely studied because it arises in many application domains. One open issue in the clustering process is the evaluation of clustering results. Estimating the quality of the obtained cluster structure is the main subject of cluster validity. Over the years, many cluster validity indexes have been proposed in the research community, but no general approach to clustering evaluation has been developed. In our work we aim to develop a methodology for cluster validity estimation and to construct a dedicated framework for measuring it, which will combine several current methods in one tool. We expect that these investigations will help a wide range of analysts in their work with clustering.

        • Published in

          EDBT-ICDT '12: Proceedings of the 2012 Joint EDBT/ICDT Workshops
          March 2012
          265 pages
          ISBN: 9781450311434
          DOI: 10.1145/2320765

          Copyright © 2012 ACM


          Publisher

          Association for Computing Machinery

          New York, NY, United States



          Acceptance Rates

          Overall Acceptance Rate: 7 of 10 submissions, 70%
