DOI: 10.1145/564691.564739
Article

A Monte Carlo algorithm for fast projective clustering

Published: 03 June 2002

ABSTRACT

We propose a mathematical formulation for the notion of optimal projective cluster, starting from natural requirements on the density of points in subspaces. This allows us to develop a Monte Carlo algorithm for iteratively computing projective clusters. We prove that the computed clusters are good with high probability. We implemented a modified version of the algorithm, using heuristics to speed up computation. Our extensive experiments show that our method is significantly more accurate than previous approaches. In particular, we use our techniques to build a classifier for detecting rotated human faces in cluttered images.
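
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what one Monte Carlo search for a projective cluster could look like, under stated assumptions. The function name `monte_carlo_projective_cluster`, the box half-width `w`, the trade-off parameter `beta`, the sampling scheme (one random pivot plus a small random sample), and the score `|C| * (1/beta)^|D|` are all illustrative choices, not details taken from the text above.

```python
import random


def monte_carlo_projective_cluster(points, w, beta=0.25, trials=200,
                                   sample_size=3, seed=0):
    """Return (members, dims) for the best-scoring candidate cluster found.

    Assumed scheme (hypothetical, for illustration only):
      1. Pick a random pivot point and a small random sample of points.
      2. Keep the dimensions in which every sampled point lies within w
         of the pivot; these are the candidate relevant dimensions.
      3. The candidate cluster is every point within w of the pivot in
         all kept dimensions.
      4. Score the candidate by size * (1/beta) ** (number of dimensions),
         which rewards both dense clusters and many relevant dimensions.
    """
    rng = random.Random(seed)
    n_dims = len(points[0])
    best_score, best_members, best_dims = -1.0, [], []

    for _ in range(trials):
        pivot = rng.choice(points)
        sample = rng.sample(points, sample_size)

        # Dimensions on which the whole sample agrees with the pivot.
        dims = [d for d in range(n_dims)
                if all(abs(q[d] - pivot[d]) <= w for q in sample)]
        if not dims:
            continue  # this trial found no candidate subspace

        # Indices of all points inside the box around the pivot.
        members = [i for i, q in enumerate(points)
                   if all(abs(q[d] - pivot[d]) <= w for d in dims)]

        score = len(members) * (1.0 / beta) ** len(dims)
        if score > best_score:
            best_score, best_members, best_dims = score, members, dims

    return best_members, best_dims
```

Repeating the search on the points left over after removing each discovered cluster would give an iterative procedure of the kind the abstract mentions; the number of trials controls how likely a good cluster is to be found.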


Published in

SIGMOD '02: Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data
June 2002
654 pages
ISBN: 1581134975
DOI: 10.1145/564691

Copyright © 2002 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

SIGMOD '02 paper acceptance rate: 42 of 240 submissions, 18%
Overall acceptance rate: 785 of 4,003 submissions, 20%
