DOI: 10.1145/2020408.2020418
Research Article

CHIRP: a new classifier based on composite hypercubes on iterated random projections

Published: 21 August 2011

ABSTRACT

We introduce a classifier based on the L-infinity norm. This classifier, called CHIRP, is an iterative sequence of three stages (projecting, binning, and covering) designed to address the curse of dimensionality, computational complexity, and nonlinear separability. CHIRP is not a hybrid or modification of existing classifiers; it employs a new covering algorithm. Its accuracy on widely used benchmark datasets exceeds that of competing classifiers. Its computational complexity is sub-linear in the number of instances and the number of variables, and sub-quadratic in the number of classes.
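
To make the three-stage loop concrete, the sketch below illustrates one projecting/binning/covering pass under the L-infinity (max-coordinate) norm, ||x||∞ = max_i |x_i|, in which a ball is an axis-aligned hypercube. This is a minimal reading of the abstract, not the authors' implementation: the helper names (project, densest_bin_center, cover_class), the parameter values, and the greedy cube-sizing rule are all illustrative assumptions standing in for CHIRP's actual covering algorithm.

import numpy as np

rng = np.random.default_rng(0)

def project(X, n_dims=2):
    # Stage 1 (projecting): a random linear map to a few dimensions,
    # the standard device for blunting the curse of dimensionality.
    W = rng.normal(size=(X.shape[1], n_dims))
    return X @ W

def densest_bin_center(Z, mask, n_bins=16):
    # Stage 2 (binning): histogram one class's projected points and
    # return the center of the fullest bin as a hypercube seed.
    H, edges = np.histogramdd(Z[mask], bins=n_bins)
    ij = np.unravel_index(np.argmax(H), H.shape)
    return np.array([(e[i] + e[i + 1]) / 2.0 for e, i in zip(edges, ij)])

def cover_class(Z, y, target, max_cubes=20):
    # Stage 3 (covering): greedily place L-infinity balls (axis-aligned
    # hypercubes) on dense regions of the target class, each sized to
    # exclude the nearest point of every other class. This greedy rule
    # is an assumption, not CHIRP's published covering algorithm.
    other = Z[y != target]
    if other.size == 0:
        raise ValueError("covering needs at least two classes")
    cubes, remaining = [], (y == target)
    while remaining.any() and len(cubes) < max_cubes:
        c = densest_bin_center(Z, remaining)
        # Largest radius that keeps all other-class points outside.
        r = 0.99 * np.abs(other - c).max(axis=1).min()
        covered = np.abs(Z - c).max(axis=1) <= r
        if not (covered & remaining).any():
            break  # the cube covers nothing new; stop rather than loop
        cubes.append((c, r))
        remaining &= ~covered
    return cubes

# Toy usage: two Gaussian blobs in 20 dimensions.
X = np.vstack([rng.normal(0, 1, (100, 20)), rng.normal(4, 1, (100, 20))])
y = np.repeat([0, 1], 100)
cubes = cover_class(project(X), y, target=0)
print(f"class 0 covered by {len(cubes)} hypercube(s)")

Because an L-infinity ball is an axis-aligned hypercube, the membership test is a single max over coordinate differences, which is what makes covers of this kind cheap to evaluate and to compose across iterated projections.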

Published in

KDD '11: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2011, 1,446 pages
ISBN: 9781450308137
DOI: 10.1145/2020408
Copyright © 2011 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions, 13%
