CHIRP: a new classifier based on composite hypercubes on iterated random projections

ABSTRACT

We introduce a classifier based on the L-infinity norm. This classifier, called CHIRP, is an iterative sequence of three stages (projecting, binning, and covering) designed to deal with the curse of dimensionality, computational complexity, and nonlinear separability. CHIRP is not a hybrid or a modification of existing classifiers; it employs a new covering algorithm. On widely used benchmark datasets, CHIRP's accuracy exceeds that of competing classifiers. Its computational complexity is sublinear in the number of instances and the number of variables, and subquadratic in the number of classes.
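The abstract names the three stages but not their mechanics. The sketch below is a minimal, hedged illustration of a project-bin-cover pipeline, not the authors' algorithm: the function names, the regular-grid binning, the majority-vote covering rule, and the single (non-iterated) projection are all assumptions made for illustration. The only parts taken directly from the abstract and title are that class regions are covered by hypercubes, that a hypercube is an L-infinity ball, and that the data are first reduced by random projection.

```python
# Illustrative sketch only -- NOT the CHIRP algorithm from the paper.
# One plausible reading of the three stages the abstract names:
#   1. project: map high-dimensional data onto a random low-dimensional basis,
#   2. bin:     discretize the projected points onto a regular grid,
#   3. cover:   keep grid cells dominated by the target class as hypercubes.
# A hypercube is an L-infinity ball, so membership reduces to an L-infinity test.
from collections import defaultdict

import numpy as np


def random_project(X, rng, out_dim=2):
    """Stage 1 (assumed form): a dense Gaussian random projection."""
    R = rng.standard_normal((X.shape[1], out_dim)) / np.sqrt(out_dim)
    return X @ R, R


def fit_cover(Z, y, target, n_bins=10):
    """Stages 2-3 (simplified): bin the projected points, then cover the
    bins in which the target class outnumbers all other classes."""
    lo, hi = Z.min(axis=0), Z.max(axis=0)
    width = np.where(hi > lo, (hi - lo) / n_bins, 1.0)
    cells = np.clip(((Z - lo) / width).astype(int), 0, n_bins - 1)
    counts = defaultdict(lambda: [0, 0])          # cell -> [other, target]
    for cell, label in zip(map(tuple, cells), y):
        counts[cell][int(label == target)] += 1
    boxes = []
    for cell, (other, tgt) in counts.items():
        if tgt > other:                           # majority-vote covering rule
            c = np.asarray(cell)
            boxes.append((lo + c * width, lo + (c + 1) * width))
    return boxes


def covered(z, boxes):
    """L-infinity membership: z lies in a box iff its largest coordinate-wise
    distance from the box centre is at most the box half-width."""
    for bmin, bmax in boxes:
        centre, half = (bmin + bmax) / 2.0, (bmax - bmin) / 2.0
        if np.max(np.abs(z - centre) - half) <= 0:
            return True
    return False


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical toy data: two Gaussian blobs in 20 dimensions.
    X = np.vstack([rng.normal(0, 1, (100, 20)), rng.normal(3, 1, (100, 20))])
    y = np.array([0] * 100 + [1] * 100)
    Z, R = random_project(X, rng)
    boxes = fit_cover(Z, y, target=1)
    preds = np.array([covered(z, boxes) for z in Z]).astype(int)
    print("training accuracy of one pass:", (preds == y).mean())
```

The paper's method iterates these stages and, per its title, builds composite hypercubes rather than single grid cells; this sketch exists only to make the L-infinity box test and the project-bin-cover loop concrete.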