Abstract
A classifier assigns data points to classes by taking into account, directly or indirectly, the distribution of data points around a given query point. To express this distribution in terms of distances from the query point, a probability distribution mapping function is introduced here. It is shown that this function can be approximated by a suitable power of the distance, and a method for estimating this power, the distribution mapping exponent, is described. The exponent is then used for probability density estimation in high-dimensional spaces and for classification. A close relation between the distribution mapping exponent and the singularity exponent is discussed. It is also shown that the resulting classifier achieves better classification accuracy than other kinds of classifiers on some tasks.
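The core idea of estimating a local power-law exponent from nearest-neighbor distances can be illustrated with a short sketch. The function below is an assumption for illustration only, not the authors' actual algorithm: it fits, by least squares, the slope of log i versus log r_i, where r_i is the distance from the query point to its i-th nearest neighbor. For points distributed uniformly in d dimensions, the count of points within radius r grows like r^d, so the fitted slope should be close to d (the spirit of Grassberger-Procaccia-style dimension estimates).

```python
import math
import random

def distribution_mapping_exponent(points, query, k_max):
    """Illustrative estimate of a local power-law exponent:
    least-squares slope of log(i) vs. log(r_i), where r_i is the
    distance from `query` to its i-th nearest neighbor."""
    dists = sorted(math.dist(query, p) for p in points if p != query)[:k_max]
    xs = [math.log(r) for r in dists]
    ys = [math.log(i) for i in range(1, len(dists) + 1)]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx  # fitted slope = estimated exponent

random.seed(0)
# 2000 points uniform in the unit square: around an interior query
# point the estimated exponent should be close to 2.
pts = [(random.random(), random.random()) for _ in range(2000)]
q = distribution_mapping_exponent(pts, (0.5, 0.5), k_max=50)
print(round(q, 2))
```

In the paper's setting this kind of local exponent, rather than the global space dimension, governs how probability mass scales with distance around the query point, which is what makes it usable for density estimation and classification.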
Additional information
This work was supported by the Institute of Computer Science of the Czech Academy of Sciences RVO: 67985807 and by the Czech Technical University in Prague: CZ68407700.
Jirina, M., Jirina, M. Utilization of singularity exponent in nearest neighbor based classifier. J Classif 30, 3–29 (2013). https://doi.org/10.1007/s00357-013-9121-z