
Utilization of singularity exponent in nearest neighbor based classifier

Journal of Classification

Abstract

Classifiers assign data points to classes, and they directly or indirectly take the distribution of data points around a given query point into account. To express the distribution of points in terms of their distances from a given point, a probability distribution mapping function is introduced here. An approximation of this function in the form of a suitable power of the distance is presented, and a method for estimating this power, the distribution mapping exponent, is described. The exponent is used for probability density estimation in high-dimensional spaces and for classification. A close relation between this exponent and a singularity exponent is discussed. It is also shown that the resulting classifier achieves better classification accuracy than other kinds of classifiers on some tasks.
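The core idea of the abstract can be illustrated with a minimal sketch: near a query point, the cumulative count of neighbors within distance r is modeled as growing like a power r^q, and the exponent q is fitted from sorted nearest-neighbor distances. The function name and the log-log least-squares fit below are illustrative assumptions, not the authors' exact estimator.

```python
import numpy as np

def distribution_mapping_exponent(distances):
    """Fit an exponent q such that the cumulative neighbor count i
    grows roughly as r_i**q, where r_i is the (ascending, sorted)
    distance from the query point to its i-th nearest neighbor."""
    distances = np.asarray(distances, dtype=float)
    i = np.arange(1, len(distances) + 1)
    # least-squares slope of log(i) versus log(r_i)
    slope, _ = np.polyfit(np.log(distances), np.log(i), 1)
    return slope

# Sanity check: points uniform in a d-dimensional ball around the
# query give a neighbor count ~ r**d, so q should be close to d.
rng = np.random.default_rng(0)
d = 2
pts = rng.normal(size=(2000, d))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
pts *= rng.random(2000)[:, None] ** (1.0 / d)  # uniform in the unit disk
r = np.sort(np.linalg.norm(pts, axis=1))
q = distribution_mapping_exponent(r[:200])     # fit on the nearest 200
```

For locally uniform data the fitted exponent approximates the local intrinsic dimension, which is why the abstract relates it to a singularity (correlation-dimension-like) exponent.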



Author information

Correspondence to Marcel Jirina.

Additional information

This work was supported by the Institute of Computer Science of the Czech Academy of Sciences RVO: 67985807 and by the Czech Technical University in Prague: CZ68407700.


About this article

Cite this article

Jirina, M., Jirina, M. Utilization of singularity exponent in nearest neighbor based classifier. J Classif 30, 3–29 (2013). https://doi.org/10.1007/s00357-013-9121-z

