Relationships Between Local Intrinsic Dimensionality and Tail Entropy

  • Conference paper
Similarity Search and Applications (SISAP 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 13058)

Abstract

The local intrinsic dimensionality (LID) model assesses the complexity of data within the vicinity of a query point, through the growth rate of the probability measure within an expanding neighborhood. In this paper, we show how LID is asymptotically related to the entropy of the lower tail of the distribution of distances from the query. We establish tight relationships for cumulative Shannon entropy, entropy power, and their generalized Tsallis entropy variants, all with the potential for serving as the basis for new estimators of LID, or as substitutes for LID-based characterization and feature representations in classification and other learning contexts.
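
As a rough illustration of the kind of relationship the abstract describes, the sketch below works through one special case numerically. It is not taken from the paper: it assumes a pure power-law distance distribution F(r) = r^m, takes LID at radius r to be r·F'(r)/F(r), and uses the cumulative entropy −∫ G(r) ln G(r) dr of the distribution G = F/F(w) conditioned to the tail [0, w], divided by w. The function names, the normalization by w, and the test values are all assumptions made for this example. In this special case the LID equals m everywhere, and the normalized cumulative entropy has the closed form m/(m+1)^2, which follows from ∫₀¹ u^m ln u du = −1/(m+1)^2.

```python
import math

def lid_at(F, r, h=1e-6):
    """Finite-difference estimate of r * F'(r) / F(r) (hypothetical helper)."""
    dF = (F(r + h) - F(r - h)) / (2 * h)
    return r * dF / F(r)

def normalized_cumulative_entropy(F, w, steps=100_000):
    """Midpoint-rule estimate of -(1/w) * integral over [0, w] of G ln G,
    where G(r) = F(r) / F(w) is the distribution conditioned to the tail."""
    Fw = F(w)
    dr = w / steps
    total = 0.0
    for i in range(steps):
        g = F((i + 0.5) * dr) / Fw
        if g > 0.0:
            total -= g * math.log(g) * dr
    return total / w

# Assumed example: power-law distance distribution with exponent m.
m, w = 3.0, 0.5
F = lambda r: r ** m

lid = lid_at(F, 0.1)                       # growth-rate form of LID
ce = normalized_cumulative_entropy(F, w)   # tail entropy, scaled by w
closed_form = m / (m + 1) ** 2             # closed form for this special case

print(lid, ce, closed_form)
```

For distributions that are only approximately power-law near zero, the same two quantities can be evaluated at decreasing w to see how closely the tail entropy tracks the value predicted from the LID; the sketch makes no claim about the rate at which they agree.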

Acknowledgments

Michael E. Houle acknowledges the financial support of JSPS Kakenhi Kiban (B) Research Grant 18H03296.

Corresponding author

Correspondence to Michael E. Houle.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Bailey, J., Houle, M.E., Ma, X. (2021). Relationships Between Local Intrinsic Dimensionality and Tail Entropy. In: Reyes, N., et al. Similarity Search and Applications. SISAP 2021. Lecture Notes in Computer Science, vol 13058. Springer, Cham. https://doi.org/10.1007/978-3-030-89657-7_15

  • DOI: https://doi.org/10.1007/978-3-030-89657-7_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-89656-0

  • Online ISBN: 978-3-030-89657-7

  • eBook Packages: Computer Science, Computer Science (R0)