Abstract
Local Intrinsic Dimensionality (LID) is a measure of data complexity in the vicinity of a query point. In this work, we address the problem of estimating LID from a Bayesian perspective by establishing a theoretical framework that derives the distribution of LID given a data sample. Using this framework, we develop new LID estimators that can outperform the Maximum Likelihood Estimator (MLE) in certain contexts. The framework also provides a convenient way to incorporate prior LID knowledge through informative priors. Additionally, we demonstrate how to aggregate multiple LID distributions in a Bayesian manner using logarithmic pooling. We conduct a variety of experiments, demonstrating that a Bayesian approach to LID is effective when the number of nearest neighbors is small and when informative priors are incorporated. We also show that in deep neural networks, MLE produces highly volatile LID estimates, whereas a Bayesian approach that incorporates prior LID information smooths these estimates and reduces their variance.
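To make the ideas in the abstract concrete, the following is a minimal sketch and not the chapter's own implementation. It assumes the standard local LID model, in which the likelihood of k neighbor distances within radius w is proportional to m^k exp(-m S), with S = -sum(ln(r_i / w)). Under that assumption a Gamma(alpha, beta) prior on the LID value m is conjugate, and logarithmic pooling of Gamma densities with weights summing to 1 yields another Gamma. The function names and the choice of a Gamma prior are illustrative assumptions; the chapter may use different priors and estimators.

```python
import math


def log_ratio_sum(distances):
    """Return (k, S), where S = -sum(ln(r_i / w)) and w is the largest distance."""
    if not distances or min(distances) <= 0:
        raise ValueError("distances must be non-empty and strictly positive")
    w = max(distances)  # neighborhood radius: distance to the k-th neighbor
    s = -sum(math.log(r / w) for r in distances)
    if s == 0.0:
        raise ValueError("all distances are equal; the estimate is undefined")
    return len(distances), s


def lid_mle(distances):
    """Maximum likelihood (Hill-type) LID estimate: k / S."""
    k, s = log_ratio_sum(distances)
    return k / s


def lid_gamma_posterior(distances, alpha, beta):
    """Posterior (shape, rate) for LID under an assumed Gamma(alpha, beta) prior."""
    k, s = log_ratio_sum(distances)
    return alpha + k, beta + s


def log_pool_gammas(params, weights):
    """Logarithmic pool of Gamma(shape, rate) densities; weights must sum to 1."""
    if len(params) != len(weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("need one weight per distribution, summing to 1")
    shape = sum(w * a for w, (a, _) in zip(weights, params))
    rate = sum(w * b for w, (_, b) in zip(weights, params))
    return shape, rate


if __name__ == "__main__":
    neighbor_distances = [0.25, 0.5, 1.0]  # made-up distances for illustration
    shape, rate = lid_gamma_posterior(neighbor_distances, alpha=2.0, beta=1.0)
    print("MLE:", lid_mle(neighbor_distances))
    print("Posterior mean:", shape / rate)
```

In this sketch the posterior mean (alpha + k) / (beta + S) tends to the MLE k / S as alpha and beta approach zero, which is one way to see how an informative prior can stabilize estimates when k is small.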
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Joukhadar, Z., Huang, H., Erfani, S.M., Campello, R.J.G.B., Houle, M.E., Bailey, J. (2025). Bayesian Estimation Approaches for Local Intrinsic Dimensionality. In: Chávez, E., Kimia, B., Lokoč, J., Patella, M., Sedmidubsky, J. (eds) Similarity Search and Applications. SISAP 2024. Lecture Notes in Computer Science, vol 15268. Springer, Cham. https://doi.org/10.1007/978-3-031-75823-2_10
DOI: https://doi.org/10.1007/978-3-031-75823-2_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-75822-5
Online ISBN: 978-3-031-75823-2
eBook Packages: Computer Science, Computer Science (R0)