
Outlier Detection Under False Omission Rate Control

  • Conference paper
Computational Science – ICCS 2023 (ICCS 2023)

Abstract

We argue that in many practical situations, control of the False Omission Rate (FOR) or the Bayesian False Omission Rate (BFOR) is of primary importance. We develop and investigate a rule that controls these rates in the context of outlier detection, and propose an empirical formulation of it for practical use. We consider several score statistics used to detect outliers and study how well the introduced method controls FOR in practice. Analysis of several datasets shows that FOR control, in contrast to False Discovery Rate (FDR) control, is inherently tied to the performance of the employed score statistic on both inlier and outlier data sets.
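To make the two error rates concrete, the following minimal sketch computes their empirical counterparts from ground-truth labels and detector decisions. It assumes the usual convention that FOR is the fraction of true outliers among observations declared inliers, and FDR is the fraction of true inliers among observations declared outliers. The function name `empirical_for_fdr` and the toy arrays are hypothetical and serve only as illustration; this is not the procedure proposed in the paper.

```python
import numpy as np


def empirical_for_fdr(is_outlier, flagged):
    """Return (FOR, FDR) for binary ground truth and binary decisions.

    is_outlier: 1/True where the observation truly is an outlier.
    flagged:    1/True where the detector declared an outlier.
    Hypothetical helper, not taken from the paper or its repository.
    """
    is_outlier = np.asarray(is_outlier, dtype=bool)
    flagged = np.asarray(flagged, dtype=bool)
    not_flagged = ~flagged

    # Outliers that were omitted (declared inliers by the detector).
    missed = np.sum(is_outlier & not_flagged)
    # Inliers that were wrongly declared outliers.
    false_alarms = np.sum(~is_outlier & flagged)

    # Guard against empty groups; the rate is taken as 0 in that case.
    for_rate = missed / max(np.sum(not_flagged), 1)
    fdr_rate = false_alarms / max(np.sum(flagged), 1)
    return float(for_rate), float(fdr_rate)


# Toy data: 6 inliers followed by 4 outliers.
is_outlier = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
flagged = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

for_rate, fdr_rate = empirical_for_fdr(is_outlier, flagged)
print(f"FOR = {for_rate:.3f}, FDR = {fdr_rate:.3f}")
```

In the toy data, one of the six observations left unflagged is an outlier and one of the four flagged observations is an inlier. The FOR denominator counts the observations declared inliers, which is why this rate depends on how the score behaves on both inliers and outliers.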

Notes

  1. https://github.com/wawrzenczyka/FOR-CTL

  2. https://github.com/wawrzenczyka/FOR-CTL-datasets

Author information

Corresponding author

Correspondence to Jan Mielniczuk.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Wawrzeńczyk, A., Mielniczuk, J. (2023). Outlier Detection Under False Omission Rate Control. In: Mikyška, J., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M. (eds) Computational Science – ICCS 2023. ICCS 2023. Lecture Notes in Computer Science, vol 14075. Springer, Cham. https://doi.org/10.1007/978-3-031-36024-4_47

  • DOI: https://doi.org/10.1007/978-3-031-36024-4_47

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36023-7

  • Online ISBN: 978-3-031-36024-4

  • eBook Packages: Computer Science, Computer Science (R0)
