Abstract
We argue that in many practical situations control of the False Omission Rate (FOR) or the Bayesian False Omission Rate (BFOR) is of primary importance. We develop and investigate a rule that provides such control in the context of outlier detection, and propose its empirical formulation for practical use. We consider several score statistics used to detect outliers and study how well the introduced method controls FOR in practice. Analysis of several datasets shows that FOR control, in contrast to FDR control, is inherently tied to the performance of the employed score statistic on both inlier and outlier data sets.
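To make the two error measures concrete, the following minimal sketch computes their empirical counterparts for a single labelled sample. It is not the procedure proposed in the paper; it only illustrates the standard definitions, assuming that a "discovery" is an observation flagged as an outlier and an "omission" is an outlier left among the observations declared inliers. FOR and FDR are the expectations of these proportions. The function names and the toy data are hypothetical, and returning 0 when the denominator is empty is a common convention assumed here.

```python
def false_omission_proportion(is_outlier, flagged):
    """Fraction of true outliers among observations NOT flagged as outliers."""
    # Keep the true labels of the observations that were declared inliers.
    accepted_truth = [bool(t) for t, f in zip(is_outlier, flagged) if not f]
    if not accepted_truth:
        return 0.0  # assumed convention: nothing declared inlier
    return sum(accepted_truth) / len(accepted_truth)


def false_discovery_proportion(is_outlier, flagged):
    """Fraction of true inliers among observations flagged as outliers."""
    # Keep the true labels of the observations that were flagged.
    flagged_truth = [bool(t) for t, f in zip(is_outlier, flagged) if f]
    if not flagged_truth:
        return 0.0  # assumed convention: nothing flagged
    return sum(not t for t in flagged_truth) / len(flagged_truth)


# Toy data (made up for illustration): 1 marks a true outlier / a flagged point.
truth   = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
flagged = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

fop = false_omission_proportion(truth, flagged)   # 1 missed outlier among 7 accepted
fdp = false_discovery_proportion(truth, flagged)  # 1 false alarm among 3 flagged
print(f"false omission proportion:  {fop:.3f}")
print(f"false discovery proportion: {fdp:.3f}")
```

The false omission proportion is taken over the observations that were not flagged, so it is affected by how the score statistic ranks outliers as well as inliers.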
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wawrzeńczyk, A., Mielniczuk, J. (2023). Outlier Detection Under False Omission Rate Control. In: Mikyška, J., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M. (eds) Computational Science – ICCS 2023. ICCS 2023. Lecture Notes in Computer Science, vol 14075. Springer, Cham. https://doi.org/10.1007/978-3-031-36024-4_47
DOI: https://doi.org/10.1007/978-3-031-36024-4_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36023-7
Online ISBN: 978-3-031-36024-4
eBook Packages: Computer Science, Computer Science (R0)