NeoLOD: A Novel Generalized Coupled Local Outlier Detection Model Embedded Non-IID Similarity Metric

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11439)

Abstract

The traditional generalized local outlier detection model (TraLOD) unifies the abstract methods and steps of classic local outlier detection approaches, which capture local behavior to improve detection performance over global outlier detection techniques. However, TraLOD still suffers from an inherent limitation on relational data: it uses a traditional (Euclidean) similarity metric to select the context/reference set and ignores the effect of attribute structure, i.e., it rests on the fundamental assumption that attributes and attribute values are independent and identically distributed (IID). To address this issue, this paper introduces a novel Non-IID generalized coupled local outlier detection model (NeoLOD) and its instance (NeoLOF) for identifying local outliers with strong couplings. Concretely, the paper makes three contributions: (i) it captures the underlying attribute relations automatically using a Bayesian network; (ii) it proposes a novel Non-IID similarity metric that captures the intra-coupling and inter-coupling between attributes and attribute values; and (iii) it unifies the generalized local outlier detection model by incorporating the Non-IID similarity metric and instantiates a novel NeoLOF algorithm. Results on 13 data sets show that the proposed similarity metric utilizes the attribute structure effectively and that NeoLOF improves performance in local outlier detection tasks.
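
To make the role of the similarity metric concrete, the following is a minimal sketch of a classic LOF-style score in which the distance function that selects the context/reference set is a pluggable parameter. It is not the authors' NeoLOF: the abstract does not define the Non-IID metric, so the names `lof_scores`, `euclidean`, and `weighted_metric` are hypothetical, and the attribute weights merely stand in for whatever structure a coupled metric would supply.

```python
from math import sqrt
from typing import Callable, List, Sequence

Point = Sequence[float]
Metric = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Traditional metric: treats every attribute as independent."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_metric(weights: Sequence[float]) -> Metric:
    """Hypothetical stand-in for a learned metric: per-attribute weights.

    Assumption for illustration only; the paper's Non-IID metric is not
    specified in the abstract and is certainly richer than this.
    """
    def metric(a: Point, b: Point) -> float:
        return sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))
    return metric


def lof_scores(data: List[Point], k: int, metric: Metric = euclidean) -> List[float]:
    """LOF-style scores; values well above 1 suggest a local outlier."""
    n = len(data)
    dist = [[metric(data[i], data[j]) for j in range(n)] for i in range(n)]

    # Step 1: context/reference set = k nearest neighbours (ties included).
    neighbors: List[List[int]] = []
    k_dist: List[float] = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][order[k - 1]]
        neighbors.append([j for j in order if dist[i][j] <= kd])
        k_dist.append(kd)

    # Step 2: local reachability density over the context set.
    # The small constant avoids division by zero for duplicate points.
    lrd: List[float] = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(1.0 / (sum(reach) / len(reach) + 1e-12))

    # Step 3: compare each point's density with that of its context set.
    return [
        sum(lrd[j] / lrd[i] for j in neighbors[i]) / len(neighbors[i])
        for i in range(n)
    ]


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
    for p, s in zip(points, lof_scores(points, k=2)):
        print(p, round(s, 3))
```

Swapping `metric` changes only which neighbours form the context set and how far apart they are considered to be; the density comparison itself is untouched. This is the sense in which a generalized model can accept a different similarity metric without altering the remaining steps.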

Notes

  1. RDBMS refers to the database management system based on the relational model.

  2. The data sets are downloaded from: http://archive.ics.uci.edu/ml/datasets.html; http://lib.stat.cmu.edu/index.php.

Acknowledgements

This work was supported by the National Key R&D Program of China (2017YFB0702600, 2017YFB0702601), the National Natural Science Foundation of China (61432008, U1435214, 61503178, 61806092) and Jiangsu Natural Science Foundation (BK20180326).

Corresponding author

Correspondence to Yang Gao.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Meng, F., Gao, Y., Huo, J., Qi, X., Yi, S. (2019). NeoLOD: A Novel Generalized Coupled Local Outlier Detection Model Embedded Non-IID Similarity Metric. In: Yang, Q., Zhou, ZH., Gong, Z., Zhang, ML., Huang, SJ. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2019. Lecture Notes in Computer Science, vol 11439. Springer, Cham. https://doi.org/10.1007/978-3-030-16148-4_45

  • DOI: https://doi.org/10.1007/978-3-030-16148-4_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-16147-7

  • Online ISBN: 978-3-030-16148-4

  • eBook Packages: Computer Science, Computer Science (R0)
