Multiple Continuous Outlier Detection over Data Stream

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14854)

Abstract

This paper studies the problem of Multiple Continuous Outlier Detection (MCOD for short) over data streams, a fundamental problem in streaming data management. Let \(\mathcal {S}\) be the set of streaming data and \(\mathcal {Q}\) be the outlier detection query (query for short) workload, which contains a set of queries with different parameters. Each query \(q(n, s, k, r)\) in \(\mathcal {Q}\) monitors the objects in \(\mathcal {S}\) that were generated within the last q(n) time units. Every q(s) time units, q reports the outliers to the system, that is, the monitored objects that have fewer than k neighbours within the range threshold r. Several approaches have been proposed to support MCOD, but they incur high computational and space costs and therefore cannot work efficiently over data streams.
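The query semantics described above can be illustrated with a minimal, deliberately naive sketch. It assumes the usual distance-based reading of the definition (an object is an outlier if fewer than k other objects in the window lie within distance r), Euclidean distance, and integer timestamps. The names `Query`, `outliers`, and `run` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Query:
    n: int    # window length: monitor objects from the last n time units
    s: int    # slide: report outliers every s time units
    k: int    # neighbour-count threshold
    r: float  # range (distance) threshold


def outliers(stream, q, now):
    """Objects in the window (now - n, now] with fewer than q.k neighbours within q.r.

    `stream` is a list of (timestamp, point) pairs; this layout is an assumption.
    """
    window = [(t, p) for t, p in stream if now - q.n < t <= now]
    result = []
    for i, (_, p) in enumerate(window):
        # Count the other objects in the window that lie within distance r of p.
        neighbours = sum(
            1
            for j, (_, other) in enumerate(window)
            if j != i and dist(p, other) <= q.r
        )
        if neighbours < q.k:
            result.append(p)
    return result


def run(stream, q, end):
    """Evaluate q every q.s time units up to time `end` (brute force, O(w^2) per slide)."""
    return {now: outliers(stream, q, now) for now in range(q.s, end + 1, q.s)}
```

Evaluating every query in a workload independently in this way repeats the same distance computations many times, which is the kind of cost the abstract refers to.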

In this paper, we propose a novel framework named Maximal Common Neighbour (MCN for short) over data streams. It is based on the following observation: if the parameters of two queries q and \(q'\) are similar, the neighbours of an object o under q are likely to be neighbours of o under \(q'\) as well. Accordingly, we propose a novel index named Common Neighbour Tree (CN-Tree for short) to maintain the neighbours of objects under different queries. It organizes the neighbours of each object according to the similarity relationships among queries, so as to avoid redundant neighbour maintenance. In addition, we propose a group of efficient algorithms to support CN-Tree maintenance. Finally, we conduct extensive performance studies on large real and synthetic data sets, which demonstrate that our framework efficiently supports MCOD over data streams.
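The observation about shared neighbours can be made concrete with a small sketch. This is not the CN-Tree, whose structure the abstract does not specify. It is a simplified stand-in that computes each object's neighbour list once, for the largest window and radius in the workload, and lets every query reuse it. The function name `shared_outliers` and the `(n, k, r)` tuple layout are assumptions for illustration, and all queries are assumed to be evaluated at the same time `now`.

```python
from bisect import bisect_right
from math import dist


def shared_outliers(stream, queries, now):
    """Answer several (n, k, r) queries at time `now` from one shared neighbour list.

    `stream` is a list of (timestamp, point) pairs. Neighbours found for a smaller
    radius or window are a subset of those found for a larger one, so distances
    are computed once for the largest n and r and then filtered per query.
    """
    max_n = max(n for n, _, _ in queries)
    max_r = max(r for _, _, r in queries)
    window = [(t, p) for t, p in stream if now - max_n < t <= now]

    # For every object: (distance, neighbour timestamp) pairs, sorted by distance.
    shared = []
    for i, (t, p) in enumerate(window):
        cand = sorted(
            (d, t_other)
            for j, (t_other, other) in enumerate(window)
            if j != i and (d := dist(p, other)) <= max_r
        )
        shared.append((t, p, cand))

    answers = []
    for n, k, r in queries:
        start = now - n  # objects with timestamp <= start have expired for this query
        result = []
        for t, p, cand in shared:
            if t <= start:
                continue
            # All candidates with distance <= r form a prefix of the sorted list.
            cut = bisect_right(cand, (r, float("inf")))
            count = sum(1 for _, t_other in cand[:cut] if t_other > start)
            if count < k:
                result.append(p)
        answers.append(result)
    return answers
```

The sketch only shows why similar queries can share neighbour information. It still recomputes everything on each evaluation, whereas an index maintained incrementally over the stream would avoid that.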

Acknowledgement

The work is supported by the Liaoning Provincial Social Science Planning Fund Project (L21BGL042). Hong Jiang and Anzhen Zhang are corresponding authors of this paper.

Author information

Corresponding author

Correspondence to Rui Zhu.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhu, R. et al. (2024). Multiple Continuous Outlier Detection over Data Stream. In: Onizuka, M., et al. Database Systems for Advanced Applications. DASFAA 2024. Lecture Notes in Computer Science, vol 14854. Springer, Singapore. https://doi.org/10.1007/978-981-97-5569-1_26

  • DOI: https://doi.org/10.1007/978-981-97-5569-1_26

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-5568-4

  • Online ISBN: 978-981-97-5569-1

  • eBook Packages: Computer Science, Computer Science (R0)
