Abstract
This paper studies the problem of Multiple Continuous Outlier Detection (MCOD for short) over data streams, a fundamental problem in streaming data management. Let \(\mathcal {S}\) be the set of streaming data and \(\mathcal {Q}\) be the outlier detection query (query for short) workload, which contains a set of queries with different parameters. Each query q(n, s, k, r) in \(\mathcal {Q}\) monitors the objects in \(\mathcal {S}\) that were generated within the last q(n) time units. Whenever q(s) time units pass, q returns the outliers to the system, that is, the objects that do not satisfy the neighbour threshold k under the range threshold r. Some approaches have been proposed to support MCOD, but they incur high running costs in both computation and space, and so cannot work efficiently over data streams.
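The query semantics above can be made concrete with a naive baseline for a single query. This is only a sketch under stated assumptions, not the method proposed in the paper: it assumes the usual distance-based definition (an object is an outlier if fewer than k other objects in the current window lie within distance r of it), Euclidean distance, and a stream given as `(timestamp, point)` pairs. The names `Query`, `outliers`, and `monitor` are hypothetical.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Query:
    n: float  # window length: monitor objects from the last n time units
    s: float  # slide: report outliers every s time units
    k: int    # neighbour-count threshold (assumed meaning: fewer than k => outlier)
    r: float  # range (distance) threshold


def outliers(stream, q, now):
    """Return the points in the window (now - n, now] with fewer than k neighbours within r."""
    window = [p for t, p in stream if now - q.n < t <= now]
    result = []
    for i, p in enumerate(window):
        # Count the other window objects within distance r of p (quadratic scan).
        count = sum(1 for j, o in enumerate(window) if j != i and dist(p, o) <= q.r)
        if count < q.k:
            result.append(p)
    return result


def monitor(stream, q, end):
    """Yield (time, outliers) each time s time units pass, up to time `end`."""
    now = q.s
    while now <= end:
        yield now, outliers(stream, q, now)
        now += q.s


stream = [(1, (0.0, 0.0)), (2, (0.1, 0.0)), (3, (0.0, 0.1)), (4, (5.0, 5.0))]
q = Query(n=4, s=2, k=2, r=0.5)
for time, found in monitor(stream, q, end=4):
    print(time, found)
```

Running one such scan per query repeats the same distance computations for every query in the workload, which is the kind of redundant cost a shared structure aims to avoid.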
In this paper, we propose a novel framework named Maximal Common Neighbour (MCN for short) for MCOD over data streams. It is based on the following observation: if the parameters of two queries q and \(q'\) are similar, the neighbours of an object o under q are likely to be neighbours of o under \(q'\) as well. Accordingly, we propose a novel index named Common Neighbour Tree (CN-Tree for short) to maintain the neighbours of objects under different queries. It organizes the neighbours of each object according to the similarity relationships among queries, so as to avoid redundant neighbour maintenance. In addition, we propose a group of efficient algorithms to support CN-Tree maintenance. Finally, we conduct extensive performance studies on large real and synthetic data sets, which demonstrate that our framework efficiently supports MCOD over data streams.
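The common-neighbour observation can be illustrated in its simplest case. The sketch below is not the CN-Tree, whose structure the abstract does not describe. It only shows that, under the assumed distance-based definition with Euclidean distance, a query with a smaller window and a smaller range has a neighbour set contained in that of a query with a larger window and a larger range. The neighbours found for the narrower query can therefore be reused rather than recomputed. The function name `neighbours` and the sample data are hypothetical.

```python
from math import dist


def neighbours(stream, o, n, r, now):
    """Objects other than o generated in (now - n, now] that lie within distance r of o."""
    return {(t, p) for t, p in stream
            if now - n < t <= now and p != o and dist(p, o) <= r}


stream = [(1, (0.0, 0.0)), (2, (0.1, 0.0)), (3, (0.0, 0.1)),
          (4, (0.4, 0.0)), (5, (5.0, 5.0))]
o = (0.0, 0.0)

# q has the smaller window and range; q' has the larger window and range.
narrow = neighbours(stream, o, n=3, r=0.2, now=5)
wide = neighbours(stream, o, n=5, r=0.5, now=5)

# Every neighbour under q is also a neighbour under q'.
print(narrow <= wide)

# Only the difference needs to be stored separately for q'.
extra = wide - narrow
print(sorted(extra))
```

When the parameters are merely similar rather than ordered in this way, the overlap is likely but not guaranteed, which matches the hedged wording of the observation.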
Acknowledgement
The work is supported by the Liaoning Provincial Social Science Planning Fund Project (L21BGL042). Hong Jiang and Anzhen Zhang are corresponding authors of this paper.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhu, R. et al. (2024). Multiple Continuous Outlier Detection over Data Stream. In: Onizuka, M., et al. Database Systems for Advanced Applications. DASFAA 2024. Lecture Notes in Computer Science, vol 14854. Springer, Singapore. https://doi.org/10.1007/978-981-97-5569-1_26
DOI: https://doi.org/10.1007/978-981-97-5569-1_26
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5568-4
Online ISBN: 978-981-97-5569-1
eBook Packages: Computer Science, Computer Science (R0)