Improving iForest for Hydrological Time Series Anomaly Detection

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2020)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12454)

Abstract

As more hydrological sensors are installed, the data they produce often contain abnormal values caused by network congestion, equipment failure, or environmental influence. A series of algorithms have been proposed to detect anomalies in large-scale hydrological sensor data. However, they are usually based on distance or classification, which leads to high time complexity. To solve this problem, a detection algorithm called AR-iForest is proposed. It is an algorithm for hydrological time series anomaly detection based on the isolation forest. Firstly, the features of the hydrological data are extracted and mapped to a high-dimensional space. Before the isolation forest is applied in this space, an Auto-Regressive model is used to predict the current value and calculate its confidence interval. Only data that fall outside the confidence interval need to be checked by the isolation forest. Secondly, a measure of the effectiveness of the trees in the isolation forest is proposed. The method iteratively selects the tree with the best classification effect. Finally, the proposed algorithm is integrated into the window of the big data platform Flink for performance evaluation. The experimental results show that the proposed algorithm increases the AUC value from 90.60% to 96.72% and reduces the detection time by 52.23%.
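
The pre-filtering idea in the abstract (predict the current value with an Auto-Regressive model, then inspect only the points outside the confidence interval) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes an AR(1) model fitted by least squares, an interval of plus or minus `z` residual standard deviations, and synthetic data. The names `fit_ar1`, `candidates_outside_interval`, and `simulate` are hypothetical, and the isolation forest stage that would receive the flagged points is omitted.

```python
import random
import statistics


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1].

    Returns (c, phi, residual_std). AR(1) is an assumption made for
    illustration; the abstract does not state the model order.
    """
    prev, curr = series[:-1], series[1:]
    mean_prev = statistics.mean(prev)
    mean_curr = statistics.mean(curr)
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    phi = cov / var_prev
    c = mean_curr - phi * mean_prev
    residuals = [q - (c + phi * p) for p, q in zip(prev, curr)]
    return c, phi, statistics.pstdev(residuals)


def candidates_outside_interval(series, c, phi, sigma, z=3.0):
    """Return indices whose value lies outside the one-step interval.

    Only these indices would be handed to the (omitted) isolation forest.
    """
    flagged = []
    for t in range(1, len(series)):
        predicted = c + phi * series[t - 1]
        if abs(series[t] - predicted) > z * sigma:
            flagged.append(t)
    return flagged


def simulate(n, rng, c=5.0, phi=0.8):
    """Generate a synthetic AR(1) series with unit-variance Gaussian noise."""
    x = [c / (1 - phi)]  # start at the stationary mean
    for _ in range(n - 1):
        x.append(c + phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


rng = random.Random(0)
train = simulate(500, rng)   # history used to fit the model
test = simulate(200, rng)    # new readings to screen
test[100] += 15.0            # inject one artificial spike

c, phi, sigma = fit_ar1(train)
flagged = candidates_outside_interval(test, c, phi, sigma, z=3.0)
print(f"{len(flagged)} of {len(test)} points need further inspection: {flagged}")
```

Because most readings fall inside the interval, only a small fraction of points reach the more expensive detector, which is the kind of saving the abstract attributes to this step. Note that the reading immediately after a spike may also be flagged, since its prediction is computed from the spiked value.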

Acknowledgments

This work is partly supported by the Fundamental Research Funds for the Central Universities under Grant No. B200202185, the 2018 Jiangsu Province Key Research and Development Program (Modern Agriculture) Project under Grant No. BE2018301, the 2017 Jiangsu Province Postdoctoral Research Funding Project under Grant No. 1701020C, the 2017 Six Talent Peaks Endorsement Project of Jiangsu under Grant No. XYDXX-078, and Research on the Analysis System of Hydrological Big Data under Grant No. 818116816.

Author information

Corresponding author

Correspondence to Feng Ye.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Shao, P., Ye, F., Liu, Z., Wang, X., Lu, M., Mao, Y. (2020). Improving iForest for Hydrological Time Series Anomaly Detection. In: Qiu, M. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2020. Lecture Notes in Computer Science, vol 12454. Springer, Cham. https://doi.org/10.1007/978-3-030-60248-2_12
