Outlier Detection in Data Streams Using OLAP Cubes

  • Conference paper
New Trends in Databases and Information Systems (ADBIS 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 767)

Abstract

Outlier detection is an important tool in many application areas. Data often has a multidimensional structure and can therefore be viewed as OLAP cubes. Exploiting this structure systematically helps to find outliers that would otherwise go undetected. In this paper, we propose an approach that treats streaming data as a series of OLAP cubes. We then use a model of the cube’s expected behavior, calculated offline, to find outliers in the data stream. Furthermore, we aggregate multiple outliers found concurrently in different cells of the cube to a user-defined level of the cube. To demonstrate the usefulness of our method, we apply it to network data to find attacks in the data stream.
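
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the paper's actual model: the per-cell z-score test, the threshold of 3, the two dimensions (source address and service), the sample values, and all function names are assumptions chosen for illustration only.

```python
from collections import defaultdict
from statistics import mean, stdev


def build_cube(records):
    """Aggregate one time window of (source, service, value) records into cube cells."""
    cube = defaultdict(float)
    for source, service, value in records:
        cube[(source, service)] += value
    return dict(cube)


def fit_model(history):
    """Offline step (assumed): per-cell mean and standard deviation over past cubes."""
    cells = {cell for cube in history for cell in cube}
    model = {}
    for cell in cells:
        values = [cube.get(cell, 0.0) for cube in history]
        model[cell] = (mean(values), stdev(values))
    return model


def find_outliers(cube, model, threshold=3.0):
    """Online step (assumed): flag cells whose value deviates strongly from the model."""
    outliers = []
    for cell in set(model) | set(cube):
        # Cells never seen offline are modeled as constantly zero.
        mu, sigma = model.get(cell, (0.0, 0.0))
        value = cube.get(cell, 0.0)
        if sigma == 0:
            deviates = value != mu
        else:
            deviates = abs(value - mu) / sigma > threshold
        if deviates:
            outliers.append(cell)
    return sorted(outliers)


def roll_up(outliers, keep_dims):
    """Aggregate outlier cells to a coarser level by keeping only some dimensions."""
    counts = defaultdict(int)
    for cell in outliers:
        counts[tuple(cell[i] for i in keep_dims)] += 1
    return dict(counts)


# Hypothetical history of three windows used to fit the offline model.
history = [
    build_cube([("10.0.0.1", "http", 100), ("10.0.0.1", "ssh", 10), ("10.0.0.2", "http", 50)]),
    build_cube([("10.0.0.1", "http", 110), ("10.0.0.1", "ssh", 12), ("10.0.0.2", "http", 55)]),
    build_cube([("10.0.0.1", "http", 90), ("10.0.0.1", "ssh", 8), ("10.0.0.2", "http", 45)]),
]
model = fit_model(history)

# Hypothetical new window arriving from the stream.
current = build_cube([
    ("10.0.0.1", "http", 105), ("10.0.0.1", "ssh", 400),
    ("10.0.0.2", "http", 52), ("10.0.0.2", "ssh", 300),
])
outliers = find_outliers(current, model)
summary = roll_up(outliers, keep_dims=(1,))  # roll up to the service level
print(outliers)
print(summary)
```

In this toy data, two cells with unusual ssh traffic are flagged independently, and rolling them up to the service dimension reports them as a single finding for ssh rather than as two unrelated alerts.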

This work was partly developed in the project IQM4HD (reference number: 01IS15053B). IQM4HD is partly funded by the German Federal Ministry of Education and Research (BMBF) within the research program KMU Innovativ.

Notes

  1. See https://www.ll.mit.edu/ideval/data/1998data.html.

  2. A mobile network cell, not a cube cell.

Author information

Corresponding author

Correspondence to Felix Heine.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Heine, F. (2017). Outlier Detection in Data Streams Using OLAP Cubes. In: Kirikova, M., et al. New Trends in Databases and Information Systems. ADBIS 2017. Communications in Computer and Information Science, vol 767. Springer, Cham. https://doi.org/10.1007/978-3-319-67162-8_4

  • DOI: https://doi.org/10.1007/978-3-319-67162-8_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67161-1

  • Online ISBN: 978-3-319-67162-8

  • eBook Packages: Computer Science, Computer Science (R0)
