Non-linear Data Stream Compression: Foundations and Theoretical Results

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7208)

Abstract

In this paper, we provide foundations and theoretical results for a novel paradigm that supports data stream mining algorithms effectively and efficiently: the so-called non-linear data stream compression model. In particular, the proposed model targets the class of data stream mining applications in which interesting knowledge is extracted from data streams via suitable collections of OLAP queries, the latter being the baseline operations of complex knowledge discovery tasks over data streams implemented by ad-hoc data stream mining algorithms. Here, a fruitful line of research consists in admitting approximate, i.e. compressed, representation models and query/mining results in exchange for more efficient and faster computation. Building on this assumption, the proposed non-linear data stream compression model pursues the idea of maintaining a lower degree of approximation (and, as a consequence, a lower query error) for aggregate information on those data stream readings related to interesting events and, by contrast, a higher degree of approximation (and, as a consequence, a higher query error) for aggregate information on other data stream readings, i.e. readings not related to any particular event or related to events of low interest.
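
The idea of event-dependent approximation can be illustrated with a toy sketch. The code below is not the model defined in the paper. It is a minimal illustration under stated assumptions: readings are summarized into SUM buckets, a hypothetical predicate `is_interesting` marks readings tied to interesting events, and the hypothetical parameters `fine` and `coarse` set the bucket widths for the two classes. Range-SUM queries are then answered from the buckets by assuming values are spread uniformly inside each bucket.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Bucket:
    start: int    # index of the first reading covered (inclusive)
    end: int      # index one past the last reading covered (exclusive)
    total: float  # SUM of the readings in [start, end)


def compress(readings: Sequence[float],
             is_interesting: Callable[[float], bool],
             fine: int = 1,
             coarse: int = 4) -> List[Bucket]:
    """Summarize readings into SUM buckets whose width depends on the event class.

    Illustrative assumption: interesting readings get narrow buckets (width
    `fine`), all other readings get wide buckets (width `coarse`).
    """
    buckets: List[Bucket] = []
    i, n = 0, len(readings)
    while i < n:
        interesting = is_interesting(readings[i])
        width = fine if interesting else coarse
        j = i
        # Grow the bucket until it is full or the event class changes.
        while j < n and j - i < width and is_interesting(readings[j]) == interesting:
            j += 1
        buckets.append(Bucket(i, j, float(sum(readings[i:j]))))
        i = j
    return buckets


def range_sum(buckets: List[Bucket], lo: int, hi: int) -> float:
    """Approximate SUM over [lo, hi), assuming uniform spread inside each bucket."""
    answer = 0.0
    for b in buckets:
        overlap = min(hi, b.end) - max(lo, b.start)
        if overlap > 0:
            answer += b.total * overlap / (b.end - b.start)
    return answer


readings = [1, 2, 3, 4, 50, 60, 1, 1, 1, 1]
buckets = compress(readings, is_interesting=lambda v: v >= 10)

exact_hot = sum(readings[4:6])            # interesting range
approx_hot = range_sum(buckets, 4, 6)     # answered from narrow buckets
exact_cold = sum(readings[0:2])           # uninteresting range
approx_cold = range_sum(buckets, 0, 2)    # answered from one wide bucket
```

In this toy setting the query over the interesting range is answered exactly, while the query over the uninteresting range incurs an error because four readings were collapsed into a single aggregate. Ten readings are stored as four buckets, so the space saving comes entirely from the low-interest portion of the stream.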

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cuzzocrea, A., Decker, H. (2012). Non-linear Data Stream Compression: Foundations and Theoretical Results. In: Corchado, E., Snášel, V., Abraham, A., Woźniak, M., Graña, M., Cho, SB. (eds) Hybrid Artificial Intelligent Systems. HAIS 2012. Lecture Notes in Computer Science, vol 7208. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28942-2_56

  • DOI: https://doi.org/10.1007/978-3-642-28942-2_56

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28941-5

  • Online ISBN: 978-3-642-28942-2

  • eBook Packages: Computer Science, Computer Science (R0)
