Abstract
In this paper, we provide the foundations and theoretical results of a novel paradigm for supporting data stream mining algorithms effectively and efficiently: the so-called non-linear data stream compression model. In particular, the proposed model targets the class of data stream mining applications in which interesting knowledge is extracted from data streams via suitable collections of OLAP queries, the latter being the baseline operations of complex knowledge discovery tasks over data streams implemented by ad-hoc data stream mining algorithms. Here, a successful line of research consists in admitting approximate, i.e. compressed, representation models and query/mining results in exchange for more efficient and faster computation. Building on this main assumption, the proposed non-linear data stream compression model pursues the idea of maintaining a lower degree of approximation (and, consequently, a lower query error) for aggregate information on those data stream readings related to interesting events and, by contrast, a higher degree of approximation (and, consequently, a higher query error) for aggregate information on other data stream readings, i.e. readings not related to any particular event, or related to events of low interest.
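The core idea of the abstract, finer summaries for event-related readings and coarser summaries elsewhere, can be illustrated with a minimal sketch. This is not the model defined in the paper: the `Bucket` type, the `compress` and `range_sum` functions, the `fine` and `coarse` bucket widths, and the `is_interesting` predicate are all hypothetical names chosen for illustration, and the uniform-spread assumption used to answer range-SUM queries is a common simplification rather than something the abstract states.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Bucket:
    start: int    # index of the first reading covered (inclusive)
    end: int      # index one past the last reading covered (exclusive)
    total: float  # exact sum of the readings in [start, end)


def compress(readings: Sequence[float],
             is_interesting: Callable[[int, float], bool],
             fine: int = 1,
             coarse: int = 4) -> List[Bucket]:
    """Summarize readings into sum buckets of non-uniform width.

    Readings flagged as interesting go into buckets of at most `fine`
    readings; all other readings go into buckets of at most `coarse`
    readings. A bucket never mixes interesting and uninteresting readings.
    """
    buckets: List[Bucket] = []
    n = len(readings)
    i = 0
    while i < n:
        flag = is_interesting(i, readings[i])
        width = fine if flag else coarse
        j = i
        total = 0.0
        # Extend the bucket while it has room and the flag does not change.
        while j < n and j - i < width and is_interesting(j, readings[j]) == flag:
            total += readings[j]
            j += 1
        buckets.append(Bucket(i, j, total))
        i = j
    return buckets


def range_sum(buckets: Sequence[Bucket], lo: int, hi: int) -> float:
    """Approximate SUM over readings[lo:hi].

    Assumes (as a simplification) that each bucket's total is spread
    uniformly over the readings it covers.
    """
    answer = 0.0
    for b in buckets:
        overlap = min(hi, b.end) - max(lo, b.start)
        if overlap > 0:
            answer += b.total * overlap / (b.end - b.start)
    return answer
```

With `fine=1`, a range query that falls entirely on interesting readings is answered exactly, while a query that cuts through a coarse bucket is answered by interpolation and may carry an error. Choosing the two widths is the trade-off between space and accuracy that the abstract describes.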
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cuzzocrea, A., Decker, H. (2012). Non-linear Data Stream Compression: Foundations and Theoretical Results. In: Corchado, E., Snášel, V., Abraham, A., Woźniak, M., Graña, M., Cho, SB. (eds) Hybrid Artificial Intelligent Systems. HAIS 2012. Lecture Notes in Computer Science, vol 7208. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28942-2_56
DOI: https://doi.org/10.1007/978-3-642-28942-2_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28941-5
Online ISBN: 978-3-642-28942-2
eBook Packages: Computer Science, Computer Science (R0)