Query Engine Grid for Executing SQL Streaming Process

Chen, Qiming; Hsu, Meichun

doi:10.1007/978-3-642-22947-3_9

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6864)

Included in the following conference series:

International Conference on Data Management in Grid and P2P Systems

Abstract

Many enterprise applications are based on continuous analytics of data streams. Integrating data-intensive stream processing with query processing allows us to take advantage of SQL’s expressive power and DBMS’s data management capability. However, it also raises serious challenges in dealing with complex dataflow, applying queries to unbounded stream data, and providing highly scalable, dynamically configurable, elastic infrastructure.

In this project we tackle these problems in three dimensions. First, we model the general graph-structured, continuous dataflow analytics as a SQL Streaming Process with multiple connected and stationed continuous queries. Next, we extend the query engine to support cycle-based query execution for processing unbounded stream data in bounded chunks with sound semantics. Finally, we develop the Query Engine Grid (QE-Grid) over the Distributed Caching Platforms (DCP) as a dynamically configurable elastic infrastructure for parallel and distributed execution of SQL Streaming Processes.
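The cycle-based idea of applying a query to an unbounded stream one bounded chunk at a time can be illustrated with a minimal sketch. This is not the authors' engine extension, which is built on PostgreSQL; it is a stand-in that uses Python's built-in sqlite3 module, and the names `run_cycles`, `chunk`, and `cycle_seconds`, as well as the (ts, key, value) record layout, are assumptions made for illustration. It also assumes that records arrive ordered by timestamp.

```python
import sqlite3
from itertools import groupby


def run_cycles(stream, cycle_seconds, query):
    """Apply `query` to each bounded chunk of a timestamp-ordered stream.

    `stream` yields (ts, key, value) tuples; it may be unbounded because
    records are consumed lazily, one cycle at a time.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical per-cycle buffer table that the query reads from.
    conn.execute("CREATE TABLE chunk (ts INTEGER, key TEXT, value REAL)")
    try:
        # Consecutive records with the same cycle number form one chunk.
        for cycle, records in groupby(stream, key=lambda r: r[0] // cycle_seconds):
            conn.execute("DELETE FROM chunk")  # start the cycle with an empty buffer
            conn.executemany("INSERT INTO chunk VALUES (?, ?, ?)", records)
            yield cycle, conn.execute(query).fetchall()
    finally:
        conn.close()


events = [(0, "a", 1.0), (30, "b", 2.0), (59, "a", 3.0), (60, "a", 4.0), (125, "b", 5.0)]
query = "SELECT key, SUM(value) FROM chunk GROUP BY key ORDER BY key"
results = list(run_cycles(iter(events), 60, query))
# results == [(0, [('a', 4.0), ('b', 2.0)]), (1, [('a', 4.0)]), (2, [('b', 5.0)])]
```

The same query text runs unchanged in every cycle and only the contents of the buffer change, which is one simple way to read "processing unbounded stream data in bounded chunks". State carried across cycles and the distribution of queries over a grid are outside the scope of this sketch.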

A preliminary implementation of the proposed infrastructure uses PostgreSQL engines. Our experience shows its merit in leveraging SQL and query engines to analyze real-time, graph-structured, unbounded streams. Integration with a commercial, proprietary MPP-based database cluster is under investigation.

Author information

Authors and Affiliations

HP Labs, Palo Alto, California, USA
Qiming Chen & Meichun Hsu
Hewlett Packard Co.
Qiming Chen & Meichun Hsu

Editor information

Editors and Affiliations

Institut de Recherche en Informatique de Toulouse (IRIT), Paul Sabatier University, 118, route de Narbonne, 31062, Toulouse Cedex, France
Abdelkader Hameurlain
Institut für Softwaretechnik, Technische Universität Wien, Favoritenstr. 9-11/188, 1040, Wien, Austria
A Min Tjoa

About this paper

Cite this paper

Chen, Q., Hsu, M. (2011). Query Engine Grid for Executing SQL Streaming Process. In: Hameurlain, A., Tjoa, A.M. (eds) Data Management in Grid and Peer-to-Peer Systems. Globe 2011. Lecture Notes in Computer Science, vol 6864. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22947-3_9

DOI: https://doi.org/10.1007/978-3-642-22947-3_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22946-6
Online ISBN: 978-3-642-22947-3
eBook Packages: Computer Science, Computer Science (R0)
