Query Engine Grid for Executing SQL Streaming Process

  • Conference paper
Data Management in Grid and Peer-to-Peer Systems (Globe 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6864)

Abstract

Many enterprise applications are based on continuous analytics of data streams. Integrating data-intensive stream processing with query processing allows us to take advantage of SQL's expressive power and the DBMS's data management capabilities. However, it also raises serious challenges: handling complex dataflow, applying queries to unbounded stream data, and providing a highly scalable, dynamically configurable, elastic infrastructure.

In this project we tackle these problems along three dimensions. First, we model general graph-structured, continuous dataflow analytics as a SQL Streaming Process with multiple connected and stationed continuous queries. Next, we extend the query engine to support cycle-based query execution, which processes unbounded stream data in bounded chunks with sound semantics. Finally, we develop the Query Engine Grid (QE-Grid) over Distributed Caching Platforms (DCP) as a dynamically configurable, elastic infrastructure for parallel and distributed execution of SQL Streaming Processes.
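The idea of cycle-based execution (cutting an unbounded stream into bounded chunks and applying the same query to each chunk) can be illustrated with a minimal sketch. This is not the authors' PostgreSQL extension: it assumes a time-ordered stream of (timestamp, key, value) tuples and a fixed cycle length, and it uses SQLite from the Python standard library as a stand-in engine. The names `run_cycles`, `chunk`, and `readings` are hypothetical.

```python
import sqlite3
from itertools import groupby


def run_cycles(stream, cycle_seconds, query):
    """Apply `query` once per bounded chunk of a time-ordered stream.

    `stream` yields (timestamp_seconds, key, value) tuples in timestamp
    order (an assumption of this sketch). Yields (cycle_index, rows).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE chunk (ts INTEGER, key TEXT, value REAL)")
    # groupby lazily cuts the ordered stream at cycle boundaries,
    # so only one chunk is materialized at a time.
    for cycle, tuples in groupby(stream, key=lambda t: t[0] // cycle_seconds):
        conn.execute("DELETE FROM chunk")  # start each cycle with an empty chunk
        conn.executemany("INSERT INTO chunk VALUES (?, ?, ?)", tuples)
        yield cycle, conn.execute(query).fetchall()
    conn.close()


# Illustrative data: five readings spanning two 60-second cycles.
readings = [
    (0, "a", 1.0), (30, "b", 2.0), (59, "a", 3.0),
    (60, "a", 10.0), (119, "b", 5.0),
]
query = "SELECT key, SUM(value) FROM chunk GROUP BY key ORDER BY key"
results = list(run_cycles(iter(readings), 60, query))
```

Each cycle returns an ordinary, finite query result, which is what lets standard SQL aggregates apply to data that never ends. The sketch reloads a scratch table per cycle for simplicity; how the extended engine carries state across cycles is not described in the abstract and is not modeled here.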

A preliminary implementation of the proposed infrastructure uses PostgreSQL engines. Our experience shows its merit in leveraging SQL and query engines to analyze real-time, graph-structured, unbounded streams. Its integration with a proprietary commercial MPP-based database cluster is under investigation.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, Q., Hsu, M. (2011). Query Engine Grid for Executing SQL Streaming Process. In: Hameurlain, A., Tjoa, A.M. (eds) Data Management in Grid and Peer-to-Peer Systems. Globe 2011. Lecture Notes in Computer Science, vol 6864. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22947-3_9

  • DOI: https://doi.org/10.1007/978-3-642-22947-3_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22946-6

  • Online ISBN: 978-3-642-22947-3

  • eBook Packages: Computer Science, Computer Science (R0)
