Dynamic routing of data stream tuples among parallel query plan running on multi-core processors

Distributed and Parallel Databases

Abstract

This paper proposes a method for fast processing of data stream tuples during parallel execution of continuous queries in a multiprocessing environment. A copy of the query plan is assigned to each processing unit in the environment, and the graph formed by these copies is called the Query Mega Graph. Input tuples are routed through this graph dynamically and continuously: after a tuple has been processed by one processing unit (e.g., a processor), the routing step determines the processor to which it should be forwarded next. The next processor is chosen so that, among all alternative processors, it imposes the minimum tuple latency on the tuple. Tuple latency is derived from processing, buffering, and communication delays, which vary across practical parallel systems.

Parallel system architectures suitable as the multiprocessing environment for the proposed Dynamic Tuple Routing (DTR) method are considered and analyzed, and practical challenges and issues in choosing the underlying parallel system are discussed. The desired parallel system is implemented on multi-core systems and used to evaluate the DTR method. Evaluation results show that DTR outperforms similar methods, such as Eddies, in terms of tuple latency, throughput, and tuple loss.
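
The routing rule described in the abstract (forward each tuple to the alternative processor that imposes the minimum tuple latency) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Processor` fields, the additive cost model in `estimated_latency`, and the sample numbers are all assumptions chosen for illustration, since the abstract only states that latency is derived from processing, buffering, and communication delays.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Processor:
    """Hypothetical per-processor statistics; field names are illustrative."""
    name: str
    processing_time: float  # time to apply the next operator to one tuple
    queued_tuples: int      # tuples already waiting in this processor's buffer
    comm_delay: float       # time to ship a tuple from the current processor


def estimated_latency(p: Processor) -> float:
    """Assumed cost model: processing + buffering + communication delay."""
    # Buffering delay is approximated as the time to drain the queue ahead.
    buffering = p.queued_tuples * p.processing_time
    return p.processing_time + buffering + p.comm_delay


def select_next(candidates: Sequence[Processor]) -> Processor:
    """Pick the candidate with the minimum estimated tuple latency."""
    return min(candidates, key=estimated_latency)


# Made-up sample values: P0 is local (no communication) but slow,
# P1 is fast but has a backlog, P2 is fast and idle.
candidates = [
    Processor("P0", processing_time=2.0, queued_tuples=0, comm_delay=0.0),
    Processor("P1", processing_time=1.0, queued_tuples=3, comm_delay=0.5),
    Processor("P2", processing_time=1.0, queued_tuples=0, comm_delay=0.5),
]
best = select_next(candidates)
print(best.name)
```

Under this assumed cost model, a remote but idle processor can be preferred over the local one, which is the kind of trade-off between communication and buffering delay that the abstract describes.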

Notes

  1. Notations are based on predicate logic in the Z notation [40].

  2. Symmetric Multi-Processors.

  3. Asymmetric Multi-Processors.

  4. Instruction Set Architecture.

  5. Application Programming Interface.

References

  1. Safaei, A.A., Haghjoo, M.S.: Parallel processing of data stream query operators. Distrib. Parallel Databases 28, 93–118 (2010). doi:10.1007/s10619-010-7066-3

  2. Safaei, A.A., Haghjoo, M.S.: Dispatching stream operators in parallel execution of continuous queries. J. Supercomput. (2011). doi:10.1007/s11227-011-0621-5

  3. Babcock, B., et al.: Operator scheduling in data stream systems. VLDB J. 13, 333–353 (2004)

  4. Replicate and migrate objects in the runtime, not cache lines or pages in hardware (Invited Plenary Lecture). In: Barcelona Multicore Workshop 2010, Barcelona, Spain, 21–22 Oct. (2010)

  5. El-Rewini, H., Abd-El-Barr, M.: Advanced Computer Architecture and Parallel Processing. Wiley, Hoboken (2005). doi:10.1002/0471478385.index

  6. Feng, T.Y.: A survey of interconnection networks. Computer 14, 12–27 (1981)

  7. Singah, B.: On multistage interconnection network. M.Sc. thesis (2000)

  8. Aljundi, C., Chadi, A., Jundi, A., Dekeyser, J.-l., Scherson, I.D.: An interconnection networks comparative performance evaluation methodology: the case of delta and over-sized delta multistage interconnection networks. In: Proc. of the 16th International Conference on Parallel and Distributed Computing Systems (2003)

  9. Lawrie, D.H.: Access and alignment of data in an array processor. IEEE Trans. Comput. C-24, 1145–1155 (1975)

  10. Thomas, R.H.: Behavior of butterfly parallel processor in the presence of memory hot spots. In: Proc. of the 1986 Int. Conf. Parallel Processing, pp. 46–50 (1986)

  11. Lin, W., et al.: A conflict routing scheme on multistage interconnection networks. IEEE Trans. Comput. 38(8), 1086–1097 (1989)

  12. Tian, H., Katangur, A.K., Yipan, J.Z.: A novel multistage network architecture with multicast and broadcast capability. J. Supercomput. 35, 277–300 (2006)

  13. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. In: Proc. of the 6th OSDI Symp. (2004)

  14. Upadhyaya, P., Kwon, Y., Balazinska, M.: A latency and fault-tolerance optimizer for online parallel query plans. In: Proceedings of the ACM SIGMOD (2011)

  15. Grama, A., Karypis, G., Kumar, V., Gupta, A.: Introduction to Parallel Computing, 2nd edn. Addison-Wesley, Reading (2003)

  16. Avnur, R., Hellerstein, J.M.: Eddies: continuously adaptive query processing. In: Proceedings of the ACM SIGMOD (2000)

  17. The Internet traffic archive, http://ita.ee.lbl.gov/html/contrib/DEC-PKT.html

  18. Chakravarthy, S., Pajjuri, V.: Scheduling strategies and their evaluation in a data stream management system. In: Lecture Notes in Computer Science, vol. 4042. Springer, Berlin (2006)

  19. LeBlanc, T.J.: Shared memory versus message passing in a tightly coupled multiprocessor: a case study. In: Proc. 1986 Int. Conf. Parallel Processing, pp. 463–466 (1986)

  20. Babcock, B., et al.: Chain: operator scheduling for memory minimization in data stream systems. In: Proceedings of the ACM SIGMOD International Conference (2003)

  21. Sharaf, M.A.: Preemptive rate-based operator scheduling in a data stream management system. In: IEEE/AICCSA (2005)

  22. Soliman, M.S., Tan, G.: Operator-scheduling using dynamic chain for continuous-query processing. In: IEEE Int. Conference on Computer Science and Software Engineering (2008)

  23. Sharaf, M.A., et al.: Scheduling continuous queries in data stream management systems. In: PVLDB (2008)

  24. Carney, D., et al.: Operator scheduling in a data stream manager. In: Proceedings of the 29th International Conference on Very Large Data Bases, Germany, pp. 838–849 (2003)

  25. Ghalambor, M., Safaei, A.A., Azgomi, M.A.: DSMS scheduling regarding complex QoS metrics. In: IEEE/ACS International Conference on Computer Systems and Applications (AICCSA), 10–13 May (2009)

  26. Babu, S., Srivastava, U., Widom, J.: Exploiting k-constraints to reduce memory overhead in continuous queries over data streams. Technical report, November 2002

  27. Graefe, G., et al.: Extensible query optimization and parallel execution in Volcano. In: Query Processing for Advanced Database Systems. Morgan Kaufmann, San Mateo (1994)

  28. DeWitt, D.J., Gray, J.: Parallel database systems: the future of high performance database processing. Commun. ACM 35(6), 85–98 (1992)

  29. Graefe, G.: Volcano—an extensible and parallel query evaluation system. IEEE Trans. Knowl. Data Eng. 6(1), 120–135 (1994)

  30. Apers, P.M.G., et al.: PRISMA/DB: a parallel, main memory relational DBMS. IEEE Trans. Knowl. Data Eng. 4(6), 541–554 (1992)

  31. Graefe, G.: Query evaluation techniques for large databases. ACM Comput. Surv. 25, 73–170 (1993)

  32. Abadi, D., et al.: Aurora: a new model and architecture for data stream management. VLDB J. 12(2), 120–139 (2003)

  33. Deshpande, A.: An initial study of overheads of eddies. SIGMOD Rec. 33, 44–49 (2004)

  34. Tian, F., DeWitt, D.J.: Tuple routing strategies for distributed eddies. In: Proceedings of the 29th VLDB (2003)

  35. Osman, A., Ammar, H.: Dynamic load management for distributed continuous query systems. In: Proceedings of the ICDE (2005)

  36. Zhou, Y., et al.: Efficient dynamic operator placement in a locally distributed continuous query system. In: Lecture Notes in Computer Science, vol. 4275 (2006)

  37. Johnson, T., et al.: Query-aware partitioning for monitoring massive network data streams. In: Proceedings of the ACM SIGMOD (2008)

  38. Tian, F., DeWitt, D.J.: Tuple routing strategies for distributed eddies. In: Proceedings of 29th VLDB Conference, September 2003, pp. 333–344 (2003) (ISBN 0-12-722442-4)

  39. Gu, X., et al.: Online failure forecast for fault-tolerant data stream processing. In: Proceedings of the ICDE (2008)

  40. Woodcock, J., Davies, J.: Using Z: Specification, Refinement, and Proof. Prentice-Hall International Series in Computer Science. Prentice-Hall, New York (1996). ISBN: 0-13-948472-8

  41. Babu, S.: Adaptive query processing in data stream management systems. Ph.D. thesis, Stanford University (2005)

  42. Babu, S., Motwani, R., Munagala, K., Nishizawa, I., Widom, J.: Adaptive ordering of pipelined stream filters. In: Proc. SIGMOD Conference, pp. 407–418 (2004)

  43. Das, A., Gehrke, J., Riedewald, M.: Approximate join processing over data streams. In: Proc. SIGMOD Conference, pp. 40–51 (2003)

Author information

Corresponding author

Correspondence to Ali A. Safaei.

Additional information

Communicated by: Mohamed F. Mokbel.

About this article

Cite this article

Safaei, A.A., Sharifrazavian, A., Sharifi, M. et al. Dynamic routing of data stream tuples among parallel query plan running on multi-core processors. Distrib Parallel Databases 30, 145–176 (2012). https://doi.org/10.1007/s10619-012-7090-6

  • DOI: https://doi.org/10.1007/s10619-012-7090-6
