Unbounded Barrier-Synchronized Concurrent ASMs for Effective MapReduce Processing on Streams

  • Conference paper
Rigorous State-Based Methods (ABZ 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12709)

Abstract

MapReduce supports the parallel processing of large data sets. It has been shown that MapReduce is an example of the use of the bulk synchronous parallel (BSP) bridging model, a model for parallel computation on a fixed set of processors comprising alternating computation and communication phases. In this article we extend the normal execution of MapReduce from processing large finite data sets to processing stream queries whose input data stream is assumed to continue indefinitely. We classify stream queries into three classes (memoryless, semi-memoryless and memorable) and provide a model for each class using MapReduce based on BSP. In addition, as some stream queries require large amounts of computing resources, the BSP computation model is extended to a model with unboundedly many agents that preserves the barrier synchronization. A behavioral theory is developed for this model, extending the behavioral theory of the BSP model. It comprises an axiomatization, the definition of Infinite-Agent BSP abstract state machines (Inf-Ag-BSP-ASM), and the proof that such ASMs capture the unbounded synchronized computations. Finally, we show how MapReduce processing can be further improved on the basis of the unbounded extension.
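As a rough illustration of the BSP view of MapReduce described in the abstract, the following sketch runs a word count over a fixed set of workers. Each worker performs a local computation phase (map), a communication phase (routing keys to their owner), and then waits at a barrier before the next computation phase (reduce). This is not the paper's formal model: the names `bsp_word_count` and `partition`, the word-count task, and the use of Python threads are assumptions made purely for illustration.

```python
import threading
from collections import Counter


def partition(word, n):
    # Deterministic owner of a key (hypothetical choice; any fixed hash works).
    return sum(word.encode("utf-8")) % n


def bsp_word_count(chunks):
    """Word count as two computation phases separated by one barrier."""
    n = len(chunks)
    if n == 0:
        return {}
    barrier = threading.Barrier(n)
    # outbox[src][dest] holds the partial counts sent from src to dest.
    outbox = [[Counter() for _ in range(n)] for _ in range(n)]
    results = [Counter() for _ in range(n)]

    def worker(i):
        # Computation phase: map over the local chunk only.
        local = Counter(chunks[i].split())
        # Communication phase: route every key to the worker that owns it.
        for word, count in local.items():
            outbox[i][partition(word, n)][word] += count
        # Barrier: no worker reduces before all workers have finished sending.
        barrier.wait()
        # Next computation phase: reduce everything addressed to this worker.
        for src in range(n):
            results[i].update(outbox[src][i])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Collect the disjoint per-worker results into one dictionary.
    total = Counter()
    for r in results:
        total.update(r)
    return dict(total)
```

Here the number of workers is fixed by the number of chunks, which matches the fixed-processor assumption of plain BSP; the unbounded extension discussed in the abstract is not modelled by this sketch.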

Notes

  1. The concatenation (\(\bigodot \)) used here is not the same as the common concatenation denoted by \(\sum \). It works more like an aggregation, and its actual functionality varies between scenarios, but we still use the term concatenation to be consistent with [12].

  2. The \(\varTheta \)-class is the intersection of the O-class and the \(\varOmega \)-class, which provides an asymptotically tight bound for functions.

  3. In theory, we can assume that the number is countably infinite, provided we restrict the model such that only finitely many of them will be simultaneously active.

  4. The definitions of the notations used here can be found in [9, Def. 2.2].

  5. Note that it still has to be ensured that an agent leaving the computation does so only after completing its step. This, however, has to be ensured by the specification of the programs of the agents.
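The relationship stated in note 2 can be written out with the standard textbook definitions of the asymptotic classes. The symbols \(f\), \(g\), \(c_1\), \(c_2\) and \(n_0\) are introduced here for illustration and are not taken from the paper:

```latex
\[
\varTheta(g) = O(g) \cap \varOmega(g), \qquad
f \in \varTheta(g) \iff \exists c_1, c_2 > 0\ \exists n_0\ \forall n \ge n_0:\;
c_1\, g(n) \le f(n) \le c_2\, g(n).
\]
```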

References

  1. Blass, A., Gurevich, Y.: Abstract state machines capture parallel algorithms. ACM Trans. Comput. Logic 4(4), 578–651 (2003)

  2. Blass, A., Gurevich, Y.: Abstract state machines capture parallel algorithms: correction and extension. ACM Trans. Comput. Logic 9(3), 1–32 (2008)

  3. Börger, E., Schewe, K.-D.: Concurrent abstract state machines. Acta Inf. 53(5), 469–492 (2015). https://doi.org/10.1007/s00236-015-0249-7

  4. Börger, E., Schewe, K.-D.: A behavioural theory of recursive algorithms. Fundam. Inf. 177(1), 1–37 (2020)

  5. Costa, V.G., Marín, M.: A parallel search engine with BSP. In: Third Latin American Web Congress (LA-Web 2005), pp. 259–268. IEEE Computer Society (2005). https://doi.org/10.1109/LAWEB.2005.7

  6. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. In: Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation, OSDI 2004, vol. 6, p. 10. USENIX Association (2004). http://dl.acm.org/citation.cfm?id=1251254.1251264

  7. Dershowitz, N., Falkovich-Derzhavetz, E.: On the parallel computation thesis. Logic J. IGPL 24(3), 346–374 (2016). https://doi.org/10.1093/jigpal/jzw008

  8. Ferrarotti, F., Schewe, K.-D., Tec, L., Wang, Q.: A new thesis concerning synchronised parallel computing - simplified parallel ASM thesis. Theor. Comput. Sci. 649, 25–53 (2016). https://doi.org/10.1016/j.tcs.2016.08.013

  9. Ferrarotti, F., González, S., Schewe, K.-D.: BSP abstract state machines capture bulk synchronous parallel computations. Sci. Comput. Program. 184, 102319 (2019). https://doi.org/10.1016/j.scico.2019.102319

  10. Gava, F., Pommereau, F., Guedj, M.: A BSP algorithm for on-the-fly checking CTL* formulas on security protocols. J. Supercomput. 69(2), 629–672 (2014). https://doi.org/10.1007/s11227-014-1099-8

  11. Gurevich, Y.: Sequential abstract-state machines capture sequential algorithms. ACM Trans. Comput. Logic 1(1), 77–111 (2000). https://doi.org/10.1145/343369.343384

  12. Gurevich, Y., Leinders, D., Van den Bussche, J.: A theory of stream queries. In: Arenas, M., Schwartzbach, M.I. (eds.) DBPL 2007. LNCS, vol. 4797, pp. 153–168. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-75987-4_11

  13. Inda, M.A., Bisseling, R.H.: A simple and efficient parallel FFT algorithm using the BSP model. Parallel Comput. 27(14), 1847–1878 (2001)

  14. Pace, M.F.: BSP vs. MapReduce. In: Ali, H.H., et al. (eds.) Proceedings of the International Conference on Computational Science (ICCS 2012). Procedia Computer Science, vol. 9, pp. 246–255. Elsevier (2012)

  15. Schewe, K.-D., Wang, Q.: A simplified parallel ASM thesis. In: Derrick, J., et al. (eds.) ABZ 2012. LNCS, vol. 7316, pp. 341–344. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-30885-7_27

  16. Valiant, L.G.: A bridging model for parallel computation. Commun. ACM 33(8), 103–111 (1990). https://doi.org/10.1145/79173.79181

Author information

Corresponding author

Correspondence to Klaus-Dieter Schewe.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, Z., He, S., Du, Y., González, S., Schewe, K.-D. (2021). Unbounded Barrier-Synchronized Concurrent ASMs for Effective MapReduce Processing on Streams. In: Raschke, A., Méry, D. (eds) Rigorous State-Based Methods. ABZ 2021. Lecture Notes in Computer Science, vol 12709. Springer, Cham. https://doi.org/10.1007/978-3-030-77543-8_1

  • DOI: https://doi.org/10.1007/978-3-030-77543-8_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-77542-1

  • Online ISBN: 978-3-030-77543-8

  • eBook Packages: Computer Science, Computer Science (R0)
