Implementing MapReduce Applications in Dynamic Cloud Environments

Cloud Computing

Part of the book series: Computer Communications and Networks (CCN)

Abstract

MapReduce is one of the most popular programming models for parallel data processing in Cloud environments. Standard MapReduce implementations are based on centralized master-slave architectures that do not cope well with dynamic Cloud environments in which nodes may join and leave the network at high rates. In this chapter, we describe P2P-MapReduce, a framework that exploits a peer-to-peer (P2P) model to manage intermittent node participation, master failures, and MapReduce job recovery in a decentralized but effective way. Specifically, the chapter describes the P2P-MapReduce architecture, mechanisms, and implementation and provides an evaluation of its performance. The performance results confirm that P2P-MapReduce provides a higher level of fault tolerance than a centralized implementation of MapReduce.

Author information

Corresponding author

Correspondence to Fabrizio Marozzo.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Marozzo, F., Talia, D., Trunfio, P. (2017). Implementing MapReduce Applications in Dynamic Cloud Environments. In: Antonopoulos, N., Gillam, L. (eds) Cloud Computing. Computer Communications and Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-54645-2_8

  • DOI: https://doi.org/10.1007/978-3-319-54645-2_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-54644-5

  • Online ISBN: 978-3-319-54645-2

  • eBook Packages: Computer Science (R0)
