
GMR: graph-compatible MapReduce programming model

Multimedia Tools and Applications

A Correction to this article was published on 17 October 2017

This article has been updated

Abstract

The MapReduce programming model is widely used to parallelize data processing across large clusters of commodity computers. However, because of its uniform data representation, it cannot express graph-parallel algorithms naturally or execute them efficiently. Pregel and PowerGraph address these challenges, but they require users to learn another set of programming patterns and platforms, and legacy MapReduce code becomes incompatible with them. In this paper, we propose Graph-compatible MapReduce (GMR) as an extension of Google’s Standard MapReduce (SMR). With GMR, graph-parallel algorithms can be expressed naturally without compromising efficiency or simplicity, while the conventional MapReduce programming pattern is preserved. Users also gain the convenience of “Think like a vertex”. Through experimental study, we analyze the proportion of redundant computation, transmission, and data caching introduced by naive iterative MapReduce platforms (e.g., HaLoop, Twister). Furthermore, we discuss the differences between GMR and graph-targeted frameworks. The evaluation results show that GMR outperforms GraphX on a series of real-world graph-parallel algorithms.
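To make the redundancy mentioned above concrete, the following is a minimal sketch of one common way to write PageRank as repeated map and reduce passes. It is plain Python simulating the two phases in memory, not GMR, not code from the article, and not the API of any real MapReduce runtime; the function names `map_phase`, `reduce_phase`, and `pagerank` and the damping value are assumptions chosen for illustration. The point it shows is that the adjacency lists, which never change, must be re-emitted and regrouped in every iteration so that the next iteration still has the graph.

```python
from collections import defaultdict


def map_phase(graph, ranks):
    """Emit (key, value) pairs for one PageRank iteration (illustrative only)."""
    for vertex, neighbors in graph.items():
        # The static adjacency list is sent through the shuffle again,
        # otherwise the reduce output would no longer contain the graph.
        yield vertex, ("structure", neighbors)
        for neighbor in neighbors:
            # Each vertex splits its current rank evenly among its out-edges.
            yield neighbor, ("rank", ranks[vertex] / len(neighbors))


def reduce_phase(pairs, num_vertices, damping=0.85):
    """Group pairs by key, then rebuild both the graph and the new ranks."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)

    graph, ranks = {}, {}
    for vertex, values in grouped.items():
        graph[vertex] = []
        total = 0.0
        for tag, payload in values:
            if tag == "structure":
                graph[vertex] = payload  # redundant: unchanged since last pass
            else:
                total += payload
        ranks[vertex] = (1 - damping) / num_vertices + damping * total
    return graph, ranks


def pagerank(graph, iterations=20):
    """Run a fixed number of map/reduce passes over an adjacency-list graph."""
    n = len(graph)
    ranks = {vertex: 1.0 / n for vertex in graph}
    for _ in range(iterations):
        graph, ranks = reduce_phase(map_phase(graph, ranks), n)
    return ranks
```

On a three-vertex cycle, for instance, `map_phase` emits six records per iteration, three of which carry only adjacency lists. A vertex-centric model avoids this by keeping the structure resident with each vertex and exchanging only the rank messages.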



Change history

  • 17 October 2017

    In the original publication, Fig. 12 was incorrectly presented. The plot line and legends of Fig. 12a, c, e and f should not overlap. The original article was corrected.


Acknowledgments

We thank the Institute of Process Engineering, Chinese Academy of Sciences, for their help. This research was supported by the Zhejiang Engineering Research Center of Intelligent Medicine (2016E10011) and by the research and application of key technologies for rapid individualized sculpture manufacture and carving stone materials appraisal.

Author information

Corresponding author

Correspondence to Weidong Zhang.

Additional information

The original version of this article was revised: The plot line and legends of Fig. 12a, c, e and f should not overlap.


About this article


Cite this article

Zhang, W., He, B., Chen, Y. et al. GMR: graph-compatible MapReduce programming model. Multimed Tools Appl 78, 457–475 (2019). https://doi.org/10.1007/s11042-017-5102-2


  • DOI: https://doi.org/10.1007/s11042-017-5102-2
