J2M: a Java to MapReduce translator for cloud computing

The Journal of Supercomputing

Abstract

Cloud computing has gradually evolved into an infrastructure for a wide range of scientific research and computing. Many institutions and organizations are migrating their products from local servers to the cloud. One current challenge is how to run software efficiently on cloud platforms: much existing code cannot be executed in parallel in a cloud context, so the power of the cloud is not fully exploited. Redesigning and converting existing sequential code for cloud platforms by hand is costly. Automatic translation from sequential code to cloud code is therefore one direction for resolving the problem of code migration to cloud infrastructure. In this paper, a new Java to MapReduce (J2M) translator is developed to automatically translate sequential Java into cloud code for specific data-parallel code with large loops. The paper details the design of the translator and evaluates its performance through experiments. The experimental results indicate that the translator accurately translates sequential Java into cloud code, and they show that the translated code achieves very good speedup; we expect that almost linear speedup is possible if sufficiently large data sets are processed. We believe that the J2M translator is an ideal prototype for code migration and will play an important role in the transition era of cloud computing.
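To make the idea of translating a large data-parallel loop more concrete, the sketch below contrasts a sequential Java loop with a hand-written map/reduce-style decomposition of the same computation. It is an illustration only, not output of the J2M translator and not Hadoop code: the class name LoopToMapReduceSketch, the method names, and the sum-of-squares workload are all assumptions chosen for the example, and the "map" calls run one after another here, whereas on a cluster each split would be handled by a separate task.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; names and workload are assumptions, not from the paper.
public class LoopToMapReduceSketch {

    // Sequential form: one large loop whose iterations are independent
    // apart from the final accumulation into sum.
    public static long sequentialSumOfSquares(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += (long) data[i] * data[i];
        }
        return sum;
    }

    // "Map" step: processes one contiguous split [from, to) of the input
    // and emits a partial result for that split.
    public static long mapSplit(int[] data, int from, int to) {
        long partial = 0;
        for (int i = from; i < to; i++) {
            partial += (long) data[i] * data[i];
        }
        return partial;
    }

    // "Reduce" step: combines the partial results emitted by the map step.
    public static long reduce(List<Long> partials) {
        long total = 0;
        for (long p : partials) {
            total += p;
        }
        return total;
    }

    // Driver: cuts the iteration space into numSplits chunks (numSplits >= 1),
    // runs the map step on each chunk, then reduces the partial results.
    public static long mapReduceSumOfSquares(int[] data, int numSplits) {
        List<Long> partials = new ArrayList<>();
        int chunk = (data.length + numSplits - 1) / numSplits; // ceiling division
        for (int s = 0; s < numSplits; s++) {
            int from = Math.min(s * chunk, data.length);
            int to = Math.min(from + chunk, data.length);
            partials.add(mapSplit(data, from, to));
        }
        return reduce(partials);
    }

    public static void main(String[] args) {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        // Both forms should agree on the same input.
        System.out.println(sequentialSumOfSquares(data) == mapReduceSumOfSquares(data, 4));
    }
}
```

The decomposition works because addition is associative, so the order in which partial sums are combined does not change the result. A loop whose iterations depend on one another would not split this simply, which is consistent with the abstract's restriction to specific data-parallel code.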

Author information

Corresponding author

Correspondence to Yi Pan.

About this article

Cite this article

Li, B., Zhang, J., Yu, N. et al. J2M: a Java to MapReduce translator for cloud computing. J Supercomput 72, 1928–1945 (2016). https://doi.org/10.1007/s11227-016-1695-x

  • DOI: https://doi.org/10.1007/s11227-016-1695-x
