Abstract
Cloud computing has gradually evolved into an infrastructural tool for a wide range of scientific research and computing. Many institutions and organizations are migrating their products from local servers to the cloud. One current challenge is running software efficiently on cloud platforms: much existing code cannot be executed in parallel in a cloud context, so the power of the cloud is not fully exploited. Redesigning and converting existing sequential code for cloud platforms by hand is costly. Automatic translation from sequential code to cloud code is therefore one direction for resolving the problem of code migration to cloud infrastructure. In this paper, a new Java to MapReduce (J2M) translator is developed to automatically translate sequential Java into cloud code for specific data-parallel code with large loops. This paper details the design of the translator and evaluates its performance through experiments. The experimental results indicate not only that the translator can precisely translate sequential Java into cloud code, but also that it achieves very good speedup; we expect that almost linear speedup is possible if sufficiently large data is processed. We believe that the J2M translator is an ideal prototype for code migration and will play an important role in the transition era of cloud computing.
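To make the idea of translating a data-parallel loop more concrete, the following is a minimal, hand-written sketch in plain Java. It is not the output of J2M and does not use the Hadoop API; the class name `LoopToMapReduceSketch`, the method names, and the sum-of-squares workload are all hypothetical choices made for illustration. It only shows how a loop with independent iterations might be restructured into a map phase over splits of the iteration space and a reduce phase that combines the partial results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a hypothetical loop and a MapReduce-style rewrite of it.
public class LoopToMapReduceSketch {

    // Sequential form: one large loop whose iterations do not depend on each other.
    public static long sequentialSumOfSquares(int[] data) {
        long total = 0;
        for (int i = 0; i < data.length; i++) {
            total += (long) data[i] * data[i];
        }
        return total;
    }

    // Map phase: each split of the iteration space is processed independently
    // and emits one (key, partialSum) pair. A single key (0) is assumed here
    // because the loop produces one scalar result.
    static Map<Integer, List<Long>> mapPhase(int[] data, int splits) {
        if (splits < 1) {
            throw new IllegalArgumentException("splits must be >= 1");
        }
        Map<Integer, List<Long>> grouped = new TreeMap<>();
        int chunk = (data.length + splits - 1) / splits; // ceiling division
        for (int s = 0; s < splits; s++) {
            int start = s * chunk;
            int end = Math.min(data.length, start + chunk);
            long partial = 0;
            for (int i = start; i < end; i++) {
                partial += (long) data[i] * data[i];
            }
            grouped.computeIfAbsent(0, k -> new ArrayList<>()).add(partial);
        }
        return grouped;
    }

    // Reduce phase: combine all partial values that share a key.
    static long reducePhase(List<Long> partials) {
        long total = 0;
        for (long p : partials) {
            total += p;
        }
        return total;
    }

    // MapReduce-style form of the same computation.
    public static long mapReduceSumOfSquares(int[] data, int splits) {
        Map<Integer, List<Long>> grouped = mapPhase(data, splits);
        return reducePhase(grouped.getOrDefault(0, new ArrayList<>()));
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println(sequentialSumOfSquares(data));   // prints 385
        System.out.println(mapReduceSumOfSquares(data, 3)); // prints 385
    }
}
```

In this sketch the splits are processed one after another in a single process; on a real MapReduce platform each split would be handled by a separate map task, which is where the speedup on large inputs would come from.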
Cite this article
Li, B., Zhang, J., Yu, N. et al. J2M: a Java to MapReduce translator for cloud computing. J Supercomput 72, 1928–1945 (2016). https://doi.org/10.1007/s11227-016-1695-x
DOI: https://doi.org/10.1007/s11227-016-1695-x