S2MART: Smart Sql to Map-Reduce Translators

Gowraj, Narayan; Ravi, Prasanna Venkatesh; V, Mouniga; Sumalatha, M. R.

doi:10.1007/978-3-642-37401-2_56


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7808)

Included in the following conference series:

Asia-Pacific Web Conference

Abstract

With the rapid growth of data in large cluster systems and the shift to big data in major data-intensive organizations and applications, efficient and flexible SQL to Map-Reduce translators are needed to make terabytes to petabytes of data easy to access and retrieve, because conventional SQL-based data processing has limited scalability at this scale. In this paper we propose a Smart Sql to Map-Reduce Translator (S2MART), which transforms SQL queries into Map-Reduce jobs. It incorporates intra-query correlation to minimize redundant operations, and sub-query generation and a spiral modeled database to reduce data transfer cost and network transfer cost. S2MART also applies the concept of database views to make parallelization of big data easy and streamlined. This paper gives a comprehensive study of the features of S2MART and compares the performance and correctness of our system with two widely used SQL to Map-Reduce translators, Hive and Pig.
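As a rough illustration of what translating an SQL query into a Map-Reduce job means in general, the sketch below runs a single GROUP BY aggregate as a map phase, a shuffle, and a reduce phase in plain Python. It is not S2MART's algorithm, which this page does not describe. The table `sales`, its columns, the sample rows, and all function names are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical rows of a table sales(region, amount); not data from the paper.
ROWS = [
    {"region": "north", "amount": 10},
    {"region": "south", "amount": 5},
    {"region": "north", "amount": 7},
]


def map_phase(row):
    """Emit (key, value) pairs; the GROUP BY column becomes the key."""
    yield row["region"], row["amount"]


def shuffle(pairs):
    """Group values by key, as a Map-Reduce framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Apply the aggregate (here SUM) to all values sharing one key."""
    return key, sum(values)


def run_query(rows):
    # Equivalent SQL: SELECT region, SUM(amount) FROM sales GROUP BY region
    pairs = (pair for row in rows for pair in map_phase(row))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


result = run_query(ROWS)
print(result)
```

In a real cluster the map and reduce functions would run on different nodes and the shuffle would move data over the network, which is where the data transfer and network transfer costs named in the abstract arise.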



Author information

Authors and Affiliations

Department of Information Technology, Anna University, Chennai, India
Narayan Gowraj, Prasanna Venkatesh Ravi, Mouniga V & M. R. Sumalatha


Editor information

Editors and Affiliations

Department of Information Engineering, Nagoya University, 464-8601, Nagoya, Japan
Yoshiharu Ishikawa
Department of Computer Science and Technology, Harbin Institute of Technology, 150006, Harbin, China
Jianzhong Li
School of Computer Science and Engineering, University of New South Wales, 2031, Sydney, NSW, Australia
Wei Wang & Wenjie Zhang
Department of Computing and Information Systems, University of Melbourne, 3052, Melbourne, VIC, Australia
Rui Zhang

About this paper

Cite this paper

Gowraj, N., Ravi, P.V., V, M., Sumalatha, M.R. (2013). S2MART: Smart Sql to Map-Reduce Translators. In: Ishikawa, Y., Li, J., Wang, W., Zhang, R., Zhang, W. (eds) Web Technologies and Applications. APWeb 2013. Lecture Notes in Computer Science, vol 7808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37401-2_56

DOI: https://doi.org/10.1007/978-3-642-37401-2_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37400-5
Online ISBN: 978-3-642-37401-2
eBook Packages: Computer Science, Computer Science (R0)
