Performance Analysis of Hadoop-Based SQL and NoSQL for Processing Log Data

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9052)

Abstract

Recently, many companies and research organizations have been seeking scalable solutions built on the Hadoop ecosystem. Log data management, which demands both large-scale and real-time processing, is one of the applications well suited to Hadoop. In this paper, we focus on the SQL and NoSQL choices for building a Hadoop-based log data management system. For this purpose, we first select major products supporting SQL and NoSQL, and we then present an appropriate schema for each product by considering its characteristics. All the schemas are designed for real-time monitoring and analysis of log data. For each product, we implement insertion and selection operations on log data in Hadoop, and we analyze the performance of these operations. The analysis results show that MariaDB and MongoDB are fast in insertion, and PostgreSQL and HBase are fast in selection. We believe that our evaluation results will help users choose Hadoop SQL and NoSQL products for handling large-scale, real-time log data.
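
To make the two measured operations concrete, the following is a minimal sketch of timing a bulk insertion and a time-range selection of log records. It is not the paper's benchmark: it uses Python's built-in sqlite3 module as a stand-in for the evaluated products, and the table layout (ts, host, level, message), the synthetic records, and the function names are assumptions made purely for illustration.

```python
import sqlite3
import time


def make_log_rows(n):
    """Generate n synthetic log records: (timestamp, host, level, message)."""
    levels = ("INFO", "WARN", "ERROR")
    return [
        (1_400_000_000 + i, f"node{i % 4}", levels[i % 3], f"event {i}")
        for i in range(n)
    ]


def benchmark(n=10_000):
    """Time a bulk insertion and a time-range selection on an in-memory table."""
    # Hypothetical schema; sqlite3 stands in for the products named in the abstract.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE log (ts INTEGER, host TEXT, level TEXT, message TEXT)"
    )
    conn.execute("CREATE INDEX idx_log_ts ON log (ts)")
    rows = make_log_rows(n)

    # Insertion: load all records inside a single transaction.
    start = time.perf_counter()
    with conn:
        conn.executemany("INSERT INTO log VALUES (?, ?, ?, ?)", rows)
    insert_seconds = time.perf_counter() - start

    # Selection: fetch the records whose timestamp falls in the first half
    # of the generated range (BETWEEN is inclusive on both ends).
    low, high = rows[0][0], rows[n // 2][0]
    start = time.perf_counter()
    selected = conn.execute(
        "SELECT ts, host, level, message FROM log WHERE ts BETWEEN ? AND ?",
        (low, high),
    ).fetchall()
    select_seconds = time.perf_counter() - start

    conn.close()
    return {
        "inserted": n,
        "selected": len(selected),
        "insert_seconds": insert_seconds,
        "select_seconds": select_seconds,
    }


if __name__ == "__main__":
    print(benchmark())
```

Timings from a single-process, in-memory run like this say nothing about the relative performance of the systems compared in the paper; the sketch only shows the shape of an insertion/selection measurement.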

Acknowledgment

This research was funded by the MSIP (Ministry of Science, ICT & Future Planning), Korea, under the ICT R&D Program 2014.

Corresponding author

Correspondence to Yang-Sae Moon.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Son, S., Gil, MS., Moon, YS., Won, HS. (2015). Performance Analysis of Hadoop-Based SQL and NoSQL for Processing Log Data. In: Liu, A., Ishikawa, Y., Qian, T., Nutanong, S., Cheema, M. (eds) Database Systems for Advanced Applications. DASFAA 2015. Lecture Notes in Computer Science, vol 9052. Springer, Cham. https://doi.org/10.1007/978-3-319-22324-7_30

  • DOI: https://doi.org/10.1007/978-3-319-22324-7_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22323-0

  • Online ISBN: 978-3-319-22324-7

  • eBook Packages: Computer Science, Computer Science (R0)
