Abstract
Big data refers to volumes of data too large to be stored, processed, and managed by traditional database management systems. Big data analytics has become a broad research area that attracts both academia and industry, which seek to extract knowledge and information from large amounts of data. Oracle SQL is a prominent DBMS used worldwide, but its running time increases as the data grows. Apache Hive makes it possible to analyze data at large scale in a short time: it facilitates reading, writing, and managing large datasets in a distributed environment using SQL, whereas Oracle SQL provides an integrated development environment for running queries and scripts. In this paper, we run a few queries on both smaller and larger datasets and compare their performance in the Apache Hive and Oracle SQL environments.
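The comparison described above amounts to timing the same queries at different dataset sizes. The following is only a minimal sketch of such a timing harness, not the authors' setup: it uses Python's built-in `sqlite3` module as a stand-in engine so that it runs anywhere, and the table `records`, its columns, the two queries, and the row counts are all hypothetical. To measure Hive or Oracle, the connection would have to be replaced with one to the respective system.

```python
import sqlite3
import time


def time_query(conn, sql, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs of `sql`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        conn.execute(sql).fetchall()  # fetch all rows so the query fully executes
        best = min(best, time.perf_counter() - start)
    return best


def build_table(conn, n_rows):
    """Create the hypothetical table `records` and fill it with synthetic rows."""
    conn.execute("DROP TABLE IF EXISTS records")
    conn.execute("CREATE TABLE records (id INTEGER, grp INTEGER, value REAL)")
    conn.executemany(
        "INSERT INTO records VALUES (?, ?, ?)",
        ((i, i % 10, i * 0.5) for i in range(n_rows)),
    )
    conn.commit()


def benchmark(sizes, queries):
    """Time each named query at each dataset size.

    Returns a dict mapping (row_count, query_name) to seconds.
    """
    results = {}
    conn = sqlite3.connect(":memory:")  # stand-in engine, NOT Hive or Oracle
    for n in sizes:
        build_table(conn, n)
        for name, sql in queries.items():
            results[(n, name)] = time_query(conn, sql)
    conn.close()
    return results


# Hypothetical example queries; the paper's actual queries are not given here.
QUERIES = {
    "count": "SELECT COUNT(*) FROM records",
    "group_avg": "SELECT grp, AVG(value) FROM records GROUP BY grp",
}

if __name__ == "__main__":
    timings = benchmark([1_000, 100_000], QUERIES)
    for (n, name), secs in sorted(timings.items()):
        print(f"{n:>8} rows  {name:<10} {secs:.6f} s")
```

Taking the best of several runs reduces the influence of one-off delays such as caching or startup effects; absolute numbers from this stand-in say nothing about Hive or Oracle themselves.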
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Sethy, R., Dash, S.K., Panda, M. (2018). Performance Comparison Between Apache Hive and Oracle SQL for Big Data Analytics. In: Abraham, A., Cherukuri, A., Madureira, A., Muda, A. (eds) Proceedings of the Eighth International Conference on Soft Computing and Pattern Recognition (SoCPaR 2016). SoCPaR 2016. Advances in Intelligent Systems and Computing, vol 614. Springer, Cham. https://doi.org/10.1007/978-3-319-60618-7_14
DOI: https://doi.org/10.1007/978-3-319-60618-7_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60617-0
Online ISBN: 978-3-319-60618-7
eBook Packages: Engineering, Engineering (R0)