Abstract
With the ever-increasing volume and variety of data, traditional data processing tools have become unsuitable for the big data context. This has driven the creation of specialized processing tools that are well aligned with emerging needs. However, it is often hard to choose an adequate solution because the wide list of available tools is continuously changing. To address this, we present in this paper both a literature review and a technical comparison of the best-known analytics tools in order to help map them to different needs. Moreover, we underline how important choosing the appropriate tool is for different kinds of applications, especially in smart city environments.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
El Alaoui, I., Gahi, Y., Messoussi, R., Todoskoff, A., Kobi, A. (2018). Big Data Analytics: A Comparison of Tools and Applications. In: Ben Ahmed, M., Boudhir, A. (eds) Innovations in Smart Cities and Applications. SCAMS 2017. Lecture Notes in Networks and Systems, vol 37. Springer, Cham. https://doi.org/10.1007/978-3-319-74500-8_54
DOI: https://doi.org/10.1007/978-3-319-74500-8_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-74499-5
Online ISBN: 978-3-319-74500-8
eBook Packages: Engineering, Engineering (R0)