DOI: 10.1145/3321408.3326687
research-article

Survey on streaming data computing system

Published: 17 May 2019

Abstract

In the era of big data, data increasingly arrives as continuous streams, and the value derived from such streams has grown considerably, which has driven the creation of a large number of computing platforms. Because traditional big data computing technology cannot meet the needs of this shift, many new technologies have emerged rapidly. Against this background, this paper analyzes the distinctive characteristics of streaming data, then enumerates and compares five widely used technology platforms in terms of their system architectures, usage scenarios, advantages, and drawbacks.

Cited By

  • (2024) Haywire – A System for Visualizing and Analyzing Streaming IoT Data. 2024 9th International Conference on Fog and Mobile Edge Computing (FMEC), pp. 22-29. DOI: 10.1109/FMEC62297.2024.10710184. Online publication date: 2-Sep-2024
  • (2023) A New Big Data Processing Framework for the Online Roadshow. Big Data and Cognitive Computing 7:3 (123). DOI: 10.3390/bdcc7030123. Online publication date: 27-Jun-2023

Published In

ACM TURC '19: Proceedings of the ACM Turing Celebration Conference - China
May 2019
963 pages
ISBN:9781450371582
DOI:10.1145/3321408
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. batching computing
  2. big data
  3. streaming big data
  4. streaming computing

Qualifiers

  • Research-article

Funding Sources

  • Hunan Provincial Natural Science Foundation of China
  • 2016 Science Research Project of Hunan Provincial Department of Education
  • National Social Science Fund Project
  • Open Foundation for the University Innovation Platform in the Hunan Province

Conference

ACM TURC 2019
