Cloud resource management using 3Vs of Internet of Big data streams

Computing

Abstract

The Internet of things (IoT) allows smart devices to connect to anything, anywhere, at any time. The ubiquity of IoT devices generates a huge volume of data, called Internet of Big data (IoBd), which arrives in continuous streams and at unprecedented speed. Rapid analysis of such IoBd streams is urgently needed, yet allocating the optimal number of cloud resources for real-time analysis of these streams is challenging. Most current methods allocate cloud nodes using data characteristics supplied by the user. For IoBd streams, however, these characteristics are usually unknown to the user because of the stochastic nature of IoT devices, which makes it difficult to select appropriate cloud resources. This paper proposes an efficient method to tackle this issue. The proposed method first predicts the data characteristics of an IoBd stream in terms of volume, velocity and variety (3Vs). These predicted values are then expressed as a triplet called Characterization of Stream (CoSt). Separately, self-organizing maps are used to create dynamic clusters of cloud resources, and one of these clusters is allocated to the IoBd stream based on its CoSt. Experimental results show that the proposed method effectively boosted the performance of cloud resources and minimized the execution and waiting time of IoBd stream processing.
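The pipeline described in the abstract (predict the 3Vs, form the CoSt triplet, cluster cloud resources with a self-organizing map, allocate one cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how the 3Vs are predicted or how a CoSt is matched to a cluster, so the sketch assumes simple exponential smoothing as a stand-in predictor, values scaled to [0, 1], a tiny one-dimensional map, and nearest-weight-vector matching. All names (`CoSt`, `predict_next`, `train_som`, `allocate`, the node names) are hypothetical.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class CoSt:
    """Hypothetical triplet of predicted 3Vs, each assumed scaled to [0, 1]."""
    volume: float
    velocity: float
    variety: float

    def as_vector(self):
        return [self.volume, self.velocity, self.variety]


def predict_next(history, alpha=0.5):
    """Stand-in predictor: simple exponential smoothing (an assumption)."""
    estimate = history[0]
    for value in history[1:]:
        estimate = alpha * value + (1 - alpha) * estimate
    return estimate


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(weights, vector):
    """Index of the map unit whose weight vector is closest to `vector`."""
    return min(range(len(weights)),
               key=lambda i: squared_distance(weights[i], vector))


def train_som(samples, n_units=3, epochs=200, seed=0):
    """Train a small 1-D self-organizing map on resource capacity vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1 - frac)               # learning rate decays linearly
        radius = (n_units / 2) * (1 - frac)   # neighbourhood shrinks over time
        for sample in rng.sample(samples, len(samples)):
            bmu = best_matching_unit(weights, sample)
            for i, w in enumerate(weights):
                influence = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    w[d] += rate * influence * (sample[d] - w[d])
    return weights


def cluster_resources(weights, resources):
    """Group resource names by the map unit that best matches their capacity."""
    clusters = {}
    for name, capacity in resources.items():
        unit = best_matching_unit(weights, capacity)
        clusters.setdefault(unit, []).append(name)
    return clusters


def allocate(weights, clusters, cost):
    """Pick the non-empty cluster whose unit lies closest to the CoSt vector."""
    target = cost.as_vector()
    unit = min(clusters, key=lambda i: squared_distance(weights[i], target))
    return clusters[unit]


# Made-up capacity vectors for six hypothetical cloud nodes.
resources = {
    "small-1": [0.10, 0.10, 0.20], "small-2": [0.15, 0.10, 0.10],
    "mid-1": [0.50, 0.50, 0.50], "mid-2": [0.55, 0.45, 0.50],
    "large-1": [0.90, 0.90, 0.80], "large-2": [0.85, 0.95, 0.90],
}
weights = train_som(list(resources.values()))
clusters = cluster_resources(weights, resources)

# Made-up recent observations of one stream's volume, velocity and variety.
stream = CoSt(predict_next([0.6, 0.8, 0.9]),
              predict_next([0.7, 0.8, 0.9]),
              predict_next([0.5, 0.7, 0.8]))
allocated = allocate(weights, clusters, stream)
print(stream, "->", allocated)
```

Because the clusters come from the map rather than from fixed instance classes, retraining on updated capacity vectors regroups the nodes, which is one way to read the "dynamic clusters" mentioned above.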

Author information

Corresponding author

Correspondence to Navroop Kaur.

About this article

Cite this article

Kaur, N., Sood, S.K. & Verma, P. Cloud resource management using 3Vs of Internet of Big data streams. Computing 102, 1463–1485 (2020). https://doi.org/10.1007/s00607-019-00732-5

  • DOI: https://doi.org/10.1007/s00607-019-00732-5
