Scheduling of big data applications on distributed cloud based on QoS parameters

Cluster Computing

Abstract

Big data analytics is now a major technology in business operations, giving organizations a powerful tool for analyzing large volumes of unstructured data and making useful decisions. Its success depends heavily on result quality, time, and price, and selecting appropriate cloud infrastructure at both the coarse-grained and fine-grained levels helps ensure better results. This paper proposes a global architecture for QoS-based scheduling of big data applications on distributed cloud datacenters at these two levels. At the coarse-grained level, an appropriate local datacenter is selected using an adaptive K-nearest-neighbor algorithm, based on the network distance between user and datacenter, network throughput, and total available resources. At the fine-grained level, a naïve Bayes algorithm predicts a probability triplet (C, I, M), which gives the probability that a new application falls into the compute-intensive (C), input/output-intensive (I), and memory-intensive (M) categories. Using self-organizing maps, each datacenter is transformed into a pool of virtual clusters, each capable of executing a specific category of jobs with specific (C, I, M) requirements. The novelty of the study lies in representing all datacenter resources in a predefined topological ordering and executing incoming jobs in their respective predefined virtual clusters according to their QoS requirements. The proposed architecture is tested on three different Amazon EMR datacenters for resource utilization, waiting time, availability, response time, and estimated time to complete the job. The results indicate better QoS achievement and a 33.15% cost gain for the proposed architecture over traditional Amazon methods.
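
The two scheduling levels described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the adaptive K-nearest-neighbor variant or the job features, so plain KNN and Gaussian naïve Bayes stand in, and the self-organizing-map step is omitted. All names (`knn_select`, `fit_gaussian_nb`, `predict_triplet`), the datacenter labels, and the sample values are hypothetical.

```python
import math
from collections import Counter, defaultdict

def knn_select(history, query, k=3):
    """Coarse level: majority vote among the k past requests closest to `query`."""
    nearest = sorted(history, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def fit_gaussian_nb(samples):
    """Fine level, training: per-class prior, feature means and variances."""
    by_class = defaultdict(list)
    for features, label in samples:
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        # A small floor keeps every variance strictly positive.
        variances = [sum((x - m) ** 2 for x in c) / len(c) + 1e-6
                     for c, m in zip(cols, means)]
        model[label] = (len(rows) / len(samples), means, variances)
    return model

def predict_triplet(model, features):
    """Fine level, prediction: posterior probabilities ordered as (C, I, M)."""
    log_post = {}
    for label, (prior, means, variances) in model.items():
        lp = math.log(prior)
        for x, m, v in zip(features, means, variances):
            lp += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        log_post[label] = lp
    top = max(log_post.values())  # subtract the max for numerical stability
    weights = {c: math.exp(lp - top) for c, lp in log_post.items()}
    total = sum(weights.values())
    return tuple(weights.get(c, 0.0) / total for c in ("C", "I", "M"))

# Hypothetical history: (latency_ms, throughput_mbps, free_resource_fraction).
# Features are left unscaled here; a real system would normalize them first.
history = [
    ((20.0, 900.0, 0.6), "dc-east"),
    ((25.0, 850.0, 0.5), "dc-east"),
    ((30.0, 800.0, 0.7), "dc-east"),
    ((120.0, 300.0, 0.8), "dc-west"),
    ((110.0, 350.0, 0.9), "dc-west"),
    ((200.0, 100.0, 0.4), "dc-asia"),
]

# Hypothetical labeled jobs: (cpu, io, memory) utilization fractions.
jobs = [
    ((0.9, 0.2, 0.3), "C"), ((0.8, 0.1, 0.2), "C"),
    ((0.2, 0.9, 0.3), "I"), ((0.3, 0.8, 0.2), "I"),
    ((0.2, 0.3, 0.9), "M"), ((0.1, 0.2, 0.8), "M"),
]

datacenter = knn_select(history, (22.0, 880.0, 0.55))
triplet = predict_triplet(fit_gaussian_nb(jobs), (0.85, 0.15, 0.25))
print(datacenter, triplet)
```

In this sketch the triplet would then decide which virtual cluster inside the selected datacenter receives the job; how the paper maps triplets to clusters is not covered by the abstract.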

Author information

Corresponding author

Correspondence to Rajinder Sandhu.

About this article

Cite this article

Sandhu, R., Sood, S.K. Scheduling of big data applications on distributed cloud based on QoS parameters. Cluster Comput 18, 817–828 (2015). https://doi.org/10.1007/s10586-014-0416-6

  • DOI: https://doi.org/10.1007/s10586-014-0416-6
