Abstract
The cloud computing paradigm offers users access to computing resources in a pay-as-you-go manner. However, for both cloud computing vendors and users, it is a challenge to predict how many resources are needed to run an application in a cloud at a required level of quality. This research focuses on developing a model to predict the computing resource consumption of MapReduce applications in a cloud computing environment. Based on the Classification and Regression Tree (CART), the proposed approach derives knowledge of the relationship among the application features, the quality of service, and the amount of computing resources from a small training set. The experiments show that the prediction accuracy is as high as 80%. This research can potentially benefit both cloud vendors and users by improving resource management and reducing costs.
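To make the idea of a CART-based predictor concrete, the following is a minimal sketch of a regression tree in plain Python. It is not the authors' implementation. The two features (input size in GB, required completion time in minutes), the target (number of nodes), and every number in the training set are hypothetical values chosen only for illustration.

```python
# Minimal CART-style regression tree (illustrative sketch only).
# ASSUMPTION: features, target, and all numbers are invented for this example;
# they are not taken from the paper.

def squared_error(ys):
    """Sum of squared deviations from the mean (regression impurity)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def best_split(rows, targets):
    """Return (feature_index, threshold) that most reduces squared error, or None."""
    best, best_score = None, squared_error(targets)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            score = squared_error(left) + squared_error(right)
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def build_tree(rows, targets, max_depth=3):
    """Recursively grow the tree; a leaf stores the mean target of its samples."""
    split = best_split(rows, targets) if max_depth > 0 and len(rows) > 1 else None
    if split is None:
        return {"value": sum(targets) / len(targets)}
    f, t = split
    left = [(r, y) for r, y in zip(rows, targets) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, targets) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], max_depth - 1),
        "right": build_tree([r for r, _ in right], [y for _, y in right], max_depth - 1),
    }

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's mean target."""
    while "value" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]

# Hypothetical training set: (input size in GB, required completion time in minutes).
rows = [(1, 10), (1, 30), (4, 10), (4, 30), (8, 10), (8, 30)]
# Hypothetical target: number of nodes needed to meet the deadline.
targets = [2, 1, 8, 3, 16, 6]

tree = build_tree(rows, targets, max_depth=3)
print(predict(tree, (5, 25)))  # estimate for an unseen job
```

The tree greedily picks the split that most reduces the squared error of the target, which is the standard CART criterion for regression. A real predictor would use measured job profiles and would be evaluated on held-out runs.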
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Piao, J.T., Yan, J. (2012). Computing Resource Prediction for MapReduce Applications Using Decision Tree. In: Sheng, Q.Z., Wang, G., Jensen, C.S., Xu, G. (eds) Web Technologies and Applications. APWeb 2012. Lecture Notes in Computer Science, vol 7235. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29253-8_51
DOI: https://doi.org/10.1007/978-3-642-29253-8_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29252-1
Online ISBN: 978-3-642-29253-8
eBook Packages: Computer Science (R0)