Abstract
The recent appearance, evolution and massive expansion of social media-based technologies, together with what is now known as the Internet of Things, has led to data being produced at a dizzying rate. One of the main contributions towards addressing this has been the Hadoop framework (which implements the Map/Reduce paradigm), especially when used in conjunction with Cloud computing environments. In this paper, a comprehensive and rigorous study of the Map/Reduce framework using formal methods is presented. Specifically, the Timed Process Algebra BTC is used, and the resulting formal model is evaluated with a real Hadoop-based social media data application. Moreover, the formal model is validated by carrying out several experiments on a real private Cloud environment. Finally, the outcomes of the formal model are used to determine the best performance–cost agreement in a real scenario. Results show that the proposed model makes it possible to determine in advance both the performance of a Hadoop-based application within Cloud environments and the best performance–cost agreement.
Cite this article
Ruiz, M.C., Cazorla, D., Pérez, D. et al. Formal performance evaluation of the Map/Reduce framework within cloud computing. J Supercomput 72, 3136–3155 (2016). https://doi.org/10.1007/s11227-015-1553-2
DOI: https://doi.org/10.1007/s11227-015-1553-2