DOI: 10.1145/3230905.3230954
research-article

An Efficient Approach to Improve Security for MapReduce Computation in Cloud System

Published: 02 May 2018

ABSTRACT

Running MapReduce computations in a public cloud raises a series of security challenges, since the service providers may not be properly protected. Because MapReduce applications are long-running, an attacker has more opportunity to exploit worker vulnerabilities and mount large-scale attacks, so many workers may be compromised. Compromised workers can misbehave and tamper with the integrity of the results of every computation assigned to them. To tackle this challenge, this paper proposes an effective Result Verification Mechanism (RVM) that uses a reputation threshold-based voting method to ensure result integrity in both the map and reduce phases of MapReduce, thereby keeping the computation accurate. Another major contribution of this paper is an implementation of RVM on Apache Hadoop, together with a series of experiments. The evaluation of the experimental results demonstrates that RVM can significantly reduce computation overhead and guarantee a low error rate compared with simple voting methods such as m-first voting.
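
The abstract does not spell out the voting rules, so the following Python sketch is only an illustration of the general idea, not the paper's RVM. It assumes that m-first voting accepts the first result value reported by m workers, and that a reputation threshold-based vote accepts a value once the summed reputation of the workers reporting it reaches a threshold. The function names, the reputation scores, and the threshold value are all hypothetical.

```python
from collections import defaultdict


def m_first_vote(results, m):
    """Accept the first value reported by m workers.

    results: iterable of (worker_id, value) pairs in arrival order.
    Returns (accepted_value_or_None, number_of_results_consumed).
    """
    tally = defaultdict(int)
    used = 0
    for _worker, value in results:
        used += 1
        tally[value] += 1
        if tally[value] >= m:
            return value, used
    return None, used


def reputation_threshold_vote(results, reputation, threshold):
    """Accept a value once its supporters' summed reputation reaches threshold.

    This acceptance rule is an assumption made for illustration only.
    Workers missing from `reputation` contribute a weight of 0.0.
    """
    weight = defaultdict(float)
    used = 0
    for worker, value in results:
        used += 1
        weight[value] += reputation.get(worker, 0.0)
        if weight[value] >= threshold:
            return value, used
    return None, used


# Hypothetical replicas of one task: worker w2 returns a tampered result.
results = [("w1", "a"), ("w2", "b"), ("w3", "a"), ("w4", "a")]
reputation = {"w1": 0.9, "w2": 0.2, "w3": 0.8, "w4": 0.9}

print(m_first_vote(results, 3))                          # ('a', 4)
print(reputation_threshold_vote(results, reputation, 1.5))  # ('a', 3)
```

In this toy run both rules accept the same value, but the reputation-weighted rule needs one fewer replica because two well-reputed workers already agree. That is the kind of overhead saving the abstract refers to, although the numbers here are made up and say nothing about the paper's measured results.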

Published in

    LOPAL '18: Proceedings of the International Conference on Learning and Optimization Algorithms: Theory and Applications
    May 2018
    357 pages
    ISBN:9781450353045
    DOI:10.1145/3230905

    Copyright © 2018 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    LOPAL '18 paper acceptance rate: 61 of 141 submissions, 43%. Overall acceptance rate: 61 of 141 submissions, 43%.
