research-article

An Efficient Approach to Improve Security for MapReduce Computation in Cloud System

Authors:
Ahmed Bendahmane, Computer Science and Systems Engineering Laboratory, Abdelmalek Essaadi University, Tetouan, Morocco
Hanane Bennasar, College of IT (ENSIAS), Mohamed V University, Rabat, Morocco
Mohammad Essaaidi, College of IT (ENSIAS), Mohamed V University, Rabat, Morocco

LOPAL '18: Proceedings of the International Conference on Learning and Optimization Algorithms: Theory and Applications, May 2018, Article No.: 53, Pages 1–6. https://doi.org/10.1145/3230905.3230954

Published: 02 May 2018

ABSTRACT

Running MapReduce computations in a public cloud raises a series of security challenges, since the service providers may not be properly protected. Because MapReduce applications are long-running, an attacker has more opportunity to exploit worker vulnerabilities and mount large-scale attacks, so many workers may be compromised. Those workers could misbehave and thereby tamper with the integrity of the results of all computations assigned to them. To tackle this challenge, this paper proposes an effective Result Verification Mechanism (RVM) that uses a reputation threshold-based voting method to ensure the result integrity of MapReduce in both the map and reduce phases, and thus to keep the MapReduce computation accurate. Another major contribution of this paper is an implementation of RVM on Apache Hadoop, together with a series of experiments. The evaluation of the experimental results demonstrates that RVM can significantly reduce computation overhead and guarantee a low error rate compared with simple voting methods such as m-first voting.
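To make the comparison in the abstract concrete, the sketch below illustrates the two voting styles it names. The first function follows the usual description of m-first voting, in which a task is replicated until some result value has been returned by m workers. The second function is only a hypothetical stand-in for a reputation threshold-based vote: the abstract does not give the details of RVM, so the function name `reputation_vote`, the per-worker reputation scores, and the summed-reputation acceptance rule are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Optional, Tuple


def m_first_vote(results: Iterable[Hashable], m: int) -> Optional[Hashable]:
    """Return the first result value reported by m workers, or None."""
    if m < 1:
        raise ValueError("m must be at least 1")
    counts: Counter = Counter()
    for value in results:        # results arrive one replica at a time
        counts[value] += 1
        if counts[value] == m:   # first value to collect m matching votes wins
            return value
    return None                  # no agreement yet: the task needs more replicas


def reputation_vote(
    results: Iterable[Tuple[Hashable, float]], threshold: float
) -> Optional[Hashable]:
    """Hypothetical variant (not the paper's RVM): accept the first value
    whose supporting workers' summed reputation reaches the threshold."""
    weight: Dict[Hashable, float] = {}
    for value, reputation in results:
        weight[value] = weight.get(value, 0.0) + reputation
        if weight[value] >= threshold:
            return value
    return None


if __name__ == "__main__":
    # Three replicas of one map task; two workers agree on "a".
    print(m_first_vote(["a", "b", "a"], m=2))
    # One highly reputed worker can settle the vote with a single replica.
    print(reputation_vote([("x", 0.3), ("y", 0.9)], threshold=0.9))
```

Under these assumptions, a vote weighted by reputation can accept a result from fewer replicas than a fixed m would require, which is one plausible way such a scheme could lower computation overhead.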

Published in

LOPAL '18: Proceedings of the International Conference on Learning and Optimization Algorithms: Theory and Applications
May 2018
357 pages
ISBN: 9781450353045
DOI: 10.1145/3230905

Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Cloud System
Malicious Attacks
MapReduce
Reputation
Result Correctness
Voting
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
LOPAL '18 Paper Acceptance Rate: 61 of 141 submissions, 43%
Overall Acceptance Rate: 61 of 141 submissions, 43%