Abstract
In the last decade, cloud infrastructures such as Google Cloud and Amazon AWS have grown vastly in scale and utilization. Research into the security and confidentiality of sensitive data passed through these infrastructures is therefore of great importance. We present SecureMCMR, a system that uses two public clouds for privacy-preserving computation outsourcing of MapReduce applications. We also present an analysis of 87 MapReduce applications and the operations they use. Our results on three MapReduce applications show overheads of 160%, 254%, and 380% over plaintext execution.
Notes
1. The protocol assumes that x and y are integers; however, it is trivially adapted to work over a fixed-point representation of real numbers, as in the Java implementation of Paillier we use.
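The note does not spell out how real numbers are mapped to integers. As a minimal sketch, assuming a single fixed scale factor (the hypothetical `SCALE` below) and a toy textbook Paillier with g = n + 1, the following shows how fixed-point encoded values can still be added under encryption. It is not the encoding used by the Java library the authors mention, and the key size is far too small for real use.

```python
import math
import random

SCALE = 10**4  # hypothetical fixed-point scale: four decimal digits


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy key generation from two known Mersenne primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because g = n + 1
    return n, (lam, mu)


def encode(x, n):
    """Map a real number to a residue mod n; negatives wrap around."""
    return round(x * SCALE) % n


def decode(m, n):
    """Inverse of encode: residues above n/2 are read as negative values."""
    if m > n // 2:
        m -= n
    return m / SCALE


def encrypt(m, n):
    """Paillier encryption with g = n + 1, so g^m = 1 + m*n (mod n^2)."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return ((1 + m * n) * pow(r, n, n2)) % n2


def decrypt(c, n, priv):
    """Recover m via L(c^lam mod n^2) * mu mod n, with L(u) = (u - 1) // n."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add(c1, c2, n):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % (n * n)


n, priv = keygen()
cx = encrypt(encode(3.25, n), n)
cy = encrypt(encode(-1.5, n), n)
total = decode(decrypt(add(cx, cy, n), n, priv), n)  # 1.75
```

Because both operands share the same scale, the sum decodes directly. Multiplying an encrypted value by a plaintext fixed-point constant would multiply the scales, so the result would need to be divided by `SCALE` once more after decryption.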
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Kennard, L., Milanova, A. (2020). SecureMCMR: Computation Outsourcing for MapReduce Applications. In: Dolev, S., Kolesnikov, V., Lodha, S., Weiss, G. (eds) Cyber Security Cryptography and Machine Learning. CSCML 2020. Lecture Notes in Computer Science, vol 12161. Springer, Cham. https://doi.org/10.1007/978-3-030-49785-9_10
DOI: https://doi.org/10.1007/978-3-030-49785-9_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-49784-2
Online ISBN: 978-3-030-49785-9
eBook Packages: Computer Science, Computer Science (R0)