SecureMCMR: Computation Outsourcing for MapReduce Applications

  • Conference paper
Cyber Security Cryptography and Machine Learning (CSCML 2020)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12161)

Abstract

In the last decade, cloud infrastructures such as Google Cloud and Amazon AWS have grown vastly in scale and utilization. Research into the security and confidentiality of sensitive data passed through these infrastructures is therefore of great importance. We present SecureMCMR, a system that uses two public clouds for privacy-preserving computation outsourcing for MapReduce applications. We also present an analysis of 87 MapReduce applications and the operations they use. Our results on three MapReduce applications show overheads of 160%, 254%, and 380% over plaintext execution.

Notes

  1. The protocol assumes that x and y are integers; however, it is trivially adapted to work over a fixed-point representation of real numbers, as in the Java implementation of Paillier we use.
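
The fixed-point adaptation mentioned in the note can be illustrated with a toy additively homomorphic Paillier scheme. This is only a minimal sketch under stated assumptions: the scale factor SCALE, all function names, and the small Mersenne-prime key are hypothetical and insecure, and the code reflects neither the API of the Java library nor the paper's protocol. It shows only that reals scaled to integers can be added under encryption and decoded afterwards.

```python
import math
import secrets

SCALE = 10**6  # hypothetical fixed-point scale: six decimal digits


def keygen(p, q):
    """Toy Paillier key from two distinct primes, with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # with g = n + 1, L(g^lam mod n^2) = lam mod n
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (negative values wrap modulo n)."""
    n2 = n * n
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    return (pow(n + 1, m % n, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Decrypt to a signed integer in (-n/2, n/2]."""
    lam, mu = priv
    m = ((pow(c, lam, n * n) - 1) // n * mu) % n
    return m - n if m > n // 2 else m


def add(n, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds plaintexts."""
    return (c1 * c2) % (n * n)


def encode(x):
    """Map a real number to a scaled integer."""
    return round(x * SCALE)


def decode(m):
    """Map a scaled integer back to a real number."""
    return m / SCALE


def demo():
    # 2**31 - 1 and 2**61 - 1 are Mersenne primes; far too small for real use.
    n, priv = keygen(2**31 - 1, 2**61 - 1)
    cx = encrypt(n, encode(1.5))
    cy = encrypt(n, encode(-0.25))
    return decode(decrypt(n, priv, add(n, cx, cy)))
```

Because both operands carry the same scale, addition needs no rescaling. A product of two encoded values would carry SCALE squared and would need an extra rescaling step, which this sketch does not cover.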

Author information

Corresponding author

Correspondence to Lindsey Kennard.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Kennard, L., Milanova, A. (2020). SecureMCMR: Computation Outsourcing for MapReduce Applications. In: Dolev, S., Kolesnikov, V., Lodha, S., Weiss, G. (eds) Cyber Security Cryptography and Machine Learning. CSCML 2020. Lecture Notes in Computer Science, vol 12161. Springer, Cham. https://doi.org/10.1007/978-3-030-49785-9_10

  • DOI: https://doi.org/10.1007/978-3-030-49785-9_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-49784-2

  • Online ISBN: 978-3-030-49785-9

  • eBook Packages: Computer Science, Computer Science (R0)
