Abstract
The quality of empirical statistical studies depends heavily on the quality and amount of source data available. However, it is often hard to collect data from several sources due to privacy requirements or a lack of trust. In this paper, we propose a novel way to combine secure multi-party computation technology with federated database systems to preserve privacy in statistical studies that combine and analyse data from multiple databases. We describe an implementation on two real-world platforms: the Sharemind secure multi-party computation platform and the X-Road database federation platform. Our solution enables the privacy-preserving linking and analysis of databases belonging to different institutions. A preliminary analysis from the Estonian Data Protection Inspectorate suggests that a correct implementation of our solution ensures that no personally identifiable information is processed in such studies. Our proposed solution can therefore potentially reduce the costs of conducting statistical studies on shared data.
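To make the idea of secure multi-party computation more concrete, the following is a minimal sketch of additive secret sharing, one common building block for such systems. It is an illustration only and does not use the Sharemind or X-Road APIs: the function names (`share`, `reconstruct`, `secure_sum`), the three-party setup, and the modulus 2^32 are all assumptions made for this example. Each data owner splits its value into random shares, each computing party adds up the shares it receives, and only the final total is reconstructed.

```python
import secrets

MODULUS = 2**32      # assumed ring size for this sketch
NUM_PARTIES = 3      # assumed number of computing parties


def share(value, n=NUM_PARTIES, modulus=MODULUS):
    """Split `value` into n additive shares that sum to it modulo `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def reconstruct(shares, modulus=MODULUS):
    """Recombine additive shares into the original value."""
    return sum(shares) % modulus


def secure_sum(values, n=NUM_PARTIES, modulus=MODULUS):
    """Sum the inputs so that no single party sees an individual input."""
    # Each data owner shares its value; party i receives the i-th share.
    per_party = [[] for _ in range(n)]
    for v in values:
        for i, s in enumerate(share(v, n, modulus)):
            per_party[i].append(s)
    # Each party locally adds the shares it holds (uniformly random on their own).
    partial_sums = [sum(held) % modulus for held in per_party]
    # Only the combined partial sums reveal the total.
    return reconstruct(partial_sums, modulus)


if __name__ == "__main__":
    incomes = [1200, 950, 1830]      # hypothetical inputs from three databases
    print(secure_sum(incomes))
```

In this sketch a single party's shares are uniformly random, so one party alone learns nothing about any individual input; a real deployment would additionally need secure channels, protection against colluding parties, and protocols for operations beyond addition.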
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Bogdanov, D., Kamm, L., Laur, S., Pruulmann-Vengerfeldt, P., Talviste, R., Willemson, J. (2014). Privacy-Preserving Statistical Data Analysis on Federated Databases. In: Preneel, B., Ikonomou, D. (eds) Privacy Technologies and Policy. APF 2014. Lecture Notes in Computer Science, vol 8450. Springer, Cham. https://doi.org/10.1007/978-3-319-06749-0_3
DOI: https://doi.org/10.1007/978-3-319-06749-0_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06748-3
Online ISBN: 978-3-319-06749-0
eBook Packages: Computer Science (R0)