Preserving data privacy in outsourcing data aggregation services

Published: 01 August 2007

Abstract

Advances in distributed service-oriented computing and Internet technology have created a strong technology push for outsourcing and information sharing. Organizations increasingly need to share data across organizational boundaries, both within a country and with countries that may have weaker privacy and security standards. Ideally, we wish to share certain statistical data and extract knowledge from private databases without revealing any information about an individual database beyond the aggregate result that is permitted. In this article, we describe two scenarios for outsourcing data aggregation services and present a set of decentralized peer-to-peer protocols that support data sharing across multiple private databases while minimizing data disclosure among the individual parties. Our basic protocols include a set of novel probabilistic computation mechanisms for important primitive data aggregation operations across multiple private databases, such as max, min, and top-k selection. We provide an analytical study of the basic protocols in terms of precision, efficiency, and privacy characteristics. Our advanced protocols implement an efficient algorithm for performing kNN classification across multiple private databases. We provide a set of experiments that evaluate the proposed protocols in terms of their correctness, efficiency, and privacy characteristics.
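To make the idea of a probabilistic max computation concrete, the following is a minimal single-process sketch, not the article's protocol. It assumes the parties are arranged in a ring, that a public lower bound on all values is known, and that a party whose value exceeds the running value sometimes forwards a random value between the two instead of its own value, with a masking probability that decays each round. The ring layout, the decay schedule, and every name and parameter here (`probabilistic_max`, `lower_bound`, `p0`, `decay`) are illustrative assumptions; the abstract does not specify them.

```python
import random


def probabilistic_max(private_values, lower_bound=0.0, rounds=8,
                      p0=1.0, decay=0.5, seed=None):
    """Simulate a running value passed around a ring of parties.

    Hypothetical sketch: each list element stands for one party's private
    value. A real deployment would run each step on a separate node; this
    function sees all values only so the idea can be demonstrated.
    """
    rng = random.Random(seed)
    running = lower_bound  # assumed public starting value
    for r in range(rounds):
        # Assumed schedule: masking probability shrinks every round.
        p_mask = p0 * decay ** r
        for own in private_values:
            if running >= own:
                # Nothing to contribute; forward the running value as-is.
                continue
            if rng.random() < p_mask:
                # Mask: forward a random value between running and own,
                # so the successor cannot tell it is not the true value.
                running = rng.uniform(running, own)
            else:
                # Reveal: forward the party's own (larger) value.
                running = own
    return running


if __name__ == "__main__":
    print(probabilistic_max([3, 9, 4], seed=7))
```

Under these assumptions the result never exceeds the true maximum, and it becomes more likely to equal the true maximum as the number of rounds grows, which is the kind of precision versus privacy trade-off the abstract says the article analyzes.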

    Published In

    ACM Transactions on Internet Technology  Volume 7, Issue 3
    Special Issue on the Internet and Outsourcing
    August 2007
    97 pages
    ISSN:1533-5399
    EISSN:1557-6051
    DOI:10.1145/1275505

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Privacy
    2. classification
    3. confidentiality
    4. outsourcing