Abstract
Useful information for analysis and learning purposes is often located in heterogeneous and autonomous sources. We consider the problem in which data owners want to share data that has access control policies associated with it. Data sources share information by relying on entity matching rules between their contents, that is, conditions under which two records from different sources are considered a match because they represent the same real-world object. In this paper, we propose an entity-matching-oriented and policy-oriented methodology that provides a secure data sharing framework. We present an algorithm that uses entity matching rules to translate a query submitted against one schema into an augmented query for the other schema, so that the relevant tuples are captured. We then provide a methodology for answering queries while preserving local access control policies and avoiding any inference leakage that could result from entity matching. Finally, we give details of our implementation and describe the key findings of the experiments.
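To make the idea of rule-based query translation more concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes a matching rule is simply a set of attribute pairs whose values must be equal, that queries are conjunctions of equality predicates, and that a local policy is a set of attributes the requester may see. The names `MatchingRule`, `translate`, and `answer` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchingRule:
    """Hypothetical rule: two records match when every attribute pair agrees."""
    pairs: tuple  # ((attribute_in_source_a, attribute_in_source_b), ...)


def translate(selection, rule):
    """Rewrite equality predicates on source A into predicates on source B.

    Assumption: predicates on attributes the rule does not cover are dropped,
    so the translated query can only become broader, never narrower.
    """
    a_to_b = dict(rule.pairs)
    return {a_to_b[attr]: value
            for attr, value in selection.items()
            if attr in a_to_b}


def answer(rows_b, selection_b, allowed_attrs):
    """Evaluate the translated query on source B, then apply the local policy.

    The policy is modelled (as an assumption) as a set of visible attributes;
    everything else is projected away before the answer leaves the source.
    """
    matched = [row for row in rows_b
               if all(row.get(attr) == value
                      for attr, value in selection_b.items())]
    return [{k: v for k, v in row.items() if k in allowed_attrs}
            for row in matched]


rule = MatchingRule(pairs=(("name", "full_name"), ("dob", "birth_date")))
query_a = {"name": "Ada", "dept": "R&D"}   # query against source A
query_b = translate(query_a, rule)          # only "name" is covered by the rule

rows_b = [
    {"full_name": "Ada", "birth_date": "1815-12-10", "salary": 100},
    {"full_name": "Bob", "birth_date": "1990-01-01", "salary": 90},
]
result = answer(rows_b, query_b, allowed_attrs={"full_name", "birth_date"})
```

In this sketch the policy is enforced at the source that owns the data, after matching, which mirrors the requirement that local access control policies be preserved. It does not attempt to model inference leakage through the matching attributes themselves.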






Notes
Semantically equivalent, they denote the same real-world object.
These keywords are provided by the users to serve as a kind of synonyms table.
Additional information
This research was performed within the scope of the DataCert (Coq deep specification of privacy aware data integration) project, which is funded by ANR (Agence Nationale de la Recherche), Grant ANR-15-CE39-0009 (http://datacert.lri.fr/).
Cite this article
Agoun, J., Hacid, MS. Access control based on entity matching for secure data sharing. SOCA 16, 31–44 (2022). https://doi.org/10.1007/s11761-021-00331-3
DOI: https://doi.org/10.1007/s11761-021-00331-3