Access control based on entity matching for secure data sharing

  • Original Research Paper
  • Service Oriented Computing and Applications

Abstract

Useful information for analysis and learning is often spread across heterogeneous, autonomous sources. We consider the setting in which data owners want to share data that is governed by access control policies. Data sources share information by relying on entity matching rules between their contents, that is, conditions under which two records from different sources are considered a match (they represent the same real-world object). In this paper, we propose an entity matching-oriented and policy-oriented methodology for a secure data sharing framework. We present an algorithm that uses entity matching rules to translate a query submitted against one schema into an augmented query for the other schema, so that the relevant tuples are captured. We then provide a methodology for answering queries while preserving local access control policies and avoiding any inference leakage that could result from entity matching. Finally, we give details of our implementation and describe the key findings of our experiments.
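To make the idea of rule-based query translation concrete, the following is a minimal Python sketch and not the algorithm from the paper. The attribute names, the toy records, the `RULE` thresholds, and the helpers `similar`, `matches`, `translate`, and `answer` are all hypothetical. The access control step is reduced to projecting attributes that the local policy permits, and inference leakage detection is not modeled.

```python
from difflib import SequenceMatcher

def similar(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Hypothetical entity matching rule: (attribute in A, attribute in B, threshold).
# Two records match when every attribute pair reaches its threshold.
RULE = [("name", "full_name", 0.8), ("city", "town", 1.0)]

def matches(rec_a: dict, rec_b: dict, rule=RULE) -> bool:
    return all(similar(rec_a[x], rec_b[y]) >= t for x, y, t in rule)

def translate(query_a, source_a, rule=RULE):
    """Turn a predicate over source A into a predicate over source B that
    captures the B tuples matching the A tuples selected by query_a."""
    selected = [r for r in source_a if query_a(r)]
    return lambda rec_b: any(matches(r, rec_b, rule) for r in selected)

def answer(pred_b, source_b, allowed_attrs):
    """Evaluate the translated query on B, keeping only permitted attributes."""
    return [
        {k: v for k, v in r.items() if k in allowed_attrs}
        for r in source_b
        if pred_b(r)
    ]

# Toy data (assumed for illustration only).
source_a = [
    {"name": "Alice Martin", "city": "Lyon", "diagnosis": "flu"},
    {"name": "Bob Durand", "city": "Paris", "diagnosis": "asthma"},
]
source_b = [
    {"full_name": "Alice Martinn", "town": "Lyon", "salary": 3000},
    {"full_name": "Bob Durand", "town": "Paris", "salary": 2500},
    {"full_name": "Carol Petit", "town": "Lyon", "salary": 4000},
]

# Query on A: patients diagnosed with flu. The local policy of B is assumed
# to hide the salary attribute from the requester.
pred_b = translate(lambda r: r["diagnosis"] == "flu", source_a)
result = answer(pred_b, source_b, allowed_attrs={"full_name", "town"})
print(result)
```

In this sketch the translated query is a predicate rather than a rewritten SQL statement, which keeps the example short. A real system would also have to decide whether returning a matched tuple reveals something the policy of the other source forbids.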

Notes

  1. Semantically equivalent, they denote the same real-world object.

  2. See: https://perso.liris.cnrs.fr/jagoun/SOCA2021_appx.pdf.

  3. https://github.com/eulerto/pg_similarity.

  4. https://recordlinkage.readthedocs.io/en/latest/about.html.

  5. https://www.cs.utexas.edu/users/ml/riddle/data/restaurant.tar.gz.

  6. These keywords are provided by the users to serve as a kind of synonyms table.

  7. The Hyperion prototype implements the main results/algorithms presented in [21, 22].

  8. https://h2020qualitop.liris.cnrs.fr/wordpress/index.php/project/.

References

  1. Agoun J, Hacid MS (2019) Data sharing in presence of access control policies. In: OTM confederated international conferences "On the move to meaningful internet systems". Springer, pp 301–309

  2. Arocena PC, Garzetti M, Jiang L, Kementsietsidis A, Kiringa I, Masud M, Miller RJ, Mylopoulos J (2005) Data sharing in the hyperion peer database system. In: VLDB

  3. Chawathe S, Garcia-Molina H, Hammer J, Ireland K, Papakonstantinou Y, Ullman J, Widom J (1994) The tsimmis project: integration of heterogeneous information sources

  4. Ciriani V, Di Vimercati SDC, Foresti S, Jajodia S, Paraboschi S, Samarati P (2009) Keep a few: outsourcing data while maintaining confidentiality. In: European symposium on research in computer security. Springer, pp 440–455

  5. Clifton C, Kantarcioǧlu M, Doan A, Schadow G, Vaidya J, Elmagarmid A, Suciu D (2004) Privacy-preserving data integration and sharing. In: Proceedings of the 9th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery. ACM, pp 19–26

  6. Dong X, Li R, He H, Zhou W, Xue Z, Wu H (2015) Secure sensitive data sharing on a big data platform. Tsinghua Sci Technol 20:72–80

  7. Dong X, Yu J, Luo Y, Chen Y, Xue G, Li M (2014) Achieving an effective, scalable and privacy-preserving data sharing service in cloud computing. Comput Secur 42:151–164

  8. Dove ES, Laurie GT, Knoppers BM (2017) Data sharing and privacy. In: Genomic and precision medicine. pp 143–160

  9. Dwork C (2008) Differential privacy: a survey of results. In: International conference on theory and applications of models of computation. Springer, pp 1–19

  10. Elmeleegy H, Ouzzani M, Elmagarmid A, Abusalah A (2010) Preserving privacy and fairness in peer-to-peer data integration. In: Proceedings of the 2010 ACM SIGMOD international conference on management of data. pp 759–770

  11. Fagin R, Kolaitis PG, Miller RJ, Popa L (2005) Data exchange: semantics and query answering. Theor Comput Sci 336(1):89–124

  12. Fan W, Jia X, Li J, Ma S (2009) Reasoning about record matching rules. Proc VLDB Endow 2(1):407–418

  13. Franke M, Sehili Z, Rahm E (2018) Parallel privacy-preserving record linkage using lsh-based blocking. In: IoTBDS. pp 195–203

  14. Frikken K, Atallah M, Li J (2006) Attribute-based access control with hidden policies and hidden credentials. IEEE Trans Comput 55(10):1259–1270

  15. Gravano L, Ipeirotis PG, Jagadish HV, Koudas N, Muthukrishnan S, Srivastava D et al (2001) Approximate string joins in a database (almost) for free. In: VLDB, vol 1. pp 491–500

  16. Haddad M, Stevovic J, Chiasera A, Velegrakis Y, Hacid MS (2014) Access control for data integration in presence of data dependencies. In: International conference on database systems for advanced applications. Springer, pp 203–217

  17. Halevy AY (2001) Answering queries using views: a survey. VLDB J 10(4):270–294

  18. Hu Y, Kumar S, Popa RA (2020) Ghostor: toward a secure data-sharing system from decentralized trust. In: 17th symposium on networked systems design and implementation, pp 851–877

  19. Inan A, Kantarcioglu M, Ghinita G, Bertino E (2010) Private record matching using differential privacy. In: Proceedings of the 13th international conference on extending database technology. ACM, pp 123–134

  20. Kaushik R, Ramamurthy R (2011) Efficient auditing for complex sql queries. In: Proceedings of the 2011 ACM SIGMOD international conference on management of data, SIGMOD ’11. pp 697–708

  21. Kementsietsidis A, Arenas M (2004) Data sharing through query translation in autonomous sources. In: Proceedings of the thirtieth international conference on very large data bases-volume 30. pp 468–479

  22. Kementsietsidis A, Arenas M, Miller RJ (2003) Mapping data in peer-to-peer systems: Semantics and algorithmic issues. In: Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pp 325–336

  23. Kimmig A, Memory A, Getoor L, et al. (2017) A collective, probabilistic approach to schema mapping. In: 2017 IEEE 33rd international conference on data engineering (ICDE). pp 921–932

  24. Labrinidis A, Jagadish HV (2012) Challenges and opportunities with big data. Proc VLDB Endow 5(12):2032–2033

  25. Lenzerini M (2002) Data integration: a theoretical perspective. In: Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. ACM, pp 233–246

  26. Li J, Zhang Y, Chen X, Xiang Y (2018) Secure attribute-based data sharing for resource-limited users in cloud computing. Comput Secur 72:1–12

  27. Liu J, Fan W (2011) Polymorphic queries for p2p systems. Inf Syst 36(5):825–842

  28. Lu X, Pan Z, Xian H (2020) An efficient and secure data sharing scheme for mobile devices in cloud computing. J Cloud Comput 9(1):1–13

  29. Ng WS, Ooi BC, Tan KL, Zhou A (2003) Peerdb: a p2p-based system for distributed data sharing. In: Proceedings 19th international conference on data engineering (Cat. No. 03CH37405). IEEE, pp 633–644

  30. Norman TL (2012) Electronic Access Control. Newton, MA, USA

  31. Ortega-Binderberger M, Chakrabarti K, Mehrotra S (2002) An approach to integrating query refinement in sql. In: International conference on extending database technology. Springer, pp 15–33

  32. Reddy PM, Manjula S, Venugopal K (2017) Secure data sharing in cloud computing: a comprehensive review. Int J Comput (IJC) 25(1):80–115

  33. Rezaeibagha F, Mu Y (2016) Distributed clinical data sharing via dynamic access-control policy transformation. Int J Med Inf 89:25–31

  34. Scannapieco M, Figotin I, Bertino E, Elmagarmid AK (2007) Privacy preserving schema and data matching. In: Proceedings of the 2007 ACM SIGMOD international conference on Management of data. ACM, pp 653–664

  35. Singh R, Meduri VV, Elmagarmid A, Madden S, Papotti P, Quiané-Ruiz JA, Solar-Lezama A, Tang N (2017) Synthesizing entity matching rules by examples. Proc VLDB Endow 11(2):189–202

  36. Sweeney L (2002) k-anonymity: a model for protecting privacy. Int J Uncertain Fuzziness Knowl-Based Syst 10(05):557–570

  37. Vatsalan D, Sehili Z, Christen P, Rahm E (2017) Privacy-preserving record linkage for big data: current approaches and research challenges. In: Handbook of big data technologies, pp 851–895

  38. di Vimercati SDC, Foresti S, Livraga G, Paraboschi S, Samarati P (2018) Confidentiality protection in large databases. pp 457–472

  39. Wang J, Li G, Yu JX, Feng J (2011) Entity matching: how similar is similar. Proc VLDB Endow 4:622–633

  40. Wu X, Zhu X, Wu GQ, Ding W (2014) Data mining with big data. IEEE Trans Knowl Data Eng 26(1):97–107

Author information

Corresponding author

Correspondence to Juba Agoun.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research was performed within the scope of the DataCert (Coq deep specification of privacy aware data integration) project, which is funded by ANR (Agence Nationale de la Recherche) under Grant ANR-15-CE39-0009 (http://datacert.lri.fr/).

About this article

Cite this article

Agoun, J., Hacid, MS. Access control based on entity matching for secure data sharing. SOCA 16, 31–44 (2022). https://doi.org/10.1007/s11761-021-00331-3

  • DOI: https://doi.org/10.1007/s11761-021-00331-3
