A SPARQL-based framework to preserve privacy of sensitive data on the semantic web

  • Original Research Paper
Service Oriented Computing and Applications

Abstract

Over the last few years, the web of data has evolved. It enables the sharing and interlinking of huge amounts of data across several domains, and it keeps growing. Because some data are confidential, sectors such as health, finance, and government participate only to a limited extent and publish less data. Thus, to develop the web of data and make it more trustworthy, we have to take into account the confidentiality, sensitivity, and utility of data. In this paper, we propose a framework for preserving confidentiality while sharing linked data. Our approach provides the means to specify privacy policies and to protect sensitive data in RDF triples. Applying a policy to the graph replaces the sensitive data with their encrypted form, which balances confidentiality against the utility of the data. We evaluated the performance of the proposed solution on benchmarks of different sizes, showing how it preserves the privacy of sensitive data and how hard the encrypted data are to decrypt. The results show the effectiveness of the framework.
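As an illustration only, the following minimal Python sketch shows the general idea of applying a privacy policy to RDF-like triples and replacing the sensitive values with an encrypted form. It is not the framework described in the paper: the toy graph, the `ex:` names, the policy format (a set of sensitive predicates), and all function names are assumptions made for this example, and the HMAC-based keystream is a toy stand-in for a real cipher, not a secure construction.

```python
import base64
import hashlib
import hmac

# Hypothetical toy graph: each triple is (subject, predicate, object).
GRAPH = [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:diagnosis", "asthma"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:bob", "ex:diagnosis", "diabetes"),
]

# Hypothetical policy: the objects of these predicates are sensitive.
SENSITIVE_PREDICATES = {"ex:diagnosis"}

# Demo key only; a real deployment would manage keys securely.
KEY = b"demo-key-not-for-production"


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` bytes from HMAC-SHA256 run in counter mode (toy)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(4, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt_literal(key: bytes, subject: str, predicate: str, value: str) -> str:
    """XOR the value with a keystream bound to (subject, predicate)."""
    data = value.encode("utf-8")
    stream = _keystream(key, f"{subject}|{predicate}".encode("utf-8"), len(data))
    cipher = bytes(a ^ b for a, b in zip(data, stream))
    return "enc:" + base64.urlsafe_b64encode(cipher).decode("ascii")


def decrypt_literal(key: bytes, subject: str, predicate: str, token: str) -> str:
    """Invert encrypt_literal for a holder of the same key."""
    cipher = base64.urlsafe_b64decode(token[len("enc:"):])
    stream = _keystream(key, f"{subject}|{predicate}".encode("utf-8"), len(cipher))
    return bytes(a ^ b for a, b in zip(cipher, stream)).decode("utf-8")


def apply_policy(graph, sensitive_predicates, key):
    """Return a copy of the graph with sensitive objects encrypted."""
    protected = []
    for s, p, o in graph:
        if p in sensitive_predicates:
            o = encrypt_literal(key, s, p, o)
        protected.append((s, p, o))
    return protected


protected = apply_policy(GRAPH, SENSITIVE_PREDICATES, KEY)
for triple in protected:
    print(triple)
```

In this sketch the graph keeps its shape (same subjects and predicates), so non-sensitive triples remain usable, while only a holder of the key can recover the sensitive objects with `decrypt_literal`.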

Author information

Corresponding author

Correspondence to Fethi Imad Benaribi.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Benaribi, F.I., Malki, M., Faraoun, K.M. et al. A SPARQL-based framework to preserve privacy of sensitive data on the semantic web. SOCA 17, 183–199 (2023). https://doi.org/10.1007/s11761-023-00368-6

  • DOI: https://doi.org/10.1007/s11761-023-00368-6
