Empirical Exploration of Open-Source Issues for Predicting Privacy Compliance

  • Conference paper
Advances in Conceptual Modeling (ER 2023)

Abstract

In the last decade, privacy has gained significant interest in software and information systems engineering, mainly due to the emergence of privacy regulations such as the General Data Protection Regulation (GDPR). However, checking privacy compliance is challenging and depends on many factors, such as the programming language, the software architecture, and the underlying regulation. In this exploratory research, we study whether positive discussions of privacy-related issues in Open-Source Software (OSS) environments can predict the privacy compliance of the software. Such predictions are beneficial in several scenarios, including software reuse. Our main contribution will lie in conceptually modeling and understanding the relations between privacy compliance and positive discussions of privacy-related OSS issues. The research comprises three parts: (1) identifying privacy-related issues using supervised machine learning techniques; (2) improving the identification of privacy-related issues using ontologies; and (3) identifying the sentiment of privacy-related issues and analyzing its relations to privacy compliance. This paper describes the design and results of part 1, as well as the design of parts 2 and 3.
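As a rough illustration of part 1, the sketch below trains a small supervised text classifier that labels issue titles as privacy-related or not. It is not the method used in the paper: the abstract does not name a classifier, so a multinomial Naive Bayes model is swapped in here for brevity. The class name `NaiveBayesIssueClassifier`, the labels `"privacy"` and `"other"`, and the six training titles are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesIssueClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    Hypothetical helper for illustration; not the classifier from the paper.
    """

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.label_counts = Counter()            # label -> number of issues
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.label_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.label_counts:
            # Log prior of the label plus smoothed log likelihood of each token.
            score = math.log(self.label_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data: issue titles with manually assigned labels.
TRAIN = [
    ("Add consent dialog before collecting personal data", "privacy"),
    ("Delete user data on account removal for GDPR", "privacy"),
    ("Encrypt stored email addresses and personal data", "privacy"),
    ("Button alignment broken on mobile layout", "other"),
    ("Crash when opening settings page", "other"),
    ("Improve build time of the test suite", "other"),
]

clf = NaiveBayesIssueClassifier().fit(
    [title for title, _ in TRAIN],
    [label for _, label in TRAIN],
)

if __name__ == "__main__":
    print(clf.predict("Ask for user consent before storing personal data"))
    print(clf.predict("Layout crash on mobile settings page"))
```

A realistic setup would train on a labeled issue corpus and evaluate on held-out data; the toy data here only shows the shape of the task.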

Notes

  1. https://gdpr.eu

  2. https://unctad.org/page/data-protection-and-privacy-legislation-worldwide

  3. The dataset is available at https://zenodo.org/record/8351237.

Corresponding author

Correspondence to Jenny Guber.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Guber, J., Reinhartz-Berger, I., Litvak, M. (2023). Empirical Exploration of Open-Source Issues for Predicting Privacy Compliance. In: Sales, T.P., Araújo, J., Borbinha, J., Guizzardi, G. (eds) Advances in Conceptual Modeling. ER 2023. Lecture Notes in Computer Science, vol 14319. Springer, Cham. https://doi.org/10.1007/978-3-031-47112-4_6

  • DOI: https://doi.org/10.1007/978-3-031-47112-4_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-47111-7

  • Online ISBN: 978-3-031-47112-4

  • eBook Packages: Computer Science, Computer Science (R0)
