Dataset Anonimyzation for Machine Learning: An ISP Case Study

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12950)

Abstract

Internet Service Providers' technical support teams need personal data to predict potential anomalies. In this paper, we present a comparative study of forecasting performance on raw and anonymized data, in order to assess how much performance varies when plain personal data are replaced by anonymized personal data.

Notes

  1. www.flyber.it.

  2. https://mikrotik.com/thedude, [Online; accessed 02-May-2021].

Corresponding author

Correspondence to Carlo Sanghez.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Campanile, L., Forgione, F., Marulli, F., Palmiero, G., Sanghez, C. (2021). Dataset Anonimyzation for Machine Learning: An ISP Case Study. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2021. ICCSA 2021. Lecture Notes in Computer Science, vol 12950. Springer, Cham. https://doi.org/10.1007/978-3-030-86960-1_42

  • DOI: https://doi.org/10.1007/978-3-030-86960-1_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86959-5

  • Online ISBN: 978-3-030-86960-1

  • eBook Packages: Computer Science, Computer Science (R0)
