\(\mathsf {EMA}\): Auditing Data Removal from Trained Models

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2021 (MICCAI 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12905)

Abstract

Data auditing is a process to verify whether certain data have been removed from a trained model. A recently proposed method [10] uses Kolmogorov-Smirnov (KS) distance for such data auditing. However, it fails under certain practical conditions. In this paper, we propose a new method called Ensembled Membership Auditing (\(\mathsf {EMA}\)) for auditing data removal to overcome these limitations. We compare both methods using benchmark datasets (MNIST and SVHN) and Chest X-ray datasets with multi-layer perceptrons (MLP) and convolutional neural networks (CNN). Our experiments show that \(\mathsf {EMA}\) is robust under various conditions, including the failure cases of the previously proposed method. Our code is available at: https://github.com/Hazelsuko07/EMA.
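For readers unfamiliar with the KS distance mentioned above, the following is a minimal sketch of the two-sample Kolmogorov-Smirnov statistic, i.e. the largest gap between two empirical CDFs. It illustrates the statistic only; it is not the auditing procedure of [10] nor of \(\mathsf {EMA}\), neither of which is described on this page. The function name `ks_distance` and the score lists are hypothetical.

```python
from bisect import bisect_right


def ks_distance(a, b):
    """Two-sample KS statistic: maximum absolute gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    # The gap can only change at observed values, so evaluating the two
    # empirical CDFs at every observed value is sufficient.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


# Hypothetical per-sample scores (for illustration only, not from the paper).
query_scores = [1, 2, 3, 4]
reference_scores = [3, 4, 5, 6]
distance = ks_distance(query_scores, reference_scores)
```

A value near 0 means the two score distributions are nearly indistinguishable, while a value near 1 means they barely overlap. How such a distance is turned into an auditing decision is a design choice of the specific method.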


Notes

  1. The real medical data experiment follows a similar setting.

References

  1. Health Insurance Portability and Accountability Act of 1996. Public Law 104-191 (1996)

  2. Bourtoule, L., et al.: Machine unlearning. arXiv preprint arXiv:1912.03817 (2019)

  3. Carlini, N., Liu, C., Erlingsson, Ú., Kos, J., Song, D.: The secret sharer: evaluating and testing unintended memorization in neural networks. In: 28th USENIX Security Symposium (USENIX Security 19), Santa Clara, CA, pp. 267–284. USENIX Association, August 2019. https://www.usenix.org/conference/usenixsecurity19/presentation/carlini

  4. Guo, C., Goldstein, T., Hannun, A., Maaten, L.v.d.: Certified data removal from machine learning models. arXiv preprint arXiv:1911.03030 (2019)

  5. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  6. Kermany, D., Zhang, K., Goldbaum, M., et al.: Labeled optical coherence tomography (OCT) and chest X-ray images for classification. Mendeley Data 2(2) (2018)

  7. Kermany, D.S., et al.: Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell 172(5), 1122–1131 (2018)

  8. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

  9. LeCun, Y., Cortes, C.: MNIST handwritten digit database (2010). http://yann.lecun.com/exdb/mnist/

  10. Liu, X., Tsaftaris, S.A.: Have you forgotten? A method to assess if machine learning models have forgotten data. In: Martel, A.L., et al. (eds.) MICCAI 2020. LNCS, vol. 12261, pp. 95–105. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59710-8_10

  11. Nasr, M., Shokri, R., Houmansadr, A.: Machine learning with membership privacy using adversarial regularization. In: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pp. 634–646 (2018)

  12. Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning (2011)

  13. Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. arXiv preprint arXiv:1912.01703 (2019)

  14. Ruder, S.: An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747 (2016)

  15. Salem, A., Zhang, Y., Humbert, M., Berrang, P., Fritz, M., Backes, M.: ML-leaks: model and data independent membership inference attacks and defenses on machine learning models. arXiv preprint arXiv:1806.01246 (2018)

  16. Shokri, R., Stronati, M., Song, C., Shmatikov, V.: Membership inference attacks against machine learning models. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 3–18. IEEE (2017)

  17. Song, L., Mittal, P.: Systematic evaluation of privacy risks of machine learning models. arXiv preprint arXiv:2003.10595 (2020)

  18. Voigt, P., Von dem Bussche, A.: The EU general data protection regulation (GDPR). Intersoft consulting (2018)

  19. Wang, L., Lin, Z.Q., Wong, A.: COVID-net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images. Sci. Rep. 10(1), 1–12 (2020)

  20. Zhang, Y., Jia, R., Pei, H., Wang, W., Li, B., Song, D.: The secret revealer: generative model-inversion attacks against deep neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 253–261 (2020)


Acknowledgement

This project is supported in part by a Princeton University fellowship and Amazon Web Services (AWS) Machine Learning Research Awards. The authors would like to thank Liwei Song and Dr. Quanzheng Li for helpful discussions.

Author information

Correspondence to Xiaoxiao Li or Kai Li.

Electronic supplementary material

Supplementary material 1 (PDF, 7780 KB)

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Huang, Y., Li, X., Li, K. (2021). \(\mathsf {EMA}\): Auditing Data Removal from Trained Models. In: de Bruijne, M., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2021. MICCAI 2021. Lecture Notes in Computer Science, vol 12905. Springer, Cham. https://doi.org/10.1007/978-3-030-87240-3_76

  • DOI: https://doi.org/10.1007/978-3-030-87240-3_76

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87239-7

  • Online ISBN: 978-3-030-87240-3

  • eBook Packages: Computer Science, Computer Science (R0)
