ABSTRACT
Differential privacy provides powerful guarantees that individuals incur minimal additional risk by including their personal data in a database. Most work in differential privacy has focused on differentially private algorithms that produce models, counts, and histograms. However, even when a classification model is produced by a differentially private algorithm, directly reporting the classifier's performance on a database risks disclosure. Thus, differentially private computation of evaluation metrics for machine learning is an important research area. We find effective mechanisms for the area under the receiver operating characteristic (ROC) curve and average precision.
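The general pattern behind such mechanisms can be illustrated with a minimal sketch: compute the exact metric, then add noise calibrated to the metric's sensitivity before release. The sketch below is not the paper's mechanism; it assumes the standard Laplace mechanism and a hypothetical sensitivity bound of 1/min(n_pos, n_neg) for AUC, which would need to be justified for whatever neighboring-database definition is in use.

```python
import numpy as np

def auc(scores, labels):
    """Exact area under the ROC curve via the Mann-Whitney U statistic."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # Fraction of (positive, negative) pairs ranked correctly; ties count 1/2.
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))

def private_auc(scores, labels, epsilon, rng=None):
    """Release AUC under the Laplace mechanism with an ASSUMED sensitivity bound."""
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels, dtype=int)
    n_pos = int((labels == 1).sum())
    n_neg = int((labels == 0).sum())
    # Assumed (not derived here) bound on how much one record can change AUC.
    sensitivity = 1.0 / min(n_pos, n_neg)
    noisy = auc(scores, labels) + rng.laplace(scale=sensitivity / epsilon)
    return float(np.clip(noisy, 0.0, 1.0))  # true AUC always lies in [0, 1]
```

Note that clipping to [0, 1] is a post-processing step and so does not weaken the privacy guarantee, though it does bias the released estimate near the boundaries.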
Index Terms
- Differential Privacy for Classifier Evaluation