Skip to main content
Log in

Secure support vector machines outsourcing with random linear transformation

  • Regular Paper
  • Published:
Knowledge and Information Systems Aims and scope Submit manuscript

Abstract

The support vector machine (SVM) is a popular classification algorithm. Since training the SVM is very time-consuming, outsourcing the computations of solving the SVM to external service providers benefits the data owner who is not familiar with the techniques of the SVM or has only limited computing resources. In outsourcing, the data privacy is a critical concern for some legal or commercial reasons since there may be sensitive information contained in the data. Existing privacy-preserving data mining works are either not appropriate to outsourcing the SVM or weak in security. In this paper, we propose a scheme for privacy-preserving outsourcing of the SVM, which perturbs the data by random linear transformation. The service provider solves the SVM from the perturbed data without knowing the actual content of the data, and the generated SVM classifier is also perturbed, which can only be recovered by the data owner. Both the privacy of data and generated classifiers are protected. The proposed scheme is stronger in security than existing techniques, and incurs very little additional communication and computation cost. The experimental results show that the proposed scheme imposes very little overhead on the data owner, and the classification accuracy is similar to a normal SVM classifier.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6

Similar content being viewed by others

Notes

  1. It is not necessary to put the whole matrix \(M\) in the main memory. The computation can be decomposed to \(M \mathbf {x}= x_1 M_{:,1} + \cdots + x_n M_{:,n} \), where \(M_{:,i}\) is the \(i\)-th column of \(M\).

References

  1. Standard for privacy of individually identifiable health information, 2001. [Online]. http://www.hhs.gov/ocr/privacy/index.html

  2. Aggarwal CC, Yu PS (2004) A condensation approach to privacy preserving data mining. In: Proceedings of the 9th international conference on extending database technology (EDBT)

  3. Agrawal D, Aggarwal CC (2001) On the design and quantification of privacy preserving data mining algorithms. In: Proceedings of the 20th ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems (PODS)

  4. Agrawal R, Kiernan J, Srikant R, Xu Y (2004) Order preserving encryption for numeric data. In: Proceedings of the 2004 ACM SIGMOD international conference on management of data (SIGMOD)

  5. Agrawal R, Srikant R (2000) Privacy preserving data mining. In: Proceedings of the 2000 ACM SIGMOD international conference on management of data (SIGMOD)

  6. Bache K, Lichman M (2013) UCI machine learning repository. http://archive.ics.uci.edu/ml

  7. Chang C-C, Lin C-J (2001) LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm

  8. Chen K, Liu L (2005) Privacy preserving data classification with rotation perturbation. In: Proceedings of the 5th IEEE international conference on data mining (ICDM)

  9. Chen K, Sun G, Liu L (2007) Towards attack-resilient geometric data perturbation. In: Proceedings of the 7th SIAM international conference on data mining (SDM)

  10. Chen M-S, Han J, Yu PS (1996) Data mining: an overview from database perspective. IEEE Trans Knowl Data Eng 8(6):866–883

    Article  Google Scholar 

  11. Evfimievski A, Srikant R, Agrawal R, Gehrke J (2002) Privacy preserving mining of association rules. In: Proceedings of the 8th ACM SIGKDD international conference on knowledge discovery and data mining (KDD)

  12. Gentry C (2009) Fully homomorphic encryption using ideal lattices. In: Proceedings of the 41st annual ACM symposium on theory of computing (STOC)

  13. Gentry C (2010) Computing arbitrary functions of encrypted data. Commun ACM 53(3):97–105

    Article  Google Scholar 

  14. Goldwasser S, Micali S (1984) Probabilistic encryption. J Comput Syst Sci 28(2):270–299

    Article  MATH  MathSciNet  Google Scholar 

  15. Hacıgümüş H, Iyer B, Li C, Mehrotra S (2002) Executing SQL over encrypted data in the database-service-provider model. In: Proceedings of the 2002 ACM SIGMOD international conference on management of data (SIGMOD)

  16. Hsu C-W, Chang C-C, Lin C-J (2003) A practical guide to support vector classification. Department of Computer Science, National Taiwan University. Technical report. http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf

  17. Inan A, Kantarcioglu M, Bertino E (2009) Using anonymized data for classification. In: Proceedings of the 25th IEEE international conference on data engineering (ICDE)

  18. Kantarcioglu M, Clifton C (2004) Privacy-preserving distributed mining of association rules on horizontally partitioned data. IEEE Trans Knowl Data Eng 16(9):1026–1037

    Article  Google Scholar 

  19. Laur S, Lipmaa H, Mielikäinen T (2006) Cryptographically private support vector machines. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining (KDD)

  20. Lee Y-J, Huang S-Y (2007) Reduced support vector machines: a statistical theory. IEEE Trans Neural Netw 18(1):1–13

    Article  Google Scholar 

  21. Lee Y-J, Mangasarian OL (2001) RSVM: reduced support vector machines. In: Proceedings of the 1st SIAM international conference on data mining (SDM)

  22. Lee Y-J, Mangasarian OL (2001) SSVM: a smooth support vector machine for classification. Comput Optim Appl 20(1):5–22

    Article  MATH  MathSciNet  Google Scholar 

  23. Lin K-M, Lin C-J (2003) A study on reduced support vector machines. IEEE Trans Neural Netw 14(6):1449–1459

    Article  Google Scholar 

  24. Lin K-P, Chen M-S (2008) Releasing the SVM classifier with privacy-preservation. In: Proceedings of the 8th IEEE international conference on data mining (ICDM)

  25. Lindell Y, Pinkas B (2002) Privacy preserving data mining. J Cryptol 15:177–206

    Article  MATH  MathSciNet  Google Scholar 

  26. Machanavajjhala A, Gehrke J, Kifer D, Venkitasubramaniam M (2006) \(l\)-Diversity: privacy beyond \(k\)-anonymity. In: Proceedings of the 22nd IEEE international conference on data engineering (ICDE), pp 24–24

  27. Mangasarian OL, Wild EW, Fung GM (2008) Privacy-preserving classification of vertically partitioned data via random kernels. ACM Trans Knowl Discov Data 2(3):12:1–12:16

  28. Mangasarian OL, Wild T (2007) Privacy-preserving classification of horizontally partitioned data via random kernels. 07-03, Data Mining Institute, Computer Sciences Department, University of Wisconsin–Madison, technical report

  29. Paillier P (1999) Public-key cryptosystems based on composite degree residuosity classes. In: Advances in cryptography—EUROCRYPT’99, vol 1592 of lecture notes in computer science. Springer, Berlin, pp 223–238

  30. Pinkas B (2002) Cryptographic techniques for privacy-preserving data mining. ACM SIGKDD Explor Newsl 4(2):12–19

    Article  Google Scholar 

  31. Platt JC (1998) Fast training of support vector machines using sequential minimal optimization. In: Schölkopf B, Burges CJC, Mika S (eds) Advances in kernel methods: support vector learning. MIT Press, Cambridge, pp 185–208

  32. Prokhorov D (2001) IJCNN 2001 neural network competition. Slide presentation in IJCNN’01, Ford Research Laboratory, technical report

  33. Samarati P (2001) Protecting respondents’ identities in microdata release. IEEE Trans Knowl Data Eng 13(6):1010–1027

    Article  Google Scholar 

  34. Schölkopf B, Smola AJ (2002) Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT Press, Cambridge

    Google Scholar 

  35. Sweeney L (2002) Achieving \(k\)-anonymity privacy protection using generalization and suppression. Int J Uncertain Fuzziness Knowl Based Syst 10(5):571–588

    Article  MATH  MathSciNet  Google Scholar 

  36. Sweeney L (2002) \(k\)-Anonymity: a model for protecting privacy. Int J Uncertain Fuzziness Knowl Based Syst 10(5):557–570

    Article  MATH  MathSciNet  Google Scholar 

  37. Vaidya J, Clifton C (2002) Privacy-preserving association rule mining in vertically partitioned data. In: Proceedings of the 8th ACM SIGKDD international conference on knowledge discovery and data mining (KDD)

  38. Vaidya J, Yu H, Jiang X (2008) Privacy-preserving SVM classification. Knowl Inf Syst 14:161–178

    Article  Google Scholar 

  39. Vapnik VN (1998) Statistical learning theory. Wiley, Hoboken

    MATH  Google Scholar 

  40. Wong WK, Cheung DW, Hung E, Kao B, Mamoulis N (2007) Security in outsourcing of association rule mining. In: Proceedings of the 33rd international conference on very large data bases (VLDB)

  41. Wong WK, Cheung DW, Kao B, Mamoulis N (2009) Secure kNN computation on encrypted databases. In: Proceedings of the 35th SIGMOD international conference on management of data (SIGMOD)

  42. Yu H, Jiang X, Vaidya J (2006) Privacy-preserving SVM using nonlinear kernels on horizontally partitioned data. In: Proceedings of the 2006 ACM symposium on applied computing (SAC)

  43. Yu H, Vaidya J, Jiang X (2006) Privacy-preserving SVM classification on vertically partitioned data. In: Proceedings of the 10th Pacific-Asia conference on knowledge discovery and data mining (PAKDD)

Download references

Acknowledgments

This work was supported in part by the National Science Council, Taiwan, under NSC 101-2410-H-110-081-MY2.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Keng-Pei Lin.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Lin, KP., Chang, YW. & Chen, MS. Secure support vector machines outsourcing with random linear transformation. Knowl Inf Syst 44, 147–176 (2015). https://doi.org/10.1007/s10115-014-0751-1

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10115-014-0751-1

Keywords

Navigation