Privacy preserving and fast decision for novelty detection using support vector data description

  • Methodologies and Application
Soft Computing

Abstract

Support vector data description (SVDD) has been widely used in novelty detection applications. Because the decision function of SVDD is expressed through the support vectors, which contain sensitive information, the support vectors are disclosed whenever SVDD is used to classify unknown samples, and privacy concerns arise accordingly. In addition, SVDD does not scale well to large datasets, because its decision complexity is linear in the size of the training dataset (more precisely, in the number of support vectors). Our work is distinguished in two aspects. First, by decomposing the kernel mapping space into three subspaces and exploring the pre-image of the center of SVDD’s sphere in the original space, we derive a fast decision approach for SVDD, called FDA-SVDD, which has three implementation versions: FDA-SVDD-I, FDA-SVDD-II and FDA-SVDD-III. The decision complexity of the proposed method is reduced to only \(O(1)\). Second, as the decision function of FDA-SVDD refers only to the pre-image of the sphere center, the privacy of the support vectors can be preserved. Therefore, the proposed FDA-SVDD is particularly attractive in privacy-preserving novelty detection applications. Empirical analysis conducted on UCI and USPS datasets demonstrates the effectiveness of the proposed approach and verifies the derived theoretical results.
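
To make the contrast in the abstract concrete, the following is a minimal sketch, not the FDA-SVDD algorithm itself. It assumes a Gaussian kernel, uses uniform multipliers `alpha` in place of a solved SVDD, and approximates the pre-image of the sphere center with the classical fixed-point iteration for Gaussian kernels, which is a stand-in for whatever construction the paper uses. All function names and the toy data are hypothetical. The standard decision needs one kernel evaluation per support vector, while the pre-image-based distance needs a single kernel evaluation and never touches the support vectors at decision time. The two distances generally differ, because the center usually has no exact pre-image.

```python
import numpy as np

def rbf(a, b, gamma):
    """Gaussian kernel matrix k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def svdd_distance2(z, sv, alpha, gamma):
    """Standard SVDD squared distance to the center: one kernel call per support vector."""
    k_zs = rbf(z[None, :], sv, gamma)[0]
    # k(z, z) = 1 for the Gaussian kernel.
    return 1.0 - 2.0 * alpha @ k_zs + alpha @ rbf(sv, sv, gamma) @ alpha

def preimage_fixed_point(sv, alpha, gamma, iters=100):
    """Approximate pre-image of the center (assumed stand-in, not the paper's method)."""
    z = alpha @ sv  # start from the alpha-weighted mean of the support vectors
    for _ in range(iters):
        w = alpha * rbf(z[None, :], sv, gamma)[0]
        z = (w @ sv) / w.sum()
    return z

def fast_distance2(z, c_hat, gamma):
    """Squared feature-space distance between phi(z) and phi(c_hat): one kernel call."""
    return 2.0 - 2.0 * rbf(z[None, :], c_hat[None, :], gamma)[0, 0]

# Toy data: hypothetical support vectors with uniform multipliers summing to 1.
rng = np.random.default_rng(0)
sv = rng.normal(size=(20, 2))
alpha = np.full(20, 1.0 / 20)
gamma = 0.5

c_hat = preimage_fixed_point(sv, alpha, gamma)  # computed once, offline
far = np.array([10.0, 10.0])                    # an obviously novel sample
print(svdd_distance2(far, sv, alpha, gamma), fast_distance2(far, c_hat, gamma))
```

In this sketch only `c_hat` and a distance threshold would need to be shipped to the party making decisions, which is the privacy argument the abstract makes for referring to the pre-image alone.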

Notes

  1. USPS dataset can be downloaded from http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/.

Acknowledgments

This work was supported in part by the Hong Kong Polytechnic University under Grant G-UA68, by the National Natural Science Foundation of China under Grants 61170122, 61170029, 61272210, 61202311, 61370173, by the Natural Science Foundation of Jiangsu Province under Grants BK2011003, BK2011417, by Jiangsu 333 expert engineering Grant BRA2011142 and by 2011, 2012 Postgraduate Student’s Creative Research Fund of Jiangsu Province, the Natural Science Foundation of Zhejiang Province under Grants LY13F020011, LY14F010010, LY14F020009, and R1090244, and Independent Design Project of Zhejiang Province Key Technological Innovation Team under Grant 2011R09014-05.

Author information

Corresponding author

Correspondence to Shitong Wang.

Additional information

Communicated by V. Loia.

Appendix

Proof of Theorem 4:

According to Eq. (16) and Eq. (19), we have

$$\begin{aligned} ISE({\varvec{w}})&\approx \sum \limits _{{\varvec{x}}_i \in \chi _{in} \cup \chi _{on}} \sum \limits _{{\varvec{x}}_j \in \chi _{in} \cup \chi _{on}} w_i w_j k({\varvec{x}}_i ,{\varvec{x}}_j )\nonumber \\&\quad -2\sum \limits _{{\varvec{x}}_i \in \chi _{in} \cup \chi _{on}} {w_i \sum \limits _{{\varvec{x}}_j \in SV} {\alpha _j k({\varvec{x}}_i ,{\varvec{x}}_j )} }\nonumber \\&\quad +\sum \limits _{{\varvec{x}}_i \in SV} {\sum \limits _{{\varvec{x}}_j \in SV} {\alpha _i \alpha _j k({\varvec{x}}_i ,{\varvec{x}}_j )} } \end{aligned}$$
(23)
$$\begin{aligned} w_i =\frac{\sum \nolimits _{{\varvec{x}}_j \in SV} {\alpha _j k({\varvec{x}}_i ,{\varvec{x}}_j )} }{\sum \nolimits _{{\varvec{x}}_j \in \chi _{in} \cup \chi _{on} } {k({\varvec{x}}_i ,{\varvec{x}}_j )} }. \end{aligned}$$
(24)

Then,

$$\begin{aligned}&\frac{\partial ISE({\varvec{w}})}{\partial w_k }=2\varphi ({\varvec{x}}_k )^T\sum \limits _{{\varvec{x}}_i \in \chi _{in} \cup \chi _{on} } {w_i \varphi ({\varvec{x}}_i )} -2\varphi ({\varvec{x}}_k )^T\nonumber \\&\quad \times \sum \limits _{{\varvec{x}}_j \in SV} {\alpha _j \varphi ({\varvec{x}}_j )}\nonumber \\&=2\varphi ({\varvec{x}}_k )^T\left( {\sum \limits _{{\varvec{x}}_i \in \chi _{in} \cup \chi _{on} } {w_i \varphi ({\varvec{x}}_i )} -\sum \limits _{{\varvec{x}}_j \in SV} {\alpha _j \varphi ({\varvec{x}}_j )} }\right) . \nonumber \\ \end{aligned}$$
(25)

Substituting Eq. (24) into Eq. (25) gives

$$\begin{aligned}&\frac{\partial ISE({\varvec{w}})}{\partial w_k }\nonumber \\
&\quad =2\varphi ({\varvec{x}}_k )^T\left( {\sum \limits _{{\varvec{x}}_i \in \chi _{in} \cup \chi _{on} } {\frac{\sum \nolimits _{{\varvec{x}}_j \in SV} {\alpha _j k({\varvec{x}}_i ,{\varvec{x}}_j )} }{\sum \nolimits _{{\varvec{x}}_j \in \chi _{in} \cup \chi _{on} } {k({\varvec{x}}_i ,{\varvec{x}}_j )} }\varphi ({\varvec{x}}_i )} -\sum \limits _{{\varvec{x}}_j \in SV} {\alpha _j \varphi ({\varvec{x}}_j )} }\right) \nonumber \\
&\quad =2\varphi ({\varvec{x}}_k )^T\left( {\sum \limits _{{\varvec{x}}_i \in \chi _{in} \cup \chi _{on} } {\frac{\sum \nolimits _{{\varvec{x}}_j \in SV} {\alpha _j \varphi ({\varvec{x}}_i )\varphi ({\varvec{x}}_j )} }{\sum \nolimits _{{\varvec{x}}_j \in \chi _{in} \cup \chi _{on} } {\varphi ({\varvec{x}}_i )\varphi ({\varvec{x}}_j )} }\varphi ({\varvec{x}}_i )} -\sum \limits _{{\varvec{x}}_j \in SV} {\alpha _j \varphi ({\varvec{x}}_j )} }\right) \nonumber \\
&\quad =2\varphi ({\varvec{x}}_k )^T\left( {\sum \limits _{{\varvec{x}}_i \in \chi _{in} \cup \chi _{on} } {\frac{\sum \nolimits _{{\varvec{x}}_j \in SV} {\alpha _j \varphi ({\varvec{x}}_i )\varphi ({\varvec{x}}_j )} }{\sum \nolimits _{{\varvec{x}}_j \in \chi _{in} \cup \chi _{on} } {\varphi ({\varvec{x}}_j )} }} -\sum \limits _{{\varvec{x}}_j \in SV} {\alpha _j \varphi ({\varvec{x}}_j )} }\right) \nonumber \\
&\quad =2\varphi ({\varvec{x}}_k )^T\left( {\frac{\sum \nolimits _{{\varvec{x}}_i \in \chi _{in} \cup \chi _{on} } {\sum \nolimits _{{\varvec{x}}_j \in SV} {\alpha _j \varphi ({\varvec{x}}_i )\varphi ({\varvec{x}}_j )} } }{\sum \nolimits _{{\varvec{x}}_j \in \chi _{in} \cup \chi _{on} } {\varphi ({\varvec{x}}_j )} }-\sum \limits _{{\varvec{x}}_j \in SV} {\alpha _j \varphi ({\varvec{x}}_j )} }\right) \nonumber \\
&\quad =2\varphi ({\varvec{x}}_k )^T\left( {\frac{\sum \nolimits _{{\varvec{x}}_i \in \chi _{in} \cup \chi _{on} } {\varphi ({\varvec{x}}_i )} }{\sum \nolimits _{{\varvec{x}}_j \in \chi _{in} \cup \chi _{on} } {\varphi ({\varvec{x}}_j )} }\sum \limits _{{\varvec{x}}_j \in SV} {\alpha _j \varphi ({\varvec{x}}_j )} -\sum \limits _{{\varvec{x}}_j \in SV} {\alpha _j \varphi ({\varvec{x}}_j )} }\right) \nonumber \\
&\quad =2\left( {\sum \limits _{{\varvec{x}}_j \in SV} {\alpha _j k({\varvec{x}}_j ,{\varvec{x}}_k )} -\sum \limits _{{\varvec{x}}_j \in SV} {\alpha _j k({\varvec{x}}_j ,{\varvec{x}}_k )} }\right) \nonumber \\
&\quad =0.\end{aligned}$$
(26)

Clearly, Theorem 4 holds.
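
The quantities in Eqs. (23) to (25) can be evaluated numerically with kernel matrices. The sketch below is an illustration under assumptions, not the authors' code: it uses a Gaussian kernel, random points standing in for \(\chi _{in} \cup \chi _{on}\) and for the support vectors, and hypothetical multipliers `alpha`. It evaluates the right-hand side of Eq. (23), checks the kernel form of the gradient in Eq. (25) against central finite differences at an arbitrary weight vector, and computes the weights of Eq. (24).

```python
import numpy as np

def rbf(a, b, gamma):
    """Gaussian kernel matrix k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def ise(w, K_xx, K_xs, K_ss, alpha):
    """Right-hand side of Eq. (23), written with kernel matrices."""
    return w @ K_xx @ w - 2.0 * w @ (K_xs @ alpha) + alpha @ K_ss @ alpha

def ise_grad(w, K_xx, K_xs, alpha):
    """Eq. (25) in kernel form: entry k is 2 * (sum_i w_i k(x_k, x_i) - sum_j alpha_j k(x_k, x_j))."""
    return 2.0 * (K_xx @ w - K_xs @ alpha)

def weights_eq24(K_xx, K_xs, alpha):
    """Eq. (24): w_i = sum_j alpha_j k(x_i, x_j) / sum_j k(x_i, x_j)."""
    return (K_xs @ alpha) / K_xx.sum(axis=1)

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 2))        # hypothetical points of chi_in ∪ chi_on
S = rng.normal(size=(3, 2))        # hypothetical support vectors
alpha = np.array([0.5, 0.3, 0.2])  # hypothetical multipliers summing to 1
gamma = 0.7
K_xx, K_xs, K_ss = rbf(X, X, gamma), rbf(X, S, gamma), rbf(S, S, gamma)

# Finite-difference check of the gradient formula at an arbitrary w.
w = rng.uniform(size=8)
eps = 1e-6
numeric = np.array([
    (ise(w + eps * e, K_xx, K_xs, K_ss, alpha)
     - ise(w - eps * e, K_xx, K_xs, K_ss, alpha)) / (2.0 * eps)
    for e in np.eye(8)
])
analytic = ise_grad(w, K_xx, K_xs, alpha)
print(np.max(np.abs(numeric - analytic)))

# Weights from Eq. (24) and the ISE value they produce.
w24 = weights_eq24(K_xx, K_xs, alpha)
print(ise(w24, K_xx, K_xs, K_ss, alpha))
```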

Cite this article

Hu, W., Wang, S., Chung, Fl. et al. Privacy preserving and fast decision for novelty detection using support vector data description. Soft Comput 19, 1171–1186 (2015). https://doi.org/10.1007/s00500-014-1331-8
