Privacy preserving perceptron learning in malicious model

Abstract

Privacy preserving data mining algorithms are proposed to protect the participating parties’ data privacy in data mining processes. So far, most of these algorithms only work in the semi-honest model that assumes all parties follow the algorithms honestly. In this paper, we propose two privacy preserving perceptron learning algorithms in the malicious model, for horizontally and vertically partitioned data sets, respectively. So far as we know, our algorithms are the first perceptron learning algorithms that can protect data privacy in the malicious model.
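
For reference, the following is a minimal sketch (in Python, with illustrative names) of the standard perceptron learning rule [27] that the privacy-preserving protocols compute jointly over partitioned data. This is the plain, non-private algorithm only, not the protocols themselves.

```python
import numpy as np

def train_perceptron(samples, labels, epochs=100, eta=1.0):
    """Plain (non-private) batch perceptron training, following Rosenblatt's rule.

    samples: (n, p) array of feature vectors; labels: array of +1/-1 class labels.
    A bias term is handled by augmenting each sample with a constant 1.
    """
    X = np.hstack([np.ones((samples.shape[0], 1)), samples])  # augment with bias
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        delta_w = np.zeros_like(w)
        for x, y in zip(X, labels):
            # a sample is misclassified when y * (w . x) <= 0
            if y * np.dot(w, x) <= 0:
                delta_w += eta * y * x     # accumulate the update for this pass
        if not delta_w.any():              # no misclassified samples: converged
            break
        w += delta_w                       # batch update of the weight vector
    return w
```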

Notes

  1. The word adversary is mainly used in the cryptography literature. Here, in our scenario, we use it to refer to a curious participant in the data mining process.

  2. Here we neglect the cut-and-choose step, because the cut-and-choose steps are relatively lightweight in the secure comparison algorithm. Also, as in [25], we assume that at least one in four generated candidates r is less than N.

  3. So far, the known optimal size complexity for a boolean circuit to compute multiplication is \(\Upomega(l\log l)\) [13].

  4. These do not include synchronization messages or challenge messages, since the values of these messages can be arbitrary numbers and changing them is meaningless.

References

  1. Agrawal R, Srikant R (2000) Privacy-preserving data mining. ACM SIGMOD Record 29

  2. Canetti R (2000) Security and composition of multiparty cryptographic protocols. J Cryptol 13(1):143–202

  3. Chen K, Liu L (2005) Privacy preserving data classification with rotation perturbation. In: Proceedings of the fifth IEEE international conference on data mining, IEEE Computer Society, pp 589–592

  4. Chen T, Zhong S (2009) Privacy-preserving backpropagation neural network learning. IEEE Trans Neural Netw 20(10):1554–1564

  5. Cramer R, Damgård I, Nielsen J (2001) Multiparty computation from threshold homomorphic encryption. In: Advances in cryptology, EUROCRYPT 2001. Springer

  6. Cristofaro ED, Kim J, Tsudik G (2010) Linear-complexity private set intersection protocols secure in malicious model. In: ASIACRYPT, Lecture Notes in Computer Science, vol 6477. Springer, pp 213–231

  7. Dai W (2010) Crypto++ library. http://www.cryptopp.com

  8. Damgård I, Jurik M (2000) Efficient protocols based on probabilistic encryption using composite degree residue classes. BRICS, Department of Computer Science, University of Aarhus, Aarhus

  9. Damgård I, Geisler M, Kroigard M (2008) Homomorphic encryption and secure comparison. Int J Appl Cryptogr 1:22–31

  10. Duda RO, Hart PE, Stork DG (2001) Pattern classification, vol 2. Wiley, New York

  11. Fouque PA, Poupard G, Stern J (2001) Sharing decryption in the context of voting or lotteries. In: Financial cryptography, Springer, pp 90–104

  12. Frank A, Asuncion A (2010) UCI machine learning repository. University of California, School of Information and Computer Science, Irvine. http://www.ics.uci.edu/mlearn/MLRepository.html

  13. Fürer M (2007) Faster integer multiplication. In: Proceedings of the thirty-ninth annual ACM symposium on theory of computing, ACM, pp 57–66

  14. Gilburd B, Schuster A, Wolff R (2004) Privacy-preserving data mining on data grids in the presence of malicious participants. In: HPDC. IEEE computer society, pp 225–234

  15. Goldreich O (2010) A short tutorial of zero-knowledge. http://www.wisdom.weizmann.ac.il/oded/PS/zk-tut10.ps

  16. Goldreich O, Micali S, Wigderson A (1987) How to play any mental game. In: Proceedings of the nineteenth annual ACM symposium on theory of computing, ACM, pp 218–229

  17. Goldwasser S, Micali S, Rackoff C (1985) The knowledge complexity of interactive proof-systems. In: Proceedings of the seventeenth annual ACM symposium on theory of computing, ACM, pp 291–304

  18. Heer GR (1993) A bootstrap procedure to preserve statistical confidentiality in contingency tables. In: Proceedings of the international seminar on statistical confidentiality, pp 261–271

  19. Kantarcioglu M, Kardes O (2009) Privacy-preserving data mining in the malicious model. Int J Inf Comput Secur 2:353–375

  20. Laur S, Lipmaa H, Mielikäinen T (2006) Cryptographically private support vector machines. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 618–624

  21. Lin X, Clifton C, Zhu M (2005) Privacy-preserving clustering with distributed EM mixture modeling. Knowl Inf Syst 8(1):68–81

  22. Lindell Y, Pinkas B (2007) An efficient protocol for secure two-party computation in the presence of malicious adversaries. In: Advances in cryptology, EUROCRYPT 2007, pp 52–78

  23. Lindell Y, Pinkas B (2000) Privacy preserving data mining. In: Bellare M (ed) CRYPTO, Lecture Notes in Computer Science, vol 1880. Springer, New York, pp 36–54

  24. Minsky M, Papert S (1969) Perceptrons: an introduction to computational geometry. MIT Press, Cambridge

  25. Nishide T, Ohta K (2007) Multiparty computation for interval, equality, and comparison without bit-decomposition protocol. In: Public key cryptography, PKC 2007, pp 343–360

  26. Paillier P (1999) Public-key cryptosystems based on composite degree residuosity classes. In: Advances in cryptology, EUROCRYPT '99. Springer, pp 223–238

  27. Rosenblatt F (1958) The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev 65:386–408

  28. Shah D, Zhong S (2007) Two methods for privacy preserving data mining with malicious participants. Inf Sci 177(23):5468–5483

  29. Vaidya J, Clifton C (2003) Privacy-preserving k-means clustering over vertically partitioned data. In: Getoor L, Senator TE, Domingos P, Faloutsos C (eds) KDD. ACM, New York, pp 206–215

  30. Vaidya J, Kantarcioglu M, Clifton C (2008) Privacy-preserving naive Bayes classification. VLDB J 17(4):879–898

Author information

Correspondence to Sheng Zhong.

Appendix: Security proofs

In this section, we give formal security proofs for our algorithms in the malicious model. In particular, we adopt the security definitions in [19], which take the security definitions in [5] and apply them to the two-party case. The security of a protocol is formally defined by comparing its execution in the real-life model, in which an active static adversary is present, with its execution in the ideal model, in which an incorruptible trusted party is also available.

Definition 1

The Real-Life Model (two-party case) [5] Let π be a two-party protocol and \(k\in\mathbb{N}\) be the security parameter. In the real-life model, each party \(P^i\,(i\in\{0,1\})\) has a secret input \(x^i_s\) and a public input \(x^i_p\). After executing the protocol, each party gets a private output \(y^i_s\) and returns a public output \(y^i_p\). Let A be an adversary that can corrupt one of the two parties and \(C\in\{0,1\}\) be the index of the corrupted party. A receives the public inputs and outputs of all parties.

Let \(\overrightarrow{x}=(x^0_s,x^0_p,x^1_s,x^1_p)\) be the two parties' input, \(\overrightarrow{r} = (r^0,r^1,r^A)\) be the random inputs of \(P^0\), \(P^1\), and A, and \(a\in{\{0,1\}}^{\ast}\) be A's auxiliary input. After the execution of π in the real-life model with given input \(\overrightarrow{x}\) and under the attack of A, denote by \({ADVR}_{\pi,A}(k,\overrightarrow{x},C,a,\overrightarrow{r})\) and \({EXEC}^i_{\pi,A}(k,\overrightarrow{x},C,a,\overrightarrow{r})\) the output of the adversary and the output of party \(P^i\), respectively. Let

$$ \begin{aligned} EXEC_{{\pi ,A}} (k,\vec{x},C,a,\vec{r}) & = (ADVR_{{\pi ,A}} (k,\vec{x},C,a,\vec{r}), \\ & \quad EXEC_{{\pi ,A}}^{0} (k,\vec{x},C,a,\vec{r}), \\ & \quad EXEC_{{\pi ,A}}^{1} (k,\vec{x},C,a,\vec{r})) \\ \end{aligned} $$

and denote by \({EXEC}_{\pi,A}(k,\overrightarrow{x},C,a)\) the random variable \({EXEC}_{\pi,A}(k,\overrightarrow{x},C,a,\overrightarrow{r})\) when \(\overrightarrow{r}\) is uniformly chosen.

Finally, we define a distribution ensemble EXEC π,A with security parameter k and index \((\overrightarrow{x},C,a)\) by

$$ EXEC_{\pi,A} = {\{EXEC_{\pi,A}(k,\overrightarrow{x},C,a)\}}_{k\in{\mathbb{N}},\overrightarrow{x}\in{({\{0,1\}}^\ast)}^4,C\in\{0,1\}, a\in{\{0,1\}}^\ast}. $$

Definition 2

The Ideal Model (two-party case) [5] Let \({f: \mathbb{N}\times{({\{0,1\}}^\ast)}^{4}\times{\{0,1\}}^\ast\to{({\{0,1\}}^\ast)}^{4}}\) be a polynomial-time-bounded probabilistic two-party function. Define the inputs and outputs of f as follows

$$ f(k,x^0_s,x^0_p,x^1_s,x^1_p,r) = (y^0_s,y^0_p,y^1_s,y^1_p) $$

In the ideal model, each party \(P^i\,(i\in\{0,1\})\) sends her input \((x^i_s, x^i_p)\) to an incorruptible trusted party. The trusted party draws r uniformly at random and returns \((y^i_s, y^i_p)\) to party \(P^i\). The entire procedure takes place in the presence of an active static adversary S. At the beginning of the procedure, S sees both parties' public inputs as well as the corrupted party \(P^C\)'s private input, and substitutes \((x^C_s, x^C_p)\) with \(({x^C_s}^\prime,{x^C_p}^\prime)\) of his choice. Then f is evaluated by the trusted party on the modified inputs. Similarly, we define

$$ \begin{aligned} {IDEAL}_{f,S}(k,\overrightarrow{x},C,a,\overrightarrow{r})&=({ADVR}_{f,S}(k,\overrightarrow{x},C,a,\overrightarrow{r}),\\ &{IDEAL}^0_{f,S}(k,\overrightarrow{x},C,a,\overrightarrow{r}), \\ &{IDEAL}^1_{f,S}(k,\overrightarrow{x},C,a,\overrightarrow{r})) \\ \end{aligned} $$

and a distribution ensemble by

$$ IDEAL_{f,S} = {\{IDEAL_{f,S}(k,\overrightarrow{x},C,a)\}}_{k\in{\mathbb{N}},\overrightarrow{x}\in{({\{0,1\}}^\ast)}^4,C\in\{0,1\}, a\in{\{0,1\}}^\ast}. $$
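
To make the flow of Definition 2 concrete, a schematic sketch of one ideal-model execution is given below. It is illustrative only: all function and variable names are our own, and the security parameter is omitted.

```python
import secrets

def ideal_execution(f, inputs, corrupted, substitute):
    """One ideal-model execution (two-party case), following Definition 2.

    inputs[i] = (x_s, x_p): party P^i's secret and public input.
    corrupted: index C of the party controlled by the ideal-model adversary S.
    substitute(public_inputs, corrupted_input) -> possibly modified input of P^C.
    f(x0_s, x0_p, x1_s, x1_p, r) -> (y0_s, y0_p, y1_s, y1_p).
    """
    # S sees both public inputs and the corrupted party's private input,
    # and may substitute the corrupted party's input with one of its choice.
    public_inputs = (inputs[0][1], inputs[1][1])
    inputs = dict(inputs)
    inputs[corrupted] = substitute(public_inputs, inputs[corrupted])

    # The incorruptible trusted party draws r at random and evaluates f
    # on the (possibly modified) inputs.
    r = secrets.randbits(128)
    y0_s, y0_p, y1_s, y1_p = f(inputs[0][0], inputs[0][1],
                               inputs[1][0], inputs[1][1], r)

    # Each party P^i receives (y^i_s, y^i_p).
    return {0: (y0_s, y0_p), 1: (y1_s, y1_p)}
```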

Definition 3

Security in the Real-Life Model (two-party case) [5] Let f be a two-party function and let π be a two-party protocol. π securely evaluates f in the malicious model if, for any polynomial-time-bounded active static adversary A, there exists a polynomial-time-bounded ideal-model adversary S such that

$$ IDEAL_{f,S}\stackrel{c}{\approx}EXEC_{\pi,A} $$

where \(\stackrel{c}{\approx}\) denotes the computational indistinguishability between two distribution ensembles.
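
Here, following the standard definition, two ensembles \(X = {\{X(k,z)\}}_{k,z}\) and \(Y = {\{Y(k,z)\}}_{k,z}\), indexed by the security parameter k and an index z, are computationally indistinguishable if for every probabilistic polynomial-time distinguisher D there exists a negligible function μ such that for all k and all indices z

$$ \left|\Pr[D(1^k,z,X(k,z))=1]-\Pr[D(1^k,z,Y(k,z))=1]\right|\le\mu(k). $$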

We need to combine several secure function evaluation protocols to construct our perceptron learning protocol. To prove the security of the composed protocol, we first prove that it is secure when these secure sub-protocols are invoked as oracle calls (the corresponding model is called the hybrid model [5]). Then, using the composition theorem in [2], we conclude that the composed protocol obtained by replacing the oracle calls with the corresponding secure protocols is still secure.

Definition 4

The Hybrid Model (two-party case) [5] In the \((g_1,\ldots, g_m)\)-hybrid model, the execution of a protocol π proceeds as in the real-life model, except that the parties have oracle access to a trusted party for evaluating m two-party functions \((g_1,\ldots, g_m).\) These functions are evaluated as in the ideal model. Similarly we define the protocol’s output with the distribution ensemble:

$$ EXEC^{g_1,\ldots, g_m}_{\pi,A} = {\{EXEC^{g_1,\ldots, g_m}_{\pi,A}(k,\overrightarrow{x},C,a)\}}_{k\in{\mathbb{N}},\overrightarrow{x}\in{({\{0,1\}}^\ast)}^4,C\in\{0,1\}, a\in{\{0,1\}}^\ast}. $$

Definition 5

Security in the Hybrid Model (two-party case) [5] Similar to the security definition in the real-life model, security in the hybrid model is defined by requiring that, for any polynomial-time-bounded adversary A in the hybrid model, there exists a polynomial-time-bounded ideal-model adversary S such that

$$ IDEAL_{f,S}\stackrel{c}{\approx}EXEC^{g_1,\ldots, g_m}_{\pi,A}. $$
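
For completeness, the composition theorem of [2] that we invoke can be stated informally as follows: if each protocol \(\rho_j\) securely evaluates \(g_j\) in the malicious model, and π securely evaluates f in the \((g_1,\ldots, g_m)\)-hybrid model, then the composed protocol \(\pi^{\rho_1,\ldots,\rho_m}\), obtained by replacing every oracle call to \(g_j\) with an execution of \(\rho_j\), securely evaluates f in the real-life model; that is, for every real-life adversary A there exists an ideal-model adversary S such that

$$ IDEAL_{f,S}\stackrel{c}{\approx}EXEC_{\pi^{\rho_1,\ldots,\rho_m},A}. $$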

Now we prove our algorithms' security in the hybrid model, replacing the fundamental secure algorithms PKP, PMC, TTD, SVE, SVB, SXR, SSB, SVZ, and SCA used in them with the corresponding oracle calls to the trusted party.

Theorem 1

Given that the fundamental algorithms above are secure in the malicious model, Algorithm 2 is secure in the hybrid model.

Proof

Without loss of generality, we assume \(P^0\) is corrupted by the adversary A in the hybrid model. For simplicity of presentation, we use A, as well as the simulators of the secure sub-algorithms, as subroutines in constructing the corresponding adversary in the ideal model.

For any adversary A operating in the hybrid model, given the description of A, the private inputs \({\bf X}^{\bf \prime}_{\bf 1},\ldots, {\bf X}^{\bf \prime}_{\bf n0}\) and the final output \(\overline{{\bf W}}\), we construct an adversary \(S_A\) in the ideal model as follows:

In step 0.1, \(S_A\) runs A to get \({\bf W}^{\bf 0}, \overline{{\bf W}^{\bf 0}}\) and \(PKP(\overline{{\bf W}^{\bf 0}})\). \(S_A\) runs \(S_{PKP}\), the simulator of PKP, giving the state of A and \(\overline{{\bf W}^{\bf 0}}\) as the inputs to \(S_{PKP}\). If \(S_{PKP}\) outputs that the proof fails, \(S_A\) terminates the protocol; otherwise, it sets the state of A to the state returned by \(S_{PKP}\). \(S_A\) simulates the honest party \(P^1\) by generating a random \({\bf W}^{\bf 1}\) and constructing the correct zero-knowledge proof. \(S_A\) feeds A with the proof. Then, \(S_A\) runs A to compute \(\overline{{\bf W}} = \overline{{\bf W}^{\bf 0}}+_h\overline{{\bf W}^{{\bf 1}}}\).

In step 0.2, \(S_A\) runs A to get \(\overline{{\bf X}^{\bf \prime}_{\bf k}}\) and \(PKP(\overline{{\bf X}^{\bf \prime}_{\bf k}})\) for every \({\bf X}^{\bf \prime}_{\bf k}\) that \(P^0\) owns, and feeds \(S_{PKP}\) with the state of A and the proof \(PKP(\overline{{\bf X}^{\bf \prime}_{\bf k}})\). If the simulator outputs that the proof fails, \(S_A\) terminates the protocol; otherwise, it sets the state of A to the state returned by the simulator. \(S_A\) simulates the honest party \(P^1\) by generating random \({\bf Y}^{\bf \prime}_{\bf k}\) for all \({\bf Y}^{\bf \prime}_{\bf k}\) that \(P^1\) owns and constructing correct zero-knowledge proofs. \(S_A\) feeds A with these proofs.

In step 0.3, \(S_A\) runs A to get \(\overline{{\varvec{\Updelta}} {\bf W}}\).

In step 1.1, \(S_A\) runs A to update \(\overline{{\bf W}}\).

In step 2.1, for any sample \({\bf X}^{\bf \prime}_{\bf k}\) that belongs to \(P^0\), \(S_A\) runs A to get \(\overline{{\bf W}\cdot{\bf X}^{\bf \prime}_{\bf k}}\) and the proofs \(PMC(\overline{x_{kj}^{\prime}}, \overline{w_j}, \overline{{x^{\prime}_{kj}} \cdot w_j})\) for every \(j\in[0,p]\). Then, \(S_A\) feeds A's state and the proofs to \(S_{PMC}\), the simulator of PMC, sequentially. If any proof fails according to the outputs of the simulator, \(S_A\) terminates the protocol; otherwise, it sets the state of A to the state returned by the last run of \(S_{PMC}\). For any sample \({\bf X}^{\bf \prime}_{\bf k}\) that belongs to \(P^1\), \(S_A\) simulates the honest party \(P^1\) by computing \(\overline{{\bf W}\cdot{\bf X}^{\bf \prime}_{\bf k}}\), correctly generating \(PMC(\overline{x_{kj}^{\prime}}, \overline{w_j}, \overline{{x^{\prime}_{kj}} \cdot w_j})\) for every \(j\in[0,p]\), and feeds A with these proofs.

In step 2.2, \(S_A\) feeds \(S_{SCA}\), the simulator of SCA, with the state of A and \(\overline{{\bf W}\cdot{\bf X}^{\bf \prime}_{\bf k}},\,\overline{0}\) as inputs. If the simulator outputs that the computation is incorrect, \(S_A\) terminates the protocol; otherwise, it sets the state of A to the state returned by the simulator.

In step 2.3, if sample \({\bf X}_{\bf k}\) is owned by \(P^0\), \(S_A\) runs A to get \(\overline{{\bf X}^{\bf \prime}_{\bf k}\cdot(1-r_k)}\) and the proofs \(PMC(\overline{{x^{\prime}_{kj}}}, \overline{1-r_k}, \overline{{x^{\prime}_{kj}} \cdot (1-r_k)})\) for every \(j\in[0,p]\). \(S_A\) feeds the state of A and \(PMC(\overline{{x^{\prime}_{kj}}}, \overline{1-r_k}, \overline{{x^{\prime}_{kj}} \cdot (1-r_k)})\) as the input of \(S_{PMC}\) sequentially for \(j\in[0,p]\); if any proof fails, \(S_A\) terminates the protocol, otherwise, it sets the state of A to the state returned by the last run of the simulator. Then, \(S_A\) runs A to update \(\overline{{\varvec{\Updelta}} {\bf W}}\). If sample \({\bf X}_{\bf k}\) is owned by \(P^1\), \(S_A\) simulates the honest party \(P^1\) by constructing the required zero-knowledge proofs and feeds A with these proofs. Then, \(S_A\) runs A to update \(\overline{{\varvec{\Updelta}} {\bf W}}\).

Note that, in each step, \(S_A\) either runs A directly or uses simulators to simulate the real world by passing the ciphertexts or zero-knowledge proofs as inputs. It is straightforward to see that in the first case A's states are identical in the two worlds. In the second case, the ciphertexts and the zero-knowledge proofs are generated using a semantically secure cipher, so the views of A in the two worlds are computationally indistinguishable. Therefore, at every step, the states of A in the two worlds are computationally indistinguishable, and hence the outputs in the two worlds are computationally indistinguishable.    □
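
To summarize the structure of this argument, the following is a schematic sketch of how \(S_A\) is assembled. It is not part of the protocols themselves; all names (run_A, sub_simulators, and so on) are illustrative placeholders.

```python
class AbortProtocol(Exception):
    """Raised when a zero-knowledge proof or a simulated secure computation check fails."""

def build_ideal_adversary(run_A, sub_simulators, simulate_honest, schedule):
    """Schematic construction of the ideal-model adversary S_A (illustrative only).

    run_A(state, incoming)            -> (new_state, outgoing): one activation of the
                                         hybrid-model adversary A on incoming messages.
    sub_simulators[name](state, msgs) -> (ok, new_state): simulator of a secure
                                         sub-algorithm (e.g. PKP, PMC, SCA); ok is
                                         False when a proof or computation check fails.
    simulate_honest(step)             -> msgs: random values plus correct zero-knowledge
                                         proofs produced on behalf of the honest party.
    schedule                          -> list of (kind, name) pairs describing the steps.
    """
    def S_A(initial_state):
        state = initial_state
        for kind, name in schedule:
            if kind == "run_A":
                # S_A simply runs A on this step.
                state, _ = run_A(state, name)
            elif kind == "verify":
                # Run A, then hand its messages (ciphertexts and proofs) to the
                # corresponding sub-simulator; abort if any proof fails.
                state, msgs = run_A(state, name)
                ok, state = sub_simulators[name](state, msgs)
                if not ok:
                    raise AbortProtocol(name)
            elif kind == "simulate_honest":
                # Simulate the honest party with random values and correct proofs,
                # and feed the result to A.
                state, _ = run_A(state, simulate_honest(name))
        return state
    return S_A
```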

Theorem 2

Given that the fundamental sub-algorithms are secure in the malicious model, Algorithm 3 is secure in the hybrid model.

Proof

Without loss of generality, we assume \(P^0\) is corrupted by the adversary A in the hybrid model. For simplicity of presentation, we use A, as well as the simulators of the secure sub-algorithms, as subroutines in constructing the corresponding adversary in the ideal model.

For any adversary A operating in the hybrid model, given the description of A, the private inputs \({\bf X}^{\bf \prime}_{\bf 1},\ldots, {\bf X}^{\bf \prime}_{\bf n}\) and the final output \(\overline{{\bf W}}\), we construct an adversary \(S_A\) in the ideal model as follows:

In step 0.1, \(S_A\) runs A to get \({\bf W}^{\bf 0}, \overline{{\bf W}^{\bf 0}}\) and \(PKP(\overline{{\bf W}^{\bf 0}})\). \(S_A\) runs \(S_{PKP}\), the simulator of PKP, giving the state of A and \(\overline{{\bf W}^{\bf 0}}\) as the inputs to \(S_{PKP}\). If \(S_{PKP}\) outputs that the proof fails, \(S_A\) terminates the protocol; otherwise, it sets the state of A to the state returned by \(S_{PKP}\). \(S_A\) simulates the honest party \(P^1\) by generating a random \({\bf W}^{\bf 1}\) and constructing the correct zero-knowledge proof. \(S_A\) feeds A with the proof. Then, \(S_A\) runs A to compute \(\overline{{\bf W}} = \overline{{\bf W}^{\bf 0}}+_h\overline{{\bf W}^{{\bf 1}}}\).

In step 0.2, \(S_A\) runs A to get \(\overline{{\bf X}^{\bf \prime}_{\bf k}}\) and \(PKP(\overline{{\bf X}^{\bf \prime}_{\bf k}})\) for every \({\bf X}^{\bf \prime}_{\bf k}\) that \(P^0\) owns, and feeds \(S_{PKP}\) with the state of A and the proof \(PKP(\overline{{\bf X}^{\bf \prime}_{\bf k}})\). If the simulator outputs that the proof fails, \(S_A\) terminates the protocol; otherwise, it sets the state of A to the state returned by the simulator. \(S_A\) simulates the honest party \(P^1\) by generating random \({\bf Y}^{\bf \prime}_{\bf k}\) for all \({\bf Y}^{\bf \prime}_{\bf k}\) that \(P^1\) owns and constructing correct zero-knowledge proofs. \(S_A\) feeds A with these proofs.

In step 0.3, \(S_A\) runs A to get \(\overline{{\varvec{\Updelta}} {\bf W}}\).

In step 1.1, \(S_A\) runs A to update \(\overline{{\bf W}}\).

In step 2.1, \(S_A\) runs A to get \(\overline{{\bf W}^{\bf x}\cdot{\bf X}^{\bf \prime}_{\bf k}}\) and the proofs \(PMC(\overline{x_{kj}^{\prime}}, \overline{w^x_j}, \overline{{x^{\prime}_{kj}} \cdot w_j^x})\) for every \(j\in[0,p]\). Then \(S_A\) feeds A's state and the proofs to \(S_{PMC}\), the simulator of PMC, sequentially. If any proof fails according to the outputs of the simulator, \(S_A\) terminates the protocol; otherwise, it sets the state of A to the state returned by the last run of \(S_{PMC}\). For any sample \({\bf Y}^{\bf \prime}_{\bf k}\) that belongs to \(P^1\), \(S_A\) simulates the honest party \(P^1\) by computing \(\overline{{\bf W}^{\bf y}\cdot{{\bf Y}^{\bf \prime}_{\bf k}}}\), correctly generating \(PMC(\overline{y_{kj}^{\prime}}, \overline{w^y_j}, \overline{{y^{\prime}_{kj}} \cdot w_j^y})\) for every \(j\in[0,q]\), and feeds A with these proofs.

In step 2.2, \(S_A\) runs A to get \(\overline{{\bf Z}_{\bf k}^{\bf \prime}\cdot{{\bf W}}}\).

In step 2.3, given the state of A, \(S_A\) feeds \(S_{SCA}\), the simulator of SCA, with \(\overline{{\bf Z}^{\bf \prime}_{\bf k}\cdot{\bf W}},\,\overline{0}\) as inputs. If the simulator outputs that the computation is incorrect, \(S_A\) terminates the protocol; otherwise, it sets the state of A to the state returned by the simulator.

In step 2.4, \(S_A\) runs A to get \(\overline{{\bf X}^{\bf \prime}_{\bf k}\cdot(1-r_k)}\) and the proofs \(PMC(\overline{{x^{\prime}_{kj}}}, \overline{1-r_k}, \overline{{x^{\prime}_{kj}} \cdot (1-r_k)})\) for every \(j\in[0,p]\). \(S_A\) feeds \(PMC(\overline{{x^{\prime}_{kj}}}, \overline{1-r_k}, \overline{{x^{\prime}_{kj}} \cdot (1-r_k)})\) as the input of \(S_{PMC}\) sequentially for \(j\in[0,p]\); if any proof fails, \(S_A\) terminates the protocol, otherwise, it sets the state of A to the state returned by the last run of the simulator. Then, \(S_A\) simulates the honest party \(P^1\) by computing \(\overline{{\bf Y}^{\bf \prime}_{\bf k}\cdot(1-r_k)}\) and generating correct proofs \(PMC(\overline{{y^{\prime}_{kj}}}, \overline{1-r_k}, \overline{{y^{\prime}_{kj}} \cdot (1-r_k)})\) for every \(j\in[0,q]\). \(S_A\) feeds A with these proofs.

In step 2.5, \(S_A\) runs A to update \(\overline{{\varvec{\Updelta}} {\bf W}}\).

Similarly, in each step, \(S_A\) either runs A directly or uses simulators to simulate the real world by passing the ciphertexts or zero-knowledge proofs as inputs. It is straightforward to see that in the first case A's states are identical in the two worlds. In the second case, the ciphertexts and the zero-knowledge proofs are generated using a semantically secure cipher, so the views of A in the two worlds are computationally indistinguishable. Therefore, at every step, the states of A in the two worlds are computationally indistinguishable, and hence the outputs in the two worlds are computationally indistinguishable.    □

Cite this article

Zhang, Y., Zhong, S. Privacy preserving perceptron learning in malicious model. Neural Comput & Applic 23, 843–856 (2013). https://doi.org/10.1007/s00521-012-1006-2
