Privacy-preserving LOF outlier detection

  • Regular Paper
  • Knowledge and Information Systems

Abstract

LOF is a well-known approach to density-based outlier detection and has received much attention recently. A privacy-preserving LOF outlier detection algorithm is important because the data on which LOF runs is typically split among multiple participants, none of whom is willing to disclose sensitive information due to legal or moral considerations. This is, however, a hard problem, since participants need to find the maximum of the distances between an object and its k-nearest neighbors (k-NN), that is, the object's k-distance, without learning any information about these objects. In this paper, we propose an efficient protocol for privacy-preserving LOF outlier detection. We first employ a shuffle protocol to permute the distance vectors owned by different participants. Then, we design a secure selection method to obtain the garbled k-NN indexes and shares of the k-distance for given objects. For each object, we use the k-distances of all objects to construct a vector, on which the shuffle protocol is executed again to obtain new shares of the k-distances. Finally, the shares corresponding to the garbled k-NN indexes are selected as the expected result. Our protocol ensures that all intermediate values are shared among the participants and thus avoids information leakage. In addition, our protocol is efficient: we prove that its computation and communication complexity are bounded by \(O(n^2)\).
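For orientation, the quantities the abstract refers to (k-NN indexes, k-distance, and the resulting LOF score) can be illustrated with a minimal plaintext sketch of standard LOF as defined by Breunig et al. (2000). This is not the privacy-preserving protocol proposed in the paper: all data sits in one place and nothing is shared or shuffled. The function name `lof_scores` is hypothetical, and the sketch assumes distinct points and ignores ties among neighbors.

```python
import math


def lof_scores(points, k):
    """Plaintext LOF scores (no privacy protection); illustrative only."""
    n = len(points)
    # All pairwise Euclidean distances: this step alone costs O(n^2).
    dist = [[math.dist(p, q) for q in points] for p in points]

    knn = []     # indexes of the k nearest neighbors of each object
    k_dist = []  # k-distance: the largest distance from an object to its k-NN
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        knn.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach_dist(i, j) = max(k_dist[j], dist[i][j]).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in knn[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean ratio of the neighbors' densities to the object's own density.
    return [sum(lrd[j] for j in knn[i]) / (k * lrd[i]) for i in range(n)]


# Four corners of a unit square plus one distant point.
scores = lof_scores([(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)], k=2)
```

In this toy input the four corner points receive a score of 1 (their density matches that of their neighbors), while the distant point scores well above 1. Note that computing each object's score needs the k-distances of its neighbors, which is why the protocol must select k-distance shares by the garbled k-NN indexes.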

Notes

  1. http://viff.dk/.

Acknowledgments

We thank the anonymous reviewers for their very useful comments and suggestions. This work was supported by the National Natural Science Foundation of China (Nos. 60903217, 61202407 and 61003044), the Fundamental Research Funds for the Central Universities (Nos. WK0110000027 and WK0110000033), the Guangdong Province Strategic Cooperation Project with the Chinese Academy of Sciences (No. 2012B090400013) and the Natural Science Foundation of Jiangsu Province of China (No. BK2011357).

Author information

Corresponding author

Correspondence to Lu Li.

About this article

Cite this article

Li, L., Huang, L., Yang, W. et al. Privacy-preserving LOF outlier detection. Knowl Inf Syst 42, 579–597 (2015). https://doi.org/10.1007/s10115-013-0692-0

  • DOI: https://doi.org/10.1007/s10115-013-0692-0
