Efficient Privacy Preserving Cross-Datasets Collaborative Outlier Detection

Wei, Zhaohui; Pei, Qingqi; Liu, Xuefeng; Ma, Lichuan

doi:10.1007/978-3-030-37352-8_31

Efficient Privacy Preserving Cross-Datasets Collaborative Outlier Detection

Zhaohui Wei¹¹,
Qingqi Pei¹¹,
Xuefeng Liu¹² &
…
Lichuan Ma¹²

Conference paper
First Online: 03 January 2020

1126 Accesses
1 Citations

Part of the book series: Lecture Notes in Computer Science ((LNSC,volume 11983))

Abstract

Outlier detection is one of the most important data analytics tasks and is used in numerous applications and domains. It is the identification of rare items, events or observations which raise suspicions by differing significantly from the majority of the data. The accuracy of the outlier detection depends on sufficient data. However, the underlying data is distributed across different organizations. If outlier detection is done locally, the results obtained are not as accurate as when outlier detection is done collaboratively over the combined data. Unfortunately, competitive advantage, privacy concerns and regulations, and issues surrounding data sovereignty and jurisdiction prevent many organizations from openly sharing their data. In this paper, we address precisely this issue. We present new and efficient protocols for privacy preserving outlier detection to find outliers from arbitrarily partitioned categorical data. Our protocols fall in the two-server model where data owners distribute their private data among two non-colluding servers who detects on the joint data using secure two-party computation (2PC). Our method is based on Local Distance-based Outlier Factor (LDOF) using the relative location of an object to its neighbours to determine the degree to which the object deviates from its neighbourhood. We provide the privacy guarantee by using secure multiparty computation techniques. We implement our system in C++ on real data. Our experiments validate that our protocols are both effective and efficient.

Supported by the National Key Research and Development Program of China under Grant 2016YFB0800601, and the Key Program of NSFC-Tongyong Union Foundation under Grant U1636209.

This is a preview of subscription content, log in via an institution.

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

References

Vaidya, J., Clifton, C., Kantarcioglu, M.: Tools for privacy preserving distributed data mining. ACM SIGKDD Explor. Newsl. 4, 28–34 (2002)
Article Google Scholar
Clifton, C., Kantarcioglu, M.: Privacy-preserving distributed mining of association rules on horizontally partitioned data. IEEE Trans. Knowl. Data Eng. 16, 1026–1037 (2004)
Article Google Scholar
Clifton, C., Vaidya, J.: Privacy preserving association rule mining in vertically partitioned data. In: 8th ACMSIGKDD International Conference on Knowledge Discovery and Data Mining (KDD02), pp. 639–644 (2002)
Google Scholar
Wright, R., Jagannathan, G.: Privacy-preserving distributed k-means clustering over arbitrarily partitioned data. In: 11th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining (KDD05), pp 593–599 (2005)
Google Scholar
Ghosh, J., Merugu, S.: Privacy-preserving distributed clustering using generative models. In: 3rd IEEE International Conference on Data Mining (ICDM03), pp. 211–218 (2003)
Google Scholar
Clifton, C., Vaidya, J.: Privacy-preserving k-means clustering over vertically partitioned data. In: 9th ACMSIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2003), pp 206–215 (2003)
Google Scholar
Srikant, R., Agrawal, R.: Privacy-preserving data mining. In: ACMSIGMOD International Conference on Management of Data (SIGMOD 2000), pp. 439–450 (2000)
Google Scholar
Pinkas, B., Lindell, Y.: Privacy preserving data mining. In: 20th Annual International Cryptology Conference (CRYPTO 2000), pp. 36–54 (2000)
Google Scholar
Clifton, C., Vaidya, J.: Privacy preserving naive Bayes classifier for vertically partitioned data. In: SIAM International Conference on Data Mining (SDM 2004), pp. 522–526 (2004)
Google Scholar
Wang, S., Zhao, W., Zhang, N.: A new scheme on privacy-preserving data classification. In: 11th ACM-SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2005), pp. 374–383 (2005)
Google Scholar
Aggarwal, C.C.: Outlier Analysis. Springer-Verlag, New York (2013). https://doi.org/10.1007/978-1-4614-6396-2
Book MATH Google Scholar
Rastogi, R., Shim, K., Ramaswame, S.: Efficient algorithms formining outliers from large data sets. In: ACMSIGMOD International Conference on Management of Data (SIGMOD 2000), pp. 427–438 (2000)
Google Scholar
Atallah, M., Du, W.: Privacy-preserving cooperative statistical analysis. In: 17th Annual Computer Security Applications Conference (ACSAC 2001), pp. 102–110 (2001)
Google Scholar
Syverson, P., Goldschlag, D., Reed, M.: Onion routing. Commun. ACM 42, 39–41 (1999)
Google Scholar
Vaidya, J., Clifton, C.: Privacy-preserving outlier detection. In: 4th IEEE International Conference on Data Mining, pp. 233–240 (2004)
Google Scholar
Asif, H., Talukdar, T., Vaidya, J.: Differentially private outlier detection in a collaborative environment. Int. J. Coop. Inf. Syst. 27(03), 1850005 (2018)
Article Google Scholar
Nikolaenko, V., Ioannidis, S., Weinsberg, U., Joye, M., Taft, N., Boneh, D.: Privacy-preserving matrix factorization. In: ACM SIGSAC Conference on Computer Communications Security, pp. 801–812 (2013)
Google Scholar
Nikolaenko, V., Weinsberg, U., Ioannidis, S., Joye, M., Boneh, D., Taft, N.: Privacy-preserving ridge regression on hundreds of millions of records. In: IEEE Symposium on Security and Privacy, pp. 334–348 (2013)
Google Scholar
Zhang, K., Hutter, M., Jin, H.: A new local distance-based outlier detection approach for scattered real-world data. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, T.-B. (eds.) PAKDD 2009. LNCS (LNAI), vol. 5476, pp. 813–822. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-01307-2_84
Chapter Google Scholar
Lindell, Y., Pinkas, B.: A proof of security of Yaos protocol for two-party computation. J. Cryptology 22, 161–188 (2009)
Article MathSciNet Google Scholar
Ishai, Y., Kilian, J., Nissim, K., Petrank, E.: Extending oblivious transfers efficiently. In: Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 145–161. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-45146-4_9
Chapter Google Scholar
Schneider, T., Asharov, G., Lindell, Y., Zohner, M.: More efficient oblivious transfer and extensions for faster secure computation. In: ACM CCS 2013, pp. 535–548 (2013)
Google Scholar
Nielsen, J.B., Nordholt, P.S., Orlandi, C., Burra, S.S.: A new approach to practical active-secure two-party computation. In: Safavi-Naini, R., Canetti, R. (eds.) CRYPTO 2012. LNCS, vol. 7417, pp. 681–700. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32009-5_40
Chapter Google Scholar
Demmler, D., Schneider, T., Zohner, M.: ABY-a framework for efficient mixed-protocol secure two-party computation. In: NDSS (2015)
Google Scholar
Nisan, N., Pinkas, B., Sella, Y., Malkhi, D., et al. : Fairplay a secure two-party computation system. In: 13th Conference on USENIX Security Symposium, vol. 13, pp. 20–20 (2004)
Google Scholar
Beaver, D.: Efficient multiparty protocols using circuit randomization. In: Feigenbaum, J. (ed.) CRYPTO 1991. LNCS, vol. 576, pp. 420–432. Springer, Heidelberg (1992). https://doi.org/10.1007/3-540-46766-1_34
Chapter Google Scholar
Kamara, S., Mohassel, P., Raykova, M.: Outsourcing multiparty computation. IACR Cryptology ePrint Archive 272 (2011)
Google Scholar
Canetti, R.: Universally composable security: a new paradigm for cryptographic protocols. In: 42nd IEEE Symposium on Foundations of Computer Science (FOCS 2001), pp. 136–145 (2001)
Google Scholar
Zhang, Y., Mohassel, P.: SecureML: a system for scalable privacy-preserving machine learning. In: IEEE Symposium on Security and Privacy (SP), pp. 19–38 (2017)
Google Scholar
GMP library. https://gmplib.org
NTL library. http://www.shoup.net/ntl

Download references

Acknowledgments

This work is supported by the National Key Research and Development Program of China under Grant 2016YFB0800601, and the Key Program of NSFC-Tongyong Union Foundation under Grant U1636209.

Author information

Authors and Affiliations

School of Telecommunications Engineering, Xidian University, Xian, 710071, Shaanxi, China
Zhaohui Wei & Qingqi Pei
School of Cyber Engineering, Xidian University, Xian, Shaanxi, China
Xuefeng Liu & Lichuan Ma

Authors

Zhaohui Wei
View author publications
You can also search for this author in PubMed Google Scholar
Qingqi Pei
View author publications
You can also search for this author in PubMed Google Scholar
Xuefeng Liu
View author publications
You can also search for this author in PubMed Google Scholar
Lichuan Ma
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Qingqi Pei .

Editor information

Editors and Affiliations

Rutgers University, Newark, NJ, USA
Jaideep Vaidya
Beihang University, Beijing, China
Xiao Zhang
Guangzhou University, Guangzhou, China
Jin Li

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Wei, Z., Pei, Q., Liu, X., Ma, L. (2019). Efficient Privacy Preserving Cross-Datasets Collaborative Outlier Detection. In: Vaidya, J., Zhang, X., Li, J. (eds) Cyberspace Safety and Security. CSS 2019. Lecture Notes in Computer Science(), vol 11983. Springer, Cham. https://doi.org/10.1007/978-3-030-37352-8_31

Download citation

DOI: https://doi.org/10.1007/978-3-030-37352-8_31
Published: 03 January 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37351-1
Online ISBN: 978-3-030-37352-8
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics