Encrypted data indexing for the secure outsourcing of spectral clustering

Liu, Bozhong; Chen, Ling; Zhu, Xingquan; Qiu, Weidong

doi:10.1007/s10115-018-1262-2

Regular Paper
Published: 07 September 2018

Knowledge and Information Systems, Volume 60, pages 1307–1328 (2019)

Bozhong Liu (ORCID: orcid.org/0000-0003-2839-1499)¹, Ling Chen², Xingquan Zhu³ & Weidong Qiu⁴

Abstract

Spectral clustering is one of the most popular clustering methods and is particularly useful for pattern recognition and image analysis. To use spectral clustering for analysis, users must either implement their own platforms, which requires strong data analytics and machine learning skills, or allow a third party to access and analyze their data, which may compromise their data privacy or security. Traditionally, this problem is addressed by privacy-preserving data mining based on randomized perturbation or secure multi-party computation. However, the existing methods suffer from inaccurate results or from high computational requirements on the data owner's side. To address these problems, in this paper, we propose a new secure outsourcing data mining (SODM) paradigm, which allows data owners to encrypt their data to ensure maximum data security. After encryption, data owners can outsource their encrypted data to data analytics service providers (i.e., data analytics agents) for knowledge discovery, with a guarantee that neither the data analytics agent nor any other party can compromise data privacy. To allow data mining to be carried out efficiently on encrypted data, we design a secure KD-tree to index all the encrypted data. Based on the SODM framework, a secure spectral clustering algorithm is proposed. Experiments on real-world datasets demonstrate the effectiveness and efficiency of the system for the secure outsourcing of data mining.
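For readers unfamiliar with the index structure named in the abstract, the following is a minimal sketch of an ordinary, plaintext KD-tree with a nearest-neighbour query. It is not the secure KD-tree proposed in the paper, whose construction over encrypted data is not described on this page; it only illustrates how such a tree partitions points so that a neighbour search can skip whole subtrees. All names (`Node`, `build_kdtree`, `sq_dist`, `nearest`) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point                    # splitting point stored at this node
    axis: int                       # coordinate index used for the split
    left: Optional["Node"] = None   # points with smaller-or-equal value on axis
    right: Optional["Node"] = None  # points with larger-or-equal value on axis


def build_kdtree(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Build a plaintext KD-tree by splitting at the median, cycling axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(
        point=ordered[mid],
        axis=axis,
        left=build_kdtree(ordered[:mid], depth + 1),
        right=build_kdtree(ordered[mid + 1:], depth + 1),
    )


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[Node], query: Point,
            best: Optional[Point] = None) -> Optional[Point]:
    """Return the stored point closest to `query` (None for an empty tree)."""
    if node is None:
        return best
    if best is None or sq_dist(query, node.point) < sq_dist(query, best):
        best = node.point
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the
    # best match found so far; otherwise that subtree cannot improve it.
    if diff ** 2 < sq_dist(query, best):
        best = nearest(far, query, best)
    return best


if __name__ == "__main__":
    pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
    tree = build_kdtree(pts)
    print(nearest(tree, (9, 2)))
```

In a spectral clustering pipeline, an index of this kind is typically used to find each point's nearest neighbours when building a sparse similarity graph, which avoids comparing every pair of points. How the paper performs the comparisons on ciphertexts is outside the scope of this sketch.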

Acknowledgements

This work was supported, in part, by the Australian Research Council (ARC) Discovery Project under Grant No. DP180100966, the National Key Research and Development Program of China under Grant 2017YFB0802704, and the Program of Shanghai Technology Research Leader under Grant 16XD1424400.

Author information

Authors and Affiliations

Sangfor Technologies Inc., Shenzhen, Guangdong, China
Bozhong Liu
Centre for Artificial Intelligence, University of Technology, Sydney, Sydney, Australia
Ling Chen
Department of Computer & Electrical Engineering and Computer Science, Engineering East (EE)-509, Florida Atlantic University, 777 Glades Road, Boca Raton, FL, 33431, USA
Xingquan Zhu
School of Cyber Security, Shanghai Jiao Tong University, Shanghai, China
Weidong Qiu

Corresponding author

Correspondence to Bozhong Liu.

About this article

Cite this article

Liu, B., Chen, L., Zhu, X. et al. Encrypted data indexing for the secure outsourcing of spectral clustering. Knowl Inf Syst 60, 1307–1328 (2019). https://doi.org/10.1007/s10115-018-1262-2

Received: 07 January 2016
Revised: 09 February 2018
Accepted: 26 May 2018
Published: 07 September 2018
Issue Date: 01 September 2019
DOI: https://doi.org/10.1007/s10115-018-1262-2
