DOI: 10.1145/3573942.3573957
research-article

A Differential Privacy K-Means Algorithm for Improving Privacy Budget Allocation

Published: 16 May 2023

Abstract

As a privacy protection method with a strict mathematical definition, differential privacy has been widely used in many areas of data mining, including clustering. However, the traditional differentially private k-means algorithm is sensitive to the choice of initial values, and its privacy budget allocation is relatively simplistic, which reduces the utility of the algorithm. To further improve the utility of the differentially private k-means algorithm, this paper proposes a privacy budget allocation method that combines error analysis, used to optimize the number of iterations, with cluster merging, and supports it with both theoretical analysis and experimental verification. The results show that the algorithm not only satisfies the definition of differential privacy but also effectively improves the utility of the clustering.
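
For orientation, the following is a minimal sketch of the kind of baseline the abstract contrasts against, not the method proposed in the paper. It assumes a Lloyd-style k-means in which each iteration perturbs the per-cluster sums and counts with Laplace noise, and in which the total budget is split evenly across iterations. The function names (`laplace_noise`, `uniform_budget`, `dp_kmeans`), the even split, the random initialization, and the assumption that the data are scaled to [0, 1]^d (giving a per-iteration L1 sensitivity of d + 1 under add/remove neighbors) are all illustrative choices, not details taken from the paper.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def uniform_budget(total_epsilon, iterations):
    # Assumed baseline: every iteration receives the same share.
    # By sequential composition the shares sum to total_epsilon.
    return [total_epsilon / iterations] * iterations


def dp_kmeans(points, k, total_epsilon, iterations, seed=0):
    """Hypothetical Laplace-based k-means; points assumed in [0, 1]^d."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial centers: the result depends on this choice.
    centers = [[rng.random() for _ in range(dim)] for _ in range(k)]
    for eps_t in uniform_budget(total_epsilon, iterations):
        # Adding or removing one point changes one count by 1 and one
        # sum by at most 1 per coordinate, so L1 sensitivity is dim + 1.
        scale = (dim + 1) / eps_t
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for p in points:
            # Assign the point to its nearest current center.
            j = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            counts[j] += 1
            for i in range(dim):
                sums[j][i] += p[i]
        for j in range(k):
            noisy_count = counts[j] + laplace_noise(scale, rng)
            if noisy_count < 1.0:
                continue  # noisy count unusable: keep the previous center
            # Noisy mean per coordinate, clipped back into the data domain.
            centers[j] = [
                min(1.0, max(0.0, (sums[j][i] + laplace_noise(scale, rng)) / noisy_count))
                for i in range(dim)
            ]
    return centers
```

In this sketch, more iterations mean a smaller per-iteration budget and therefore larger noise in every update, which is one way to see why the number of iterations and the allocation scheme affect clustering utility.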


    Published In

    AIPR '22: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition
    September 2022
    1221 pages
    ISBN:9781450396899
    DOI:10.1145/3573942
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Data mining
    2. Differential privacy
    3. K-means algorithm
    4. Privacy protection

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    AIPR 2022
