Towards Privacy-Preserving Data Mining in Online Social Networks: Distance-Grained and Item-Grained Differential Privacy

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 9722)

Abstract

Online social networks have become increasingly popular, and users are increasingly lured into revealing their private information. This enables convenient personalized services but also raises privacy concerns. To balance utility and privacy, many privacy-preserving mechanisms such as differential privacy have been proposed. However, most existing solutions set a single privacy protection level across the network, which does not adequately meet users’ personalized requirements. In this paper, we propose a fine-grained differential privacy mechanism for data mining in online social networks. Compared with traditional methods, our scheme answers queries at different privacy protection levels depending on where the query comes from (i.e., it is distance-grained), and also supports different protection levels for different data items (i.e., it is item-grained). In addition, we consider the collusion attack on differential privacy and give a countermeasure for privacy-preserving data mining. We evaluate our scheme analytically, and conduct experiments on synthetic and real-world data to demonstrate its utility and privacy protection.

Notes

  1. Without causing confusion, we use node and user interchangeably in this paper.

  2. https://snap.stanford.edu/data/index.html.

References

  1. Dwyer, C., Hiltz, S., Passerini, K.: Trust and privacy concern within social networking sites: a comparison of Facebook and MySpace. In: 13th Americas Conference on Information Systems (AMCIS), pp. 339:1–339:13 (2007)

  2. Zhang, C., Sun, J., Zhu, X., Fang, Y.: Privacy and security for online social networks: challenges and opportunities. Network 24(4), 13–18 (2010). IEEE

  3. Fogues, R., Such, J.M., Espinosa, A., Garcia-Fornes, A.: Open challenges in relationship-based privacy mechanisms for social network services. Int. J. Hum.-Comput. Interact. 31(5), 350–370 (2015)

  4. Wu, X., Zhu, X., Wu, G.Q., Ding, W.: Data mining with big data. IEEE Trans. Knowl. Data Eng. 26(1), 97–107 (2014). IEEE

  5. Fung, B., Wang, K., Chen, R., Yu, P.S.: Privacy-preserving data publishing: a survey of recent developments. ACM Comput. Surv. (CSUR) 42(4), 14 (2010). ACM

  6. Sweeney, L.: \(k\)-anonymity: a model for protecting privacy. Int. J. Uncertainty Fuzziness Knowl. Based Syst. 10(05), 557–570 (2002)

  7. Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: \(l\)-diversity: privacy beyond \(k\)-anonymity. ACM Trans. Knowl. Discov. Data (TKDD) 1(1), 3:1–3:52 (2007)

  8. Li, N., Li, T., Venkatasubramanian, S.: \(t\)-closeness: privacy beyond \(k\)-anonymity and \(l\)-diversity. In: 23rd International Conference on Data Engineering (ICDE 2007), pp. 106–115. IEEE (2007)

  9. Wong, R.C.W., Li, J., Fu, A.W.C., Wang, K.: (\(\alpha \),\(k\))-anonymity: an enhanced \(k\)-anonymity model for privacy preserving data publishing. In: 12th SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2006), pp. 754–759. ACM (2006)

  10. Dwork, C.: A firm foundation for private data analysis. Commun. ACM 54(1), 86–95 (2011). ACM

  11. Dwork, C.: Differential privacy. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP 2006. LNCS, vol. 4052, pp. 1–12. Springer, Heidelberg (2006)

  12. Baden, R., Bender, A., Spring, N., Bhattacharjee, B., Starin, D.: Persona: an online social network with user-defined privacy. ACM SIGCOMM Comput. Commun. Rev. 39(4), 135–146 (2009). ACM

  13. Li, Y., Chen, M., Li, Q., Zhang, W.: Enabling multilevel trust in privacy preserving data mining. IEEE Trans. Knowl. Data Eng. 24(9), 1598–1612 (2012). IEEE

  14. Ebadi, H., Sands, D., Schneider, G.: Differential privacy: now it’s getting personal. In: 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL 2015), pp. 69–81 (2015)

  15. Jorgensen, Z., Yu, T., Cormode, G.: Conservative or liberal? Personalized differential privacy. In: 31st International Conference on Data Engineering (ICDE 2015), pp. 13–17. IEEE (2015)

  16. Koufogiannis, F., Pappas, G.: Diffusing private data over networks (2015). arXiv preprint arXiv:1511.06253

  17. Alaggan, M., Gambs, S., Kermarrec, A.M.: Heterogeneous differential privacy (2015). arXiv preprint arXiv:1504.06998

  18. Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876, pp. 265–284. Springer, Heidelberg (2006)

  19. Yuan, M., Chen, L., Yu, P.S.: Personalized privacy protection in social networks. Proc. VLDB Endowment 4(2), 141–150 (2010). ACM

  20. Koufogiannis, F., Han, S., Pappas, G.J.: Gradual release of sensitive data under differential privacy (2015). arXiv preprint arXiv:1504.00429

  21. Zhang, N., Li, M., Lou, W.: Distributed data mining with differential privacy. In: IEEE International Conference on Communications (ICC 2011). IEEE (2011)

Acknowledgment

The authors would like to thank the anonymous reviewers for their valuable comments. This work was supported by the National Natural Science Foundation of China under Grant 61272479, the National 973 Program of China under Grant 2013CB338001, and the Strategic Priority Research Program of Chinese Academy of Sciences under Grant XDA06010702.

Author information

Corresponding author

Correspondence to Wen-Tao Zhu.

Appendix

Proof of Theorem 3: Assume that \(V_1\sim Lap(\frac{1}{\epsilon _1})\) and \(V_2\sim Lap(\frac{1}{\epsilon _2})\), where \(\epsilon _1 < \epsilon _2\). The densities of \(V_1\) and \(V_2\) are \(h_{\epsilon _1}(x)=\frac{\epsilon _1}{2}\exp (-\epsilon _1|x|)\) and \(h_{\epsilon _2}(y)=\frac{\epsilon _2}{2}\exp (-\epsilon _2|y|)\), respectively. Let the conditional distribution of \(V_1\) given \(V_2=y\) have density \(\phi (x-y)\). We use \(g(x,y)\) to denote the joint density of \(V_1\) and \(V_2\), so \(g(x,y)\) satisfies:

$$\begin{aligned} g(x,y)=\phi (x-y)h_{\epsilon _2}(y) \end{aligned}$$
(3)

The density (3) must have the following marginal distributions:

$$\begin{aligned} \int _{-\infty }^\infty g(x,y)dy=h_{\epsilon _1}(x)\quad \int _{-\infty }^\infty g(x,y)dx=h_{\epsilon _2}(y) \end{aligned}$$
(4)

The second condition in Eq. (4) holds for any density \(\phi \). The first is a convolution, \(\int _{-\infty }^\infty \phi (x-y)h_{\epsilon _2}(y)dy=h_{\epsilon _1}(x)\). We use the convolution theorem to solve this equation:

$$\begin{aligned} \mathcal {F}_{\phi }(s)=\frac{\mathcal {F}_{h_{\epsilon _1}}(s)}{\mathcal {F}_{h_{\epsilon _2}}(s)}, \end{aligned}$$
(5)

where \(\mathcal {F}\) denotes the Fourier transform. Since \(\mathcal {F}_{h_{\epsilon }}(s)=\frac{\epsilon ^2}{\epsilon ^2+s^2}=\frac{1}{1+\frac{s^2}{\epsilon ^2}}\), (5) gives:

$$\begin{aligned} \mathcal {F}_{\phi }(s)=\frac{\mathcal {F}_{h_{\epsilon _1}}(s)}{\mathcal {F}_{h_{\epsilon _2}}(s)}=\frac{1+\frac{s^2}{\epsilon _2^2}}{1+\frac{s^2}{\epsilon _1^2}} =(\frac{\epsilon _1}{\epsilon _2})^2(1+\frac{\epsilon _2^2-\epsilon _1^2}{\epsilon _1^2+s^2}) \end{aligned}$$

We set \(b(x)=|x|^{1-\frac{n}{2}}K_{\frac{n}{2}-1}(|x|)\), where \(K\) denotes the modified Bessel function of the second kind, for which \(\mathcal {F}_{b}(s)=\frac{(2\pi )^\frac{n}{2}}{1+s^2}\); here \(n=1\). So,

$$\begin{aligned} \phi (x)=\mathcal {F}^{-1}[\mathcal {F}_{\phi }](x)=(\frac{\epsilon _1}{\epsilon _2})^2[\delta (x)+(\frac{\epsilon _2^2}{\epsilon _1^2}-1)\frac{\epsilon _1}{\sqrt{2\pi }}\sqrt{\epsilon _1 |x|}K_{-\frac{1}{2}}(\epsilon _1|x|)] \end{aligned}$$

and, for \(x\ne 0\), using \(K_{-\frac{1}{2}}(z)=\sqrt{\frac{\pi }{2z}}\exp (-z)\),

$$\begin{aligned} \phi (x)=(1-\frac{\epsilon _1^2}{\epsilon _2^2})\frac{\epsilon _1}{\sqrt{2\pi }}\sqrt{\epsilon _1 |x|}K_{-\frac{1}{2}}(\epsilon _1|x|)=(1-\frac{\epsilon _1^2}{\epsilon _2^2})\frac{\epsilon _1}{2}\exp (-\epsilon _1|x|) \end{aligned}$$

Further details of the proof are given by Koufogiannis et al. [16].
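
The result of the proof can be checked numerically. The following is a minimal sketch and not the authors' implementation. It assumes that \(\phi \) is read as a mixture: a point mass at 0 with weight \((\frac{\epsilon _1}{\epsilon _2})^2\) (the \(\delta (x)\) term) and a \(Lap(\frac{1}{\epsilon _1})\) component with the remaining weight. Under that reading, adding a draw from \(\phi \) to \(V_2\) should yield a value distributed as \(Lap(\frac{1}{\epsilon _1})\). All function names and the parameter values 0.5 and 2.0 are illustrative choices, not taken from the paper.

```python
import math
import random


def sample_laplace(eps, rng):
    """Draw from Lap(1/eps), i.e. density (eps/2) * exp(-eps * |x|)."""
    magnitude = rng.expovariate(eps)  # exponential with mean 1/eps
    return magnitude if rng.random() < 0.5 else -magnitude


def sample_phi(eps1, eps2, rng):
    """Draw the extra noise with density phi (assumed mixture form).

    With probability (eps1/eps2)^2 the draw is exactly 0 (the delta term);
    otherwise it is Laplace with scale 1/eps1.
    """
    if not 0 < eps1 < eps2:
        raise ValueError("need 0 < eps1 < eps2")
    if rng.random() < (eps1 / eps2) ** 2:
        return 0.0
    return sample_laplace(eps1, rng)


def coarsen(v2, eps1, eps2, rng):
    """Turn a Lap(1/eps2) noise value into a noisier one: V1 = V2 + N."""
    return v2 + sample_phi(eps1, eps2, rng)


def laplace_cdf(x, eps):
    """CDF of Lap(1/eps)."""
    if x < 0:
        return 0.5 * math.exp(eps * x)
    return 1.0 - 0.5 * math.exp(-eps * x)


def ks_distance(samples, eps):
    """Kolmogorov-Smirnov distance between the samples and Lap(1/eps)."""
    xs = sorted(samples)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        c = laplace_cdf(x, eps)
        worst = max(worst, abs(c - i / n), abs((i + 1) / n - c))
    return worst


if __name__ == "__main__":
    rng = random.Random(0)
    eps1, eps2 = 0.5, 2.0  # illustrative values with eps1 < eps2
    v1 = [coarsen(sample_laplace(eps2, rng), eps1, eps2, rng)
          for _ in range(50_000)]
    # A small distance to Lap(1/eps1) and a large one to Lap(1/eps2)
    # are what the proof predicts.
    print("KS distance to Lap(1/eps1):", ks_distance(v1, eps1))
    print("KS distance to Lap(1/eps2):", ks_distance(v1, eps2))
```

Because \(V_1\) is built from \(V_2\) rather than drawn independently, two responses at different protection levels share the same underlying noise, which is the property such a construction relies on when responses may be combined.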

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Yan, S., Pan, S., Zhao, Y., Zhu, WT. (2016). Towards Privacy-Preserving Data Mining in Online Social Networks: Distance-Grained and Item-Grained Differential Privacy. In: Liu, J., Steinfeld, R. (eds) Information Security and Privacy. ACISP 2016. Lecture Notes in Computer Science, vol 9722. Springer, Cham. https://doi.org/10.1007/978-3-319-40253-6_9

  • DOI: https://doi.org/10.1007/978-3-319-40253-6_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-40252-9

  • Online ISBN: 978-3-319-40253-6

  • eBook Packages: Computer Science, Computer Science (R0)
