
Defeating traffic analysis via differential privacy: a case study on streaming traffic

Regular contribution, International Journal of Information Security

Abstract

In this paper, we explore the adaptation of techniques previously used in the domains of adversarial machine learning and differential privacy to mitigate the ML-powered analysis of streaming traffic. Our findings are twofold. First, constructing adversarial samples effectively confounds an adversary with a predetermined classifier but is less effective when the adversary can adapt to the defense by using alternative classifiers or by training the classifier with adversarial samples. Second, differential-privacy guarantees are very effective against such statistical-inference-based traffic analysis while remaining agnostic to the machine learning classifiers used by the adversary. We propose three mechanisms for enforcing differential privacy for encrypted streaming traffic and evaluate their security and utility. Our empirical implementation and evaluation suggest that the proposed statistical privacy approaches are promising solutions in the underlying scenarios.
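To make the differential-privacy approach concrete, the following is a minimal sketch of the generic Laplace mechanism applied to a bytes-per-period traffic trace; it is not one of the paper's three mechanisms, and the trace values, sensitivity, and epsilon below are hypothetical choices for illustration only.

```python
import random


def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)


def perturb_series(bytes_per_period, sensitivity, epsilon):
    """Add Laplace noise with scale sensitivity/epsilon to each period's
    byte count, clamping at zero since traffic volume cannot be negative."""
    scale = sensitivity / epsilon
    return [max(0.0, v + laplace_noise(scale)) for v in bytes_per_period]


# Hypothetical bytes-per-period trace of a streaming session.
trace = [12000, 340, 9800, 410, 11500, 290]
noisy = perturb_series(trace, sensitivity=12000, epsilon=1.0)
```

Note that in practice the noisy values would be realized by padding (and possibly delaying) packets, which is why clamping at zero matters: traffic can be inflated but not removed.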


Notes

  1. The model converged after 40 epochs. Training for 1000 epochs improved the accuracy by only 0.024.

  2. https://docs.python.org/3/tutorial/floatingpoint.html

  3. \(\textit{thres} \)(\(d^*\),0.25,30) = 0.0000111524321020, \(\textit{thres} \)(\(d_{\mathrm {L1}}\),0.25,30) = 0.0000024160161657.

  4. https://github.com/jpillora/xhook

  5. https://github.com/nicolaspanel/numjs

  6. https://github.com/mvarshney/simjs-source

  7. A burst is the total size of all packets whose timestamps are no farther apart than a threshold. Here, the threshold is set to 0.5s.

  8. In previous sections, the evaluations were performed on the \( BPB \) feature; here, we show that extracting more features does not substantially improve the classification.

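The burst feature described in Note 7 can be sketched as follows; this is an illustrative implementation of that definition (consecutive packets whose inter-arrival gaps stay within the 0.5 s threshold form one burst), and the packet list below is a hypothetical example, not data from the paper.

```python
def extract_bursts(packets, gap_threshold=0.5):
    """Group a time-sorted list of (timestamp_seconds, size_bytes) packets
    into bursts: consecutive packets whose inter-arrival gap is at most
    gap_threshold seconds belong to the same burst. Returns the total
    size of each burst."""
    bursts = []
    current_size = 0
    last_ts = None
    for ts, size in packets:
        if last_ts is not None and ts - last_ts > gap_threshold:
            bursts.append(current_size)  # gap too large: close the burst
            current_size = 0
        current_size += size
        last_ts = ts
    if current_size:
        bursts.append(current_size)  # flush the final burst
    return bursts


# Two bursts: three packets with gaps <= 0.5 s, then one isolated packet.
pkts = [(0.0, 1500), (0.3, 1500), (0.6, 700), (2.0, 1500)]
burst_sizes = extract_bursts(pkts)  # [3700, 1500]
```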

Funding

This project is supported in part by NSF grants 1718084, 1750809, and 1801494, and by grant W911NF-17-1-0370 from the Army Research Office. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein.

Author information


Corresponding author

Correspondence to Yinqian Zhang.

Ethics declarations

Conflicts of interest

The authors have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.


Appendix A

Theorem 1

Consider a dataset \(\mathbb {D}\); we have a method A that is \(\epsilon \)-private and a method B that is \((d^*, \epsilon )\)-private. We denote the maximum and minimum \(d^*\) distances in \(\mathbb {D}\) as \(d_{max}\) and \(d_{min}\), respectively. Then, we have:

(1) If B is \((d^*,\epsilon )\)-private, then B is \((\epsilon d_{max})\)-private.

(2) If A is \(\epsilon \)-private, then A is \((d^*,\frac{\epsilon }{d_{min}})\)-private.

Proof According to the definitions, we have:

$$\begin{aligned}&A:\mathbb {P}(A(x)\in Z) \le \exp (\epsilon _A) \times \mathbb {P}(A(x')\in Z) \end{aligned}$$
(A.1)
$$\begin{aligned}&B:\mathbb {P}(B(x)\in Z) \le \exp (\epsilon _B \times d^*(x,x')) \times \mathbb {P}(B(x')\in Z)\nonumber \\ \end{aligned}$$
(A.2)

For B, we have:

$$\begin{aligned} \epsilon _B \times d_{min} \le \epsilon _B \times d^*(x,x') \le \epsilon _B \times d_{max} \end{aligned}$$
(A.3)

If B is (\(d^*,\epsilon \))-private,

$$\begin{aligned} \frac{\mathbb {P}(B(x)\in Z)}{\mathbb {P}(B(x')\in Z)} \le \exp (\epsilon \times d^*(x,x')) \le \exp (\epsilon \times d_{max})\nonumber \\ \end{aligned}$$
(A.4)

So B is at least \((\epsilon d_{max})\)-private. Similarly, if A is \(\epsilon \)-private, setting \(\epsilon =\epsilon ' \times d^*(x,x')\) yields:

$$\begin{aligned} \epsilon ' = \frac{\epsilon }{d^*(x,x')} \le \frac{\epsilon }{d_{min}} \end{aligned}$$
(A.5)

So A is at least \((d^*,\frac{\epsilon }{d_{min}})\)-private. \(\square \)
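The two conversions in Theorem 1 can be checked numerically; the values of \(\epsilon \), \(d_{min}\), and \(d_{max}\) below are hypothetical and chosen only to illustrate the bounds.

```python
import math

# Hypothetical privacy parameter and range of d* distances in the dataset.
eps, d_min, d_max = 0.5, 0.2, 4.0
distances = [0.2, 1.0, 2.5, 4.0]  # sample d*(x, x') values in [d_min, d_max]

# Part (1): for a (d*, eps)-private B, the ratio exp(eps * d) for any
# distance d in [d_min, d_max] never exceeds exp(eps * d_max),
# so B is (eps * d_max)-private.
for d in distances:
    assert math.exp(eps * d) <= math.exp(eps * d_max)

# Part (2): for an eps-private A, writing eps = eps' * d gives
# eps' = eps / d <= eps / d_min, so A is (d*, eps / d_min)-private.
for d in distances:
    assert eps / d <= eps / d_min
```

Note the direction of each bound: part (1) uses the largest distance (worst case for the ratio), while part (2) uses the smallest distance (worst case for the rescaled \(\epsilon '\)).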


Cite this article

Zhang, X., Hamm, J., Reiter, M.K. et al. Defeating traffic analysis via differential privacy: a case study on streaming traffic. Int. J. Inf. Secur. 21, 689–706 (2022). https://doi.org/10.1007/s10207-021-00574-3
