
Defeating traffic analysis via differential privacy: a case study on streaming traffic

Regular contribution, International Journal of Information Security

Abstract

In this paper, we explore the adaptation of techniques previously used in the domains of adversarial machine learning and differential privacy to mitigate the ML-powered analysis of streaming traffic. Our findings are twofold. First, constructing adversarial samples effectively confounds an adversary with a predetermined classifier but is less effective when the adversary can adapt to the defense by using alternative classifiers or by training the classifier with adversarial samples. Second, differential-privacy guarantees are very effective against such statistical-inference-based traffic analysis while remaining agnostic to the machine learning classifiers used by the adversary. We propose three mechanisms for enforcing differential privacy for encrypted streaming traffic and evaluate their security and utility. Our empirical implementation and evaluation suggest that the proposed statistical privacy approaches are promising solutions in the underlying scenarios.
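To make the differential-privacy approach concrete, the following is a minimal sketch of the generic Laplace mechanism applied to a bytes-per-period traffic trace; it is not one of the paper's three mechanisms, and the trace values, sensitivity, and epsilon below are hypothetical choices for illustration only.

```python
import random


def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)


def perturb_series(bytes_per_period, sensitivity, epsilon):
    """Add Laplace noise with scale sensitivity/epsilon to each period's
    byte count, clamping at zero since traffic volume cannot be negative."""
    scale = sensitivity / epsilon
    return [max(0.0, v + laplace_noise(scale)) for v in bytes_per_period]


# Hypothetical bytes-per-period trace of a streaming session.
trace = [12000, 340, 9800, 410, 11500, 290]
noisy = perturb_series(trace, sensitivity=12000, epsilon=1.0)
```

Note that in practice the noisy values would be realized by padding (and possibly delaying) packets, which is why clamping at zero matters: traffic can be inflated but not removed.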


Notes

  1. The model converged after 40 epochs. Training for 1000 epochs improved the accuracy by only 0.024.

  2. https://docs.python.org/3/tutorial/floatingpoint.html

  3. \(\textit{thres} \)(\(d^*\),0.25,30) = 0.0000111524321020, \(\textit{thres} \)(\(d_{\mathrm {L1}}\),0.25,30) = 0.0000024160161657.

  4. https://github.com/jpillora/xhook

  5. https://github.com/nicolaspanel/numjs

  6. https://github.com/mvarshney/simjs-source

  7. A burst is the total size of all packets whose timestamps are no farther apart than a threshold. Here, the threshold is set to 0.5s.

  8. In previous sections, the evaluations were performed on the \( BPB \) feature; here, we show that extracting more features does not substantially improve the classification.

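The burst feature described in Note 7 can be sketched as follows; this is an illustrative implementation of that definition (consecutive packets whose inter-arrival gaps stay within the 0.5 s threshold form one burst), and the packet list below is a hypothetical example, not data from the paper.

```python
def extract_bursts(packets, gap_threshold=0.5):
    """Group a time-sorted list of (timestamp_seconds, size_bytes) packets
    into bursts: consecutive packets whose inter-arrival gap is at most
    gap_threshold seconds belong to the same burst. Returns the total
    size of each burst."""
    bursts = []
    current_size = 0
    last_ts = None
    for ts, size in packets:
        if last_ts is not None and ts - last_ts > gap_threshold:
            bursts.append(current_size)  # gap too large: close the burst
            current_size = 0
        current_size += size
        last_ts = ts
    if current_size:
        bursts.append(current_size)  # flush the final burst
    return bursts


# Two bursts: three packets with gaps <= 0.5 s, then one isolated packet.
pkts = [(0.0, 1500), (0.3, 1500), (0.6, 700), (2.0, 1500)]
burst_sizes = extract_bursts(pkts)  # [3700, 1500]
```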

Funding

This project is supported in part by NSF grants 1718084, 1750809, and 1801494, and by grant W911NF-17-1-0370 from the Army Research Office. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein.

Author information


Corresponding author

Correspondence to Yinqian Zhang.

Ethics declarations

Conflicts of interest

The authors have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.


Appendix A

Theorem 1

Consider a dataset \(\mathbb {D}\); we have a method A that is \(\epsilon \)-private and a method B that is \((d^*, \epsilon )\)-private. We denote the maximum and minimum \(d^*\) distances in \(\mathbb {D}\) as \(d_{max}\) and \(d_{min}\), respectively. Then, we have:

(1) If B is \((d^*,\epsilon )\)-private, then B is \((\epsilon d_{max})\)-private.

(2) If A is \(\epsilon \)-private, then A is \((d^*,\frac{\epsilon }{d_{min}})\)-private.

Proof According to the definitions, we have:

$$\begin{aligned}&A:\mathbb {P}(A(x)\in Z) \le \exp (\epsilon _A) \times \mathbb {P}(A(x')\in Z) \end{aligned}$$
(A.1)
$$\begin{aligned}&B:\mathbb {P}(B(x)\in Z) \le \exp (\epsilon _B \times d^*(x,x')) \times \mathbb {P}(B(x')\in Z)\nonumber \\ \end{aligned}$$
(A.2)

For B, we have:

$$\begin{aligned} \epsilon _B \times d_{min} \le \epsilon _B \times d^*(x,x') \le \epsilon _B \times d_{max} \end{aligned}$$
(A.3)

If B is (\(d^*,\epsilon \))-private,

$$\begin{aligned} \frac{\mathbb {P}(B(x)\in Z)}{\mathbb {P}(B(x')\in Z)} \le \exp (\epsilon \times d^*(x,x')) \le \exp (\epsilon \times d_{max})\nonumber \\ \end{aligned}$$
(A.4)

So B is at least \((\epsilon d_{max})\)-private. Similarly, if A is \(\epsilon \)-private, setting \(\epsilon =\epsilon ' \times d^*(x,x')\) yields:

$$\begin{aligned} \epsilon ' = \frac{\epsilon }{d^*(x,x')} \le \frac{\epsilon }{d_{min}} \end{aligned}$$
(A.5)

So A is at least \((d^*,\frac{\epsilon }{d_{min}})\)-private. \(\square \)
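The two conversions in Theorem 1 can be checked numerically; the values of \(\epsilon \), \(d_{min}\), and \(d_{max}\) below are hypothetical and chosen only to illustrate the bounds.

```python
import math

# Hypothetical privacy parameter and range of d* distances in the dataset.
eps, d_min, d_max = 0.5, 0.2, 4.0
distances = [0.2, 1.0, 2.5, 4.0]  # sample d*(x, x') values in [d_min, d_max]

# Part (1): for a (d*, eps)-private B, the ratio exp(eps * d) for any
# distance d in [d_min, d_max] never exceeds exp(eps * d_max),
# so B is (eps * d_max)-private.
for d in distances:
    assert math.exp(eps * d) <= math.exp(eps * d_max)

# Part (2): for an eps-private A, writing eps = eps' * d gives
# eps' = eps / d <= eps / d_min, so A is (d*, eps / d_min)-private.
for d in distances:
    assert eps / d <= eps / d_min
```

Note the direction of each bound: part (1) uses the largest distance (worst case for the ratio), while part (2) uses the smallest distance (worst case for the rescaled \(\epsilon '\)).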


Cite this article

Zhang, X., Hamm, J., Reiter, M.K. et al. Defeating traffic analysis via differential privacy: a case study on streaming traffic. Int. J. Inf. Secur. 21, 689–706 (2022). https://doi.org/10.1007/s10207-021-00574-3
