Quantification of side-channel information leaks based on data complexity measures for web browsing

He, Zhi-Min; Chan, Patrick P. K.; Yeung, Daniel S.; Pedrycz, Witold; Ng, Wing W. Y.

doi:10.1007/s13042-015-0348-3

Original Article
Published: 03 April 2015

Volume 6, pages 607–619, (2015)
International Journal of Machine Learning and Cybernetics

Zhi-Min He¹,
Patrick P. K. Chan¹,
Daniel S. Yeung¹,
Witold Pedrycz^2,3 &
…
Wing W. Y. Ng¹

Abstract

A website fingerprinting attack can identify the websites a user visits by analyzing side-channel information in network traffic, even when that traffic is transferred through an encrypted tunnel. The security of web browsing can therefore be evaluated by quantifying side-channel information leaks. However, most current leak quantification measures focus on web applications and may be impractical for web browsing because of their time complexity. Although revised models have been proposed to simplify the computations, their assumptions may not suit web browsing. In this paper, the problem of website fingerprinting is analyzed from the viewpoint of pattern classification. Data complexity measures, which quantify the difficulty of separating classes in a classification problem, are applied to quantify the leaks. The performance of these data complexity measures in representing information leaks is discussed and compared with that of existing approaches. The comparison is carried out both conceptually and experimentally, using two website fingerprinting countermeasures: traffic morphing and BuFLO. Moreover, a parameter selection model based on leak quantification is proposed to estimate suitable parameters for a website fingerprinting countermeasure. The experimental results confirm that countermeasures with parameters selected according to the data complexity measures are more secure than those with parameters selected according to other leak quantification measures.
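To make the idea of a data complexity measure concrete, the sketch below computes the maximum Fisher's discriminant ratio, a standard class-separability measure from the pattern classification literature. The abstract does not say which measures the paper uses, so this choice is an assumption made for illustration only. The function name `fisher_ratio` and the two-feature traffic traces (total bytes and packet count per page load) are hypothetical.

```python
from statistics import mean, pvariance


def fisher_ratio(class_a, class_b):
    """Maximum Fisher's discriminant ratio over all features.

    class_a and class_b are lists of equal-length feature vectors,
    one list per class (here: one list per website). For each feature
    the ratio is (mean_a - mean_b)^2 / (var_a + var_b); the largest
    ratio across features is returned.
    """
    n_features = len(class_a[0])
    best = 0.0
    for j in range(n_features):
        a = [row[j] for row in class_a]
        b = [row[j] for row in class_b]
        gap = (mean(a) - mean(b)) ** 2
        spread = pvariance(a) + pvariance(b)
        if spread == 0:
            # No within-class variation: perfectly separable on this
            # feature if the means differ, uninformative otherwise.
            ratio = float("inf") if gap > 0 else 0.0
        else:
            ratio = gap / spread
        best = max(best, ratio)
    return best


# Hypothetical traces: [total kilobytes, packet count] per page load.
site_a = [[1.0, 10.0], [3.0, 10.0]]
site_b = [[5.0, 10.0], [7.0, 10.0]]
score = fisher_ratio(site_a, site_b)
```

Under this reading, a larger ratio means the two sites are easier to tell apart from their traffic features, which would correspond to a larger side-channel leak; a countermeasure that pads or morphs traffic should push the ratio down.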

Acknowledgments

This work is supported by the National Natural Science Foundation of China (61003171, 61272201 and 61003172).

Author information

Authors and Affiliations

School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
Zhi-Min He, Patrick P. K. Chan, Daniel S. Yeung & Wing W. Y. Ng
Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, T6G 2G7, Canada
Witold Pedrycz
Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Witold Pedrycz

Corresponding author

Correspondence to Patrick P. K. Chan.

About this article

Cite this article

He, ZM., Chan, P.P.K., Yeung, D.S. et al. Quantification of side-channel information leaks based on data complexity measures for web browsing. Int. J. Mach. Learn. & Cyber. 6, 607–619 (2015). https://doi.org/10.1007/s13042-015-0348-3

Received: 04 December 2014
Accepted: 10 March 2015
Published: 03 April 2015
Issue Date: August 2015
DOI: https://doi.org/10.1007/s13042-015-0348-3
