Quantification of side-channel information leaks based on data complexity measures for web browsing

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Website fingerprinting attacks can identify the websites a user visits by analyzing side-channel information in network traffic, even when that traffic is transferred through an encrypted tunnel. The security of web browsing can be evaluated by quantifying these side-channel information leaks. However, most current leak quantification measures focus on web applications and may be impractical for web browsing because of their time complexity. Although revised models have been proposed to simplify the computations, their assumptions may not be suitable for web browsing. In this paper, the problem of website fingerprinting is analyzed from the viewpoint of pattern classification. Data complexity measures, which quantify the difficulty of separating classes in a classification problem, are applied to quantify the leaks. The performance of these data complexity measures in representing information leaks is discussed and compared with that of existing approaches. The comparison is carried out both conceptually and experimentally, using two website fingerprinting countermeasures: traffic morphing and BuFLO. Moreover, a parameter selection model based on leak quantification is proposed to estimate suitable parameters for a website fingerprinting countermeasure. The experimental results confirm that countermeasures with parameters selected according to the data complexity measures are more secure than those with parameters selected according to other leak quantification measures.
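
As a rough illustration of what a data complexity measure computes, the sketch below evaluates the maximum Fisher discriminant ratio (often called F1 in the data complexity literature) on toy traffic features for two websites, before and after a coarse padding step. The abstract does not say which measures the paper uses, so the choice of F1, the feature layout (total bytes, packet count), the sample values, and the helper `pad_to_block` are all assumptions made for illustration; the padding is a simplified stand-in and is not an implementation of BuFLO or traffic morphing.

```python
import numpy as np

def fisher_ratio(X, y):
    """Maximum Fisher discriminant ratio (F1) over all features, two classes."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    a, b = X[y == 0], X[y == 1]
    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2  # squared gap between class means
    den = a.var(axis=0) + b.var(axis=0)           # summed within-class variances
    with np.errstate(divide="ignore", invalid="ignore"):
        ratios = num / den
    # 0/0 means the feature is constant in both classes: it separates nothing.
    ratios = np.nan_to_num(ratios, nan=0.0, posinf=np.inf)
    return float(ratios.max())

def pad_to_block(X, block):
    """Hypothetical coarse padding: round each feature up to a multiple of `block`."""
    X = np.asarray(X, dtype=float)
    return np.ceil(X / block) * block

# Toy traces (made-up values): columns are total bytes and packet count.
X = np.array([
    [1000, 10], [1100, 11], [1200, 12],   # website 0
    [1500, 14], [1600, 15], [1700, 16],   # website 1
], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

padded = pad_to_block(X, block=np.array([1000.0, 10.0]))

print("F1 without padding:", fisher_ratio(X, y))
print("F1 with padding:   ", fisher_ratio(padded, y))
```

A higher F1 means at least one feature separates the two websites well, which under this reading corresponds to a larger leak; a value near zero means the observed features carry little class information. In a parameter selection setting, one could sweep the hypothetical `block` value and keep the smallest one that pushes the measure below a chosen threshold.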

Acknowledgments

This work is supported by the National Natural Science Foundation of China (61003171, 61272201 and 61003172).

Corresponding author

Correspondence to Patrick P. K. Chan.

About this article

Cite this article

He, ZM., Chan, P.P.K., Yeung, D.S. et al. Quantification of side-channel information leaks based on data complexity measures for web browsing. Int. J. Mach. Learn. & Cyber. 6, 607–619 (2015). https://doi.org/10.1007/s13042-015-0348-3

  • DOI: https://doi.org/10.1007/s13042-015-0348-3
