Feature Selection Based Correlation Attack on HTTPS Secure Searching

Sarfaraz, Aaliya; Khan, Ahmed

doi:10.1007/s11277-018-5989-6

Published: 22 September 2018

Volume 103, pages 2995–3008, (2018)

Wireless Personal Communications

268 Accesses
6 Citations

Abstract

Search engines play an irreplaceable role in organizing and accessing web information, and Internet users routinely query them to retrieve it. Sensitive data about a user’s intentions or behavior can be inferred from the query phrases, the returned results pages, and the webpages visited subsequently. To protect communications from eavesdropping, some search engines adopt HTTPS by default to provide bidirectional encryption. However, this only encrypts the channel between the user and the search engine: the majority of webpages indexed in search engines’ results pages are still hosted on HTTP websites, and their contents can be observed by attackers once the user clicks on these links. Imitating attackers, we propose a novel approach for attacking secure search by correlating encrypted search traffic with unencrypted webpages. We show that a simple weighted TF–DF mechanism is sufficient for selecting candidate phrases to guess. Imitating search engine users, we query these candidates and enumerate the webpages indexed in the results pages; this lets us hit the exact query phrases and, at the same time, reconstruct the user’s web-surfing trail through DNS-based URL comparison and network traffic analysis based on flow feature statistics. In an experiment with 28 search phrases, we achieved a 67.86% hit rate at the first guess and a 96.43% hit rate within three guesses. Our empirical research shows that HTTPS traffic can be correlated and de-anonymized through HTTP traffic, and that the secure search offered by search engines is not always secure unless HTTPS is enabled by default everywhere.
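
The abstract does not define the weighted TF–DF mechanism, so the following is only a minimal sketch of what such candidate selection could look like, not the authors' method. It assumes TF is a term's total count across the observed HTTP pages, DF is the number of pages containing the term, and the two are combined linearly. The function names, the `tf_weight` parameter, the stopword list, and the sample pages are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real attack would need a fuller one.
STOPWORDS = {"a", "an", "and", "in", "of", "the", "to"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def tfdf_candidates(pages, top_k=3, tf_weight=0.5):
    """Rank terms by a weighted mix of normalized TF and DF (assumed formula).

    pages: text of the unencrypted pages observed after one encrypted search.
    Returns the top_k (term, score) pairs, best first.
    """
    tf = Counter()  # total occurrences of each term across all pages
    df = Counter()  # number of pages that contain each term
    for page in pages:
        tokens = tokenize(page)
        tf.update(tokens)
        df.update(set(tokens))
    if not tf:
        return []
    max_tf = max(tf.values())
    n_pages = len(pages)
    scores = {
        term: tf_weight * tf[term] / max_tf + (1 - tf_weight) * df[term] / n_pages
        for term in tf
    }
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


observed_pages = [
    "Cheap flights to Lisbon and hotel deals",
    "Lisbon travel guide: flights, food, and trams",
    "Book flights to Lisbon today",
]
candidates = tfdf_candidates(observed_pages, top_k=2)
print([term for term, _ in candidates])
```

In this sketch, document frequency is used directly rather than inverted as in TF–IDF, on the assumption that a term shared by many pages clicked after the same search is more likely to belong to the query phrase.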



Author information

Authors and Affiliations

Department of Computer Science, COMSATS University, Islamabad, Pakistan
Aaliya Sarfaraz & Ahmed Khan

Corresponding author

Correspondence to Ahmed Khan.

About this article

Cite this article

Sarfaraz, A., Khan, A. Feature Selection Based Correlation Attack on HTTPS Secure Searching. Wireless Pers Commun 103, 2995–3008 (2018). https://doi.org/10.1007/s11277-018-5989-6

Issue Date: December 2018
