DOI: 10.1145/2566486.2568047
research-article

Quite a mess in my cookie jar!: leveraging machine learning to protect web authentication

Published: 07 April 2014

Abstract

Browser-based defenses have recently been advocated as an effective mechanism to protect web applications against the threats of session hijacking, fixation, and related attacks. In existing approaches, all such defenses ultimately rely on client-side heuristics to automatically detect cookies containing session information, to then protect them against theft or otherwise unintended use. While clearly crucial to the effectiveness of the resulting defense mechanisms, these heuristics have not, as yet, undergone any rigorous assessment of their adequacy. In this paper, we conduct the first such formal assessment, based on a gold set of cookies we collect from 70 popular websites of the Alexa ranking. To obtain the gold set, we devise a semi-automatic procedure that draws on a novel notion of authentication token, which we introduce to capture multiple web authentication schemes. We test existing browser-based defenses in the literature against our gold set, unveiling several pitfalls both in the heuristics adopted and in the methods used to assess them. We then propose a new detection method based on supervised learning, where our gold set is used to train a binary classifier, and report on experimental evidence that our method outperforms existing proposals. Interestingly, the resulting classification, together with our hands-on experience in the construction of the gold set, provides new insight on how web authentication is implemented in practice.
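
As a rough illustration of the kind of client-side heuristic the abstract refers to, the following sketch flags a cookie as a likely session cookie when its name contains a common session keyword, or when its value is long and its characters look random. The keyword list, the thresholds, and the function names are assumptions made for this example; they are neither the heuristics evaluated in the paper nor its classifier.

```python
import math
from collections import Counter

# Hypothetical keyword list and thresholds, chosen only for illustration.
SESSION_KEYWORDS = ("sess", "sid", "auth", "token")
MIN_LENGTH = 16
MIN_ENTROPY_BITS = 3.0


def shannon_entropy(value: str) -> float:
    """Return the Shannon entropy of the characters in value, in bits per character."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_session_cookie(name: str, value: str) -> bool:
    """Naive heuristic: a keyword in the name, or a long, random-looking value."""
    lowered = name.lower()
    if any(keyword in lowered for keyword in SESSION_KEYWORDS):
        return True
    return len(value) >= MIN_LENGTH and shannon_entropy(value) >= MIN_ENTROPY_BITS
```

For instance, `looks_like_session_cookie("PHPSESSID", "x")` returns `True` on the name alone, while `looks_like_session_cookie("lang", "en")` returns `False`. A keyword such as `sid` also matches unrelated names, which hints at why heuristics of this kind need to be checked against a labeled set of cookies.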



    Published In

    WWW '14: Proceedings of the 23rd international conference on World wide web
    April 2014
    926 pages
    ISBN: 9781450327442
    DOI: 10.1145/2566486

    Sponsors

    • IW3C2: International World Wide Web Conference Committee

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. authentication cookies
    2. classification
    3. web security

    Qualifiers

    • Research-article

    Conference

    WWW '14
    Sponsor:
    • IW3C2

    Acceptance Rates

    WWW '14 Paper Acceptance Rate 84 of 645 submissions, 13%;
    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Article Metrics

    • Downloads (last 12 months): 12
    • Downloads (last 6 weeks): 3
    Reflects downloads up to 15 Feb 2025

    Cited By

    • (2022) Enhancing Web Authentication Security Using Random Forest. TENCON 2022 - 2022 IEEE Region 10 Conference (TENCON), pages 1-6. DOI: 10.1109/TENCON55691.2022.9978128. Online publication date: 1-Nov-2022
    • (2021) A preliminary study on the adoption and effectiveness of SameSite cookies as a CSRF defence. 2021 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), pages 49-59. DOI: 10.1109/EuroSPW54576.2021.00012. Online publication date: Sep-2021
    • (2021) Measuring Web Session Security at Scale. Computers and Security, 111:C. DOI: 10.1016/j.cose.2021.102472. Online publication date: 1-Dec-2021
    • (2020) ECEM - Generating Adversarial Logs under Black-box Setting in Web Security. GLOBECOM 2020 - 2020 IEEE Global Communications Conference, pages 1-6. DOI: 10.1109/GLOBECOM42002.2020.9347996. Online publication date: Dec-2020
    • (2019) Mitch: A Machine Learning Approach to the Black-Box Detection of CSRF Vulnerabilities. 2019 IEEE European Symposium on Security and Privacy (EuroS&P), pages 528-543. DOI: 10.1109/EuroSP.2019.00045. Online publication date: Jun-2019
    • (2019) A security feature framework for programming languages to minimize application layer vulnerabilities. Security and Privacy, 3:1. DOI: 10.1002/spy2.95. Online publication date: 7-Nov-2019
    • (2017) Surviving the Web. ACM Computing Surveys, 50:1, pages 1-34. DOI: 10.1145/3038923. Online publication date: 6-Mar-2017
    • (2016) Half-Baked Cookies. Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security, pages 675-685. DOI: 10.1145/2897845.2897889. Online publication date: 30-May-2016
    • (2016) TooKie: A New Way to Secure Sessions. Recent Developments in Intelligent Information and Database Systems, pages 195-207. DOI: 10.1007/978-3-319-31277-4_17. Online publication date: 27-Feb-2016
    • (2015) CookiExt: Patching the browser against session hijacking attacks. Journal of Computer Security, 23:4, pages 509-537. DOI: 10.3233/JCS-150529. Online publication date: 16-Sep-2015
