Abstract
Spear-phishing is an effective attack vector for infiltrating companies and organisations. Drawing on the wealth of personal information available online, an attacker can craft seemingly legitimate emails and trick victims into opening malicious attachments and links. Although anti-spoofing techniques exist, their adoption is still limited and alternative protection approaches are needed. In this paper, we show that a sender leaves content-agnostic traits in the structure of an email. Based on these traits, we develop a method capable of learning profiles for a large set of senders and identifying spoofed emails as deviations from these profiles. We evaluate our approach on over 700,000 emails from 16,000 senders and demonstrate that it can discriminate thousands of senders, identifying spoofed emails with a 90% detection rate and fewer than 1 false positive in 10,000 emails. Moreover, we show that individual traits are hard to guess and that spoofing only succeeds if entire emails of the sender are available to the attacker.
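To make the idea of content-agnostic traits concrete, the following minimal sketch derives structural features from an email (which headers are present, the order in which they appear, and the MIME part types) and scores a new message by its distance to the closest known email of the claimed sender. The feature set, the Jaccard distance, the nearest-neighbour scoring, and all function names (`structural_traits`, `build_profile`, `deviation`) are illustrative assumptions; the abstract does not specify the authors' traits or classifier, so this is not their method.

```python
from __future__ import annotations

from email import message_from_string


def structural_traits(raw_email: str) -> set[str]:
    """Return traits that depend on the email's structure, not on its text."""
    msg = message_from_string(raw_email)
    names = [name.lower() for name in msg.keys()]
    # Which headers are present at all.
    traits = {f"has:{n}" for n in names}
    # Adjacent header pairs capture the order in which headers appear.
    traits |= {f"order:{a}>{b}" for a, b in zip(names, names[1:])}
    # Content types of all MIME parts describe the message layout.
    traits |= {f"part:{part.get_content_type()}" for part in msg.walk()}
    return traits


def build_profile(raw_emails: list[str]) -> list[set[str]]:
    """Hypothetical sender profile: the trait sets of the sender's known emails."""
    return [structural_traits(raw) for raw in raw_emails]


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two trait sets (1.0 means identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def deviation(profile: list[set[str]], raw_email: str) -> float:
    """Distance in [0, 1] to the nearest known email of the claimed sender."""
    traits = structural_traits(raw_email)
    return 1.0 - max(jaccard(traits, known) for known in profile)
```

In such a scheme, a message whose deviation exceeds a threshold chosen on held-out mail would be flagged as a possible spoof. Changing only the body text leaves the score unchanged, which is what makes the traits content-agnostic.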
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Gascon, H., Ullrich, S., Stritter, B., Rieck, K. (2018). Reading Between the Lines: Content-Agnostic Detection of Spear-Phishing Emails. In: Bailey, M., Holz, T., Stamatogiannakis, M., Ioannidis, S. (eds) Research in Attacks, Intrusions, and Defenses. RAID 2018. Lecture Notes in Computer Science, vol 11050. Springer, Cham. https://doi.org/10.1007/978-3-030-00470-5_4
DOI: https://doi.org/10.1007/978-3-030-00470-5_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00469-9
Online ISBN: 978-3-030-00470-5
eBook Packages: Computer Science, Computer Science (R0)