Using Social Network Analysis for Spam Detection

DeBarr, Dave; Wechsler, Harry

doi:10.1007/978-3-642-12079-4_10

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6007)

Abstract

Content filtering is a popular approach to spam detection. It focuses on analysis of the message content to identify spam. In this paper, we evaluate the use of social network analysis measures to improve the performance of a content filtering model. By measuring the degree centrality of message transfer agents, we observed performance improvements for spam detection in repeated experiments; e.g. a 70% increase in the proportion of spam detected with a false positive rate of 0.1%. We were also able to use anomaly detection to identify mislabeled messages in a publicly available spam data set. Messages claiming unusually long paths between the sender’s message transfer agent and the recipient’s message transfer agent turned out to be spam.
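
To make the measure named in the abstract concrete, here is a minimal sketch (not the authors' implementation) of Freeman's normalized degree centrality for message transfer agents. It assumes the relay paths have already been extracted, for example from `Received:` headers, into ordered lists of host names. The host names, the `relay_paths` data, and the function name `degree_centrality` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def degree_centrality(relay_paths):
    """Return {host: degree / (n - 1)} for an undirected relay graph.

    relay_paths is an iterable of host-name lists, each ordered from the
    sender's message transfer agent to the recipient's (an assumed format).
    """
    neighbors = defaultdict(set)
    for path in relay_paths:
        for host in path:
            neighbors[host]  # register every host, even one with no links
        for src, dst in zip(path, path[1:]):
            if src != dst:  # ignore self-loops
                neighbors[src].add(dst)
                neighbors[dst].add(src)
    n = len(neighbors)
    if n < 2:
        # Normalization by (n - 1) is undefined for fewer than two nodes.
        return {host: 0.0 for host in neighbors}
    return {host: len(adj) / (n - 1) for host, adj in neighbors.items()}


# Hypothetical relay paths, for illustration only.
relay_paths = [
    ["mta-a.example", "relay.example", "mx.recipient.example"],
    ["mta-b.example", "relay.example", "mx.recipient.example"],
    ["mta-c.example", "mx.recipient.example"],
]
centrality = degree_centrality(relay_paths)
print(centrality["relay.example"])  # 0.75: linked to 3 of the 4 other hosts
```

A score like this could be supplied to a classifier as an additional feature alongside content-based features. How the paper combines the two is not stated in the abstract, so that pairing is an assumption here.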

Author information

Authors and Affiliations

Department of Computer Science, George Mason University, Fairfax, VA, 22030-4444
Dave DeBarr & Harry Wechsler


Editor information

Editors and Affiliations

Department of Sociology, University of Hawaii, 2424 Maile Way, Saunders Hall 237, 956-7234, Honolulu, HI, USA
Sun-Ki Chai
Air Force Research Laboratory, Rome Research Site, AFRL/RIEF, 525 Brooks Road, Rome, NY 13441, USA
John J. Salerno
Department of Behavioral and Social Sciences Research, National Institutes of Health (NIH), 31 Center Drive, Bethesda, MD 20892-2027, USA
Patricia L. Mabry

About this paper

Cite this paper

DeBarr, D., Wechsler, H. (2010). Using Social Network Analysis for Spam Detection. In: Chai, SK., Salerno, J.J., Mabry, P.L. (eds) Advances in Social Computing. SBP 2010. Lecture Notes in Computer Science, vol 6007. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12079-4_10

DOI: https://doi.org/10.1007/978-3-642-12079-4_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12078-7
Online ISBN: 978-3-642-12079-4
eBook Packages: Computer Science, Computer Science (R0)
