Using Social Network Analysis for Spam Detection

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6007)

Abstract

Content filtering is a popular approach to spam detection that analyzes message content to identify spam. In this paper, we evaluate the use of social network analysis measures to improve the performance of a content filtering model. By measuring the degree centrality of message transfer agents, we observed performance improvements for spam detection in repeated experiments; for example, a 70% increase in the proportion of spam detected at a false positive rate of 0.1%. We were also able to use anomaly detection to identify mislabeled messages in a publicly available spam data set. Messages claiming unusually long paths between the sender’s message transfer agent and the recipient’s message transfer agent turned out to be spam.
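As a rough illustration of the two signals named in the abstract, the sketch below computes normalized degree centrality (degree divided by n - 1, in the sense of Freeman's definition) for message transfer agents and the number of hops each message claims. It is not the authors' implementation: the host names, the function names `degree_centrality` and `path_length`, and the assumption that each message has already been reduced to an ordered list of relay hosts (for example, parsed from its Received headers) are all hypothetical.

```python
from collections import defaultdict


def degree_centrality(paths):
    """Normalized degree centrality of each MTA in an undirected relay graph.

    `paths` is a list of relay paths; each path is an ordered list of
    MTA host names (assumed to be extracted from a message beforehand).
    """
    neighbors = defaultdict(set)
    for path in paths:
        for node in path:
            neighbors[node]  # register the node even if it has no edges
        for a, b in zip(path, path[1:]):
            if a != b:  # ignore self-loops
                neighbors[a].add(b)
                neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    # Freeman's normalization: degree divided by the maximum possible degree.
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def path_length(path):
    """Number of hops a message claims between sender and recipient MTAs."""
    return max(len(path) - 1, 0)


# Hypothetical relay paths, one per message.
paths = [
    ["mta-a.example", "relay.example", "mx.recipient.example"],
    ["mta-b.example", "relay.example", "mx.recipient.example"],
    ["mta-c.example", "mx.recipient.example"],
]

centrality = degree_centrality(paths)
hops = [path_length(p) for p in paths]
```

In a setup like this, the centrality of a message's sending agent and its claimed hop count could be appended to the content features of a filter, and unusually large hop counts could be passed to an anomaly detector. How the paper actually combines these features is not stated in the abstract.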

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

DeBarr, D., Wechsler, H. (2010). Using Social Network Analysis for Spam Detection. In: Chai, SK., Salerno, J.J., Mabry, P.L. (eds) Advances in Social Computing. SBP 2010. Lecture Notes in Computer Science, vol 6007. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12079-4_10

  • DOI: https://doi.org/10.1007/978-3-642-12079-4_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12078-7

  • Online ISBN: 978-3-642-12079-4

  • eBook Packages: Computer Science, Computer Science (R0)
