Spam Source Clustering by Constructing Spammer Network with Correlation Measure

  • Conference paper

Abstract

Spam filtering is one of the most challenging problems in electronic messaging systems. Because spammers usually falsify their origin, recent studies on identifying the real source of spam generally rely on content filtering. We propose a method that identifies spam sources through structural analysis of a complex network. We assume that spam sources belonging to the same group either share the same victim list or use the same spam-hosting program. We treat the spam source-target relationship as a bipartite network and construct a weighted spam source network by projecting it with a correlation measure. We find that community clustering methods are ill-suited to the resulting spammer network. Instead, we group spammers with gradient-based grouping, which uses the correlations between nodes as gradients between them. These are then converted into local minima, which helps to cluster spammers into a few spam source groups. We apply the proposed method to weblog spam data and validate it. The method can also be applied to other categorization problems, such as multiple text categorization and network subunit clustering.
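The pipeline described in the abstract (bipartite projection with a correlation measure, then gradient-based grouping into local minima) can be illustrated with a minimal sketch. The abstract does not give the exact correlation measure or descent rule, so everything specific here is an assumption: Pearson correlation is used for the projection, a node's "energy" is taken to be the negative sum of its positive correlations, and each node steps to its most strongly correlated lower-energy neighbor. The function names `project_sources` and `gradient_groups` and the toy matrix are hypothetical.

```python
import numpy as np

def project_sources(incidence):
    """Project a binary source x target matrix onto sources.

    Assumption: Pearson correlation between rows is the edge weight.
    """
    corr = np.nan_to_num(np.corrcoef(incidence))  # rows are sources
    np.fill_diagonal(corr, 0.0)                   # no self-loops
    return corr

def gradient_groups(corr, threshold=0.0):
    """Assign each source to the local minimum it descends to.

    Assumption: energy = -(sum of positive correlations); each node
    moves to its most strongly correlated neighbor of lower energy.
    """
    weights = np.where(corr > threshold, corr, 0.0)
    energy = -weights.sum(axis=1)
    n = len(energy)
    parent = list(range(n))
    for i in range(n):
        best, best_w = i, 0.0
        for j in np.flatnonzero(weights[i]):
            # Strict (energy, index) ordering rules out cycles.
            lower = (energy[j], j) < (energy[i], i)
            if lower and weights[i, j] > best_w:
                best, best_w = int(j), weights[i, j]
        parent[i] = best

    def descend(i):
        # Follow steepest-descent pointers until a local minimum.
        while parent[i] != i:
            i = parent[i]
        return i

    return [descend(i) for i in range(n)]

# Toy data (hypothetical): sources 0-2 share targets 0-3,
# sources 3-5 share targets 4-7.
incidence = np.array([
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 1, 1, 1],
])
labels = gradient_groups(project_sources(incidence))
print(labels)
```

In this toy case, sources with overlapping target lists end up under the same local minimum, while sources with disjoint lists have negative correlation and are never linked. A real data set would likely need a different threshold and a check for sources with constant rows, whose correlation is undefined.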

Copyright information

© 2009 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Shin, J., Kim, S. (2009). Spam Source Clustering by Constructing Spammer Network with Correlation Measure. In: Zhou, J. (eds) Complex Sciences. Complex 2009. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 4. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02466-5_88

  • DOI: https://doi.org/10.1007/978-3-642-02466-5_88

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02465-8

  • Online ISBN: 978-3-642-02466-5

  • eBook Packages: Computer Science, Computer Science (R0)
