Text Mining for Spam Filtering

Kołcz, Aleksander

doi:10.1007/978-0-387-30164-8_828

Editor information

Editors and Affiliations

Claude Sammut, School of Computer Science and Engineering, University of New South Wales, Sydney, Australia, 2052
Geoffrey I. Webb, Faculty of Information Technology, Clayton School of Information Technology, Monash University, P.O. Box 63, Victoria, Australia, 3800

Cite this entry

Kołcz, A. (2011). Text Mining for Spam Filtering. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30164-8_828

DOI: https://doi.org/10.1007/978-0-387-30164-8_828
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-30768-8
Online ISBN: 978-0-387-30164-8
eBook Packages: Computer Science; Reference Module Computer Science and Engineering
