Abstract
This paper proposes a spam filtering method that exploits the fact that most email users also browse the Web. Because the filter learns from each user's Web browsing behavior in the background, it reduces the troublesome maintenance that spam filters usually require. The method learns ham words per user: words are picked from browsed Web pages using TF-IDF and stored in a database called the ham words list. For each received email, the method extracts keywords from the email and from the Web pages that the URLs in the email point to. If any of these keywords are in the ham words list, the email is treated as ham. In our experiments, the method detected several spam emails that a Bayesian filter failed to detect.
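The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`tokenize`, `build_ham_words`, `extract_keywords`, `is_ham`), the `top_k` cutoff, the plain-word tokenizer, and the particular TF-IDF variant are all assumptions made for illustration, and the text of the linked Web pages is passed in directly rather than fetched.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words (assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def build_ham_words(pages, top_k=3):
    """Collect the top_k highest TF-IDF words of each browsed page.

    `pages` is a list of page texts; `top_k` is a hypothetical cutoff.
    """
    docs = [Counter(tokenize(page)) for page in pages]
    n_docs = len(docs)
    # Document frequency: in how many pages each word occurs.
    df = Counter()
    for doc in docs:
        df.update(doc.keys())
    ham_words = set()
    for doc in docs:
        total = sum(doc.values())
        scores = {
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in doc.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        # Words that occur in every page score 0 and are skipped.
        ham_words.update(w for w in ranked[:top_k] if scores[w] > 0)
    return ham_words


def extract_keywords(email_text, linked_pages=()):
    """Keywords of an email plus the text of pages its URLs point to.

    Every token counts as a keyword here, which is a simplification.
    """
    keywords = set(tokenize(email_text))
    for page in linked_pages:
        keywords.update(tokenize(page))
    return keywords


def is_ham(keywords, ham_words):
    """Treat the email as ham if any keyword is in the ham words list."""
    return any(word in ham_words for word in keywords)


browsed = [
    "the jazz concert the jazz festival",
    "the camera lens review the camera",
]
ham_list = build_ham_words(browsed, top_k=1)
print(is_ham(extract_keywords("New jazz album out now"), ham_list))
print(is_ham(extract_keywords("Cheap pills, buy now"), ham_list))
```

In this sketch a single matching keyword is enough to accept an email as ham, mirroring the rule stated in the abstract. A practical filter would likely need a selective keyword extractor and a tuned cutoff, which are left open here.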
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Takashita, T., Itokawa, T., Kitasuka, T., Aritsugi, M. (2008). A Spam Filtering Method Learning from Web Browsing Behavior. In: Lovrek, I., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2008. Lecture Notes in Computer Science, vol 5178. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85565-1_96
DOI: https://doi.org/10.1007/978-3-540-85565-1_96
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85564-4
Online ISBN: 978-3-540-85565-1
eBook Packages: Computer Science, Computer Science (R0)