User Action Based Adaptive Learning with Weighted Bayesian Classification for Filtering Spam Mail

  • Conference paper
AI 2006: Advances in Artificial Intelligence (AI 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4304)

Abstract

E-mail is considered one of the most important communication methods, but most users suffer from spam. Much research has addressed this problem. Previous approaches showed comparatively high performance, but they need several improvements before they can be applied in the real world. First, they need personalized learning for better performance: spam cannot be defined strictly, because what counts as spam depends on each user. Second, they must handle the concept drift or interest drift problem, that is, a user's interests or the concept behind a category may change over time. Many spam filtering systems therefore use continuous learning schemes such as adaptive learning or incremental learning. However, these systems require users to supply feedback or ratings manually, and this inconvenience slows both learning and performance improvement. In this research, we developed an adaptive learning system based on an automatic weighting environment. For the automatic weighting, we categorized six user patterns (actions) in the mail system, and their weights are applied automatically in the learning phase. In our experiments, we demonstrate Bayesian classification in this adaptive learning environment. Using the proposed ideas, we analyze the results in comparison with adaptive learning. Finally, in experiments on real-world data sets, we show the approach's potential for tracking concept drift and interest drift.
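The idea of letting user actions weight the training of a Bayesian classifier can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not list the six actions or their weights, so the action names, labels, and weight values below are hypothetical, and the classifier is an ordinary multinomial naive Bayes in which each observed message contributes its action weight instead of a count of one.

```python
import math
from collections import defaultdict

# Hypothetical mapping from a user action to (implied label, weight).
# The paper's six actions and their weights are not given in the abstract.
ACTION_WEIGHTS = {
    "reply": ("ham", 2.0),
    "forward": ("ham", 1.5),
    "move_to_folder": ("ham", 1.0),
    "read": ("ham", 0.5),
    "delete_unread": ("spam", 1.0),
    "report_spam": ("spam", 2.0),
}

LABELS = ("ham", "spam")


class WeightedNaiveBayes:
    """Multinomial naive Bayes whose counts are scaled by action weights."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant
        self.class_weight = defaultdict(float)
        self.token_weight = {label: defaultdict(float) for label in LABELS}
        self.vocab = set()

    def observe(self, text, action):
        """Update the model from a message and the action the user took."""
        label, weight = ACTION_WEIGHTS[action]
        self.class_weight[label] += weight
        for token in text.lower().split():
            self.token_weight[label][token] += weight
            self.vocab.add(token)

    def log_score(self, text, label):
        """Unnormalized log posterior of `label` for `text`."""
        total = sum(self.class_weight[c] for c in LABELS)
        prior = (self.class_weight[label] + self.alpha) / (
            total + self.alpha * len(LABELS)
        )
        score = math.log(prior)
        denom = sum(self.token_weight[label].values()) + self.alpha * len(self.vocab)
        for token in text.lower().split():
            if token in self.vocab:  # tokens never seen in training are skipped
                weight = self.token_weight[label].get(token, 0.0)
                score += math.log((weight + self.alpha) / denom)
        return score

    def classify(self, text):
        return max(LABELS, key=lambda label: self.log_score(text, label))


nb = WeightedNaiveBayes()
nb.observe("cheap pills offer", "report_spam")
nb.observe("meeting agenda tomorrow", "reply")
print(nb.classify("cheap offer"))
print(nb.classify("meeting tomorrow"))
```

Because every action updates the model as it happens, no separate rating step is needed. Tracking drift over time would additionally require some form of forgetting or re-weighting of older observations, which this sketch leaves out.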

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kim, HJ., Shrestha, J., Kim, HN., Jo, GS. (2006). User Action Based Adaptive Learning with Weighted Bayesian Classification for Filtering Spam Mail. In: Sattar, A., Kang, Bh. (eds) AI 2006: Advances in Artificial Intelligence. AI 2006. Lecture Notes in Computer Science, vol 4304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11941439_83

  • DOI: https://doi.org/10.1007/11941439_83

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49787-5

  • Online ISBN: 978-3-540-49788-2

  • eBook Packages: Computer Science, Computer Science (R0)
