
Artificial Immune System Based Web Page Classification

  • Conference paper
Software Engineering in Intelligent Systems

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 349)

Abstract

Automated classification of web pages is an important research direction in web mining. It aims to construct a classification model from labeled web documents that can then classify new instances. Machine learning algorithms have been adapted to text classification problems, including web document classification. Artificial immune systems are a branch of computational intelligence, inspired by biological immune systems, that is used to solve a variety of computational problems, including classification. This paper examines the effectiveness and suitability of artificial immune system based approaches for web page classification. To this end, two artificial immune system based classification algorithms, Immunos-1 and Immunos-99, are compared with two standard machine learning techniques, the C4.5 decision tree classifier and the Naïve Bayes classifier. The algorithms are experimentally evaluated on 50 data sets obtained from DMOZ (Open Directory Project). The experimental results indicate that the artificial immune system based algorithms achieve higher predictive performance for web page classification.
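To make the baseline side of this comparison concrete, the following is a minimal sketch of a multinomial Naïve Bayes text classifier, one of the two standard techniques named in the abstract. It is not the paper's implementation: the toy documents, the class labels, the whitespace tokenizer, the add-one smoothing, and the function names `train_naive_bayes`, `predict`, and `accuracy` are all assumptions chosen for illustration, and no DMOZ data is used.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Collect per-class document and word counts from labeled text."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()  # assumed tokenizer: whitespace split
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token in vocab:  # words unseen in training are ignored
                # Add-one (Laplace) smoothed word likelihood.
                count = word_counts[label][token] + 1
                score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true one."""
    hits = sum(predict(model, d) == y for d, y in zip(docs, labels))
    return hits / len(docs)


# Hypothetical toy data standing in for labeled web documents.
train_docs = [
    "football match score league",
    "team wins league cup",
    "stock market shares fall",
    "bank raises interest rates",
]
train_labels = ["sports", "sports", "business", "business"]

model = train_naive_bayes(train_docs, train_labels)
print(predict(model, "league cup final score"))
print(accuracy(model, train_docs, train_labels))
```

A comparison like the one the abstract describes would compute the same accuracy measure on held-out documents for each classifier, rather than on the training documents as in this toy run.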


Author information

Corresponding author

Correspondence to Aytuğ Onan.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Onan, A. (2015). Artificial Immune System Based Web Page Classification. In: Silhavy, R., Senkerik, R., Oplatkova, Z., Prokopova, Z., Silhavy, P. (eds) Software Engineering in Intelligent Systems. Advances in Intelligent Systems and Computing, vol 349. Springer, Cham. https://doi.org/10.1007/978-3-319-18473-9_19

  • DOI: https://doi.org/10.1007/978-3-319-18473-9_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18472-2

  • Online ISBN: 978-3-319-18473-9

  • eBook Packages: Engineering, Engineering (R0)
