Abstract
This paper presents a web page classification method that combines a filtering strategy based on web page structure with a support vector machine (SVM), and that is intended for classifying large numbers of web pages. The algorithm first applies an initial filter layer that quickly recalls candidate pages according to their structure, and then trains an SVM classifier to perform a second classification pass that improves accuracy. Experiments show that the classification algorithm is feasible.
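The two-layer idea described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the structural rule, the text features, or the SVM kernel. The link-density threshold, the bag-of-words features, the toy training sentences, and all function names (`structure_filter`, `train_linear_svm`, `classify_page`) are hypothetical, and a small linear SVM trained by subgradient descent on the hinge loss stands in for whatever SVM library the paper used.

```python
from html.parser import HTMLParser


class PageStats(HTMLParser):
    """Collects two structural signals: the number of <a> tags and the words."""

    def __init__(self):
        super().__init__()
        self.links = 0
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links += 1

    def handle_data(self, data):
        self.words.extend(data.lower().split())


def structure_filter(html, max_link_density=0.5):
    """Layer 1 (assumed rule): recall pages whose links-per-word ratio is low."""
    stats = PageStats()
    stats.feed(html)
    stats.close()
    if not stats.words:
        return False, []
    return stats.links / len(stats.words) <= max_link_density, stats.words


def featurize(words, vocab):
    """Bag-of-words counts over a fixed vocabulary (assumed feature choice)."""
    return [words.count(term) for term in vocab]


def score(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(samples, labels, epochs=50, lr=0.1, lam=0.01):
    """Layer 2: linear SVM fitted by subgradient descent on the hinge loss."""
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * score(w, b, x)
            w = [wi * (1 - lr * lam) for wi in w]  # L2 regularisation step
            if margin < 1:  # sample violates the margin: move towards it
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


# Toy training set (made up for illustration): +1 = sports, -1 = other.
TRAIN = [
    ("football match goal", 1),
    ("stock market price", -1),
    ("goal team football", 1),
    ("price bank stock", -1),
]
VOCAB = sorted({term for text, _ in TRAIN for term in text.split()})
W, B = train_linear_svm(
    [featurize(text.split(), VOCAB) for text, _ in TRAIN],
    [label for _, label in TRAIN],
)


def classify_page(html):
    """Returns 0 if layer 1 rejects the page, otherwise the SVM label (+1/-1)."""
    keep, words = structure_filter(html)
    if not keep:
        return 0
    return 1 if score(W, B, featurize(words, VOCAB)) >= 0 else -1
```

In this sketch the cheap structural check runs first, so the SVM only scores pages that survive it, which is the reason a filter layer helps when the number of pages is large.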
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, Y., Yao, Z. (2019). Multi-layer Filtering Webpage Classification Method Based on SVM. In: Milošević, D., Tang, Y., Zu, Q. (eds) Human Centered Computing. HCC 2019. Lecture Notes in Computer Science, vol 11956. Springer, Cham. https://doi.org/10.1007/978-3-030-37429-7_56
DOI: https://doi.org/10.1007/978-3-030-37429-7_56
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37428-0
Online ISBN: 978-3-030-37429-7
eBook Packages: Computer Science, Computer Science (R0)