Abstract
Currently, the process of blocking deviant teaching websites is carried out manually by the Malaysian authorities. In addition, no Web filtering product is offered for filtering religious content, especially in the Malay language. Web filtering can be used to protect against inappropriate content and to prevent misuse of the network; hence, it can be used to filter the content of suspicious websites and limit the dissemination of such Web pages. The purpose of this paper is to filter deviant teaching Web pages and classify them into three categories: deviate, suspicious and clean. Three term weighting schemes were used as feature selection techniques: Term Frequency Inverse Document Frequency (TFIDF), Entropy and Modified Entropy (M. Entropy). A Support Vector Machine (SVM) is used for the classification process. The results show that M. Entropy is a more suitable term weighting scheme for Islamic Web page filtering than TFIDF and Entropy.
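To make the term weighting step concrete, the following is a minimal sketch of TFIDF weighting in Python. It assumes the common textbook form, raw term frequency multiplied by log(N / df); the abstract does not state which TFIDF variant the paper uses, and the Entropy and Modified Entropy schemes are not reproduced here. The function name `tfidf_weights` and the toy Malay tokens are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = tf * log(N / df), where tf is the raw
    count of the term in the document, N is the number of documents,
    and df is the number of documents containing the term.
    """
    n_docs = len(docs)
    # Document frequency: count each term once per document.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        tf = Counter(doc)
        weighted.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weighted


# Hypothetical toy corpus of already tokenized (and stemmed) pages.
docs = [
    ["ajaran", "sesat", "ajaran"],
    ["ajaran", "islam"],
    ["berita", "islam"],
]
weights = tfidf_weights(docs)
```

In a pipeline like the one the abstract describes, the highest-weighted terms would be kept as features and the resulting vectors passed to an SVM classifier; that step is omitted here because the paper's feature counts and SVM settings are not given in this section.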
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Zamry, N.M., Maarof, M.A., Zainal, A. (2014). Islamic Web Content Filtering and Categorization on Deviant Teaching. In: Herawan, T., Ghazali, R., Deris, M. (eds) Recent Advances on Soft Computing and Data Mining. Advances in Intelligent Systems and Computing, vol 287. Springer, Cham. https://doi.org/10.1007/978-3-319-07692-8_63
DOI: https://doi.org/10.1007/978-3-319-07692-8_63
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07691-1
Online ISBN: 978-3-319-07692-8
eBook Packages: Engineering, Engineering (R0)