Automatic violent content web filtering approach based on the KDD process
International Journal of Web Information Systems
ISSN: 1744-0084
Article publication date: 21 November 2008
Abstract
Purpose
The growth of the web and the increasing number of documents available electronically have been paralleled by the emergence of harmful web page content such as pornography, violence and racism. This has created a need for filtering systems designed to secure internet access. Most such systems deal mainly with adult content and focus on blocking pornography, marginalizing violence. The purpose of this paper is to propose a violent web content detection and filtering system, which uses textual and structural content‐based analysis.
Design/methodology/approach
The violent web content detection and filtering system uses textual and structural content‐based analysis based on a violent keyword dictionary. The paper focuses on the preparation of the keyword dictionary, and presents a comparative study of different data mining techniques for blocking violent web pages.
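To make the idea of dictionary-based textual and structural analysis concrete, the following minimal sketch counts dictionary keywords separately in the page title, in link anchors and in the remaining body text. It is an illustration only: the keyword set, the feature names and the choice of regions are assumptions, since the abstract does not list the dictionary or the exact features used.

```python
from html.parser import HTMLParser
import re

# Hypothetical mini-dictionary; the paper's actual dictionary is not given here.
VIOLENT_KEYWORDS = {"kill", "murder", "torture", "bomb"}


class PageFeatureParser(HTMLParser):
    """Collects lowercase word tokens from title, link anchors and body text."""

    TRACKED = ("title", "a", "script", "style")

    def __init__(self):
        super().__init__()
        self._open = {tag: 0 for tag in self.TRACKED}  # open-tag counters
        self.words = {"title": [], "link": [], "text": []}
        self.n_links = 0

    def handle_starttag(self, tag, attrs):
        if tag in self._open:
            self._open[tag] += 1
        if tag == "a":
            self.n_links += 1

    def handle_endtag(self, tag):
        if tag in self._open and self._open[tag] > 0:
            self._open[tag] -= 1

    def handle_data(self, data):
        # Ignore script and style bodies: they are not visible page text.
        if self._open["script"] or self._open["style"]:
            return
        if self._open["title"]:
            region = "title"
        elif self._open["a"]:
            region = "link"
        else:
            region = "text"
        self.words[region].extend(re.findall(r"[a-z]+", data.lower()))


def extract_features(html):
    """Return per-region keyword counts and ratios plus a link count."""
    parser = PageFeatureParser()
    parser.feed(html)
    parser.close()
    features = {}
    for region, tokens in parser.words.items():
        hits = sum(1 for token in tokens if token in VIOLENT_KEYWORDS)
        features[region + "_keyword_count"] = hits
        features[region + "_keyword_ratio"] = hits / len(tokens) if tokens else 0.0
    features["link_count"] = parser.n_links
    return features
```

A feature vector of this kind could then be passed to any of the data mining techniques being compared; the classifier itself is deliberately left out of the sketch.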
Findings
The solution presented in this paper showed its effectiveness by achieving an 89 per cent classification accuracy rate on its test data set.
Research limitations/implications
Many future work directions can be considered. This paper analyzes only the textual and structural content of the web page; an additional analysis of the visual content is one direction for future work. Research is also underway to develop effective filtering tools for other types of harmful web pages, such as racist pages.
Originality/value
The paper makes two major contributions. First, it studies and compares several decision tree building algorithms for constructing a violent web page classifier, based on textual and structural content‐based analysis, to improve web filtering. Second, it eases laborious dictionary building by automatically finding discriminative, indicative keywords.
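One common way to find discriminative keywords automatically is to rank candidate words by information gain, the split criterion used by the ID3 family of decision tree algorithms. The sketch below shows that ranking on a toy labelled corpus. It is an assumption for illustration: the abstract does not say which selection criterion or which decision tree algorithms the authors used, and the function names and data here are hypothetical.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    result = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        result -= p * math.log2(p)
    return result


def information_gain(docs, labels, keyword):
    """Entropy reduction from splitting documents on keyword presence.

    docs is a list of token sets; labels is the parallel list of classes.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    with_kw = [y for d, y in zip(docs, labels) if keyword in d]
    without_kw = [y for d, y in zip(docs, labels) if keyword not in d]
    remainder = (len(with_kw) / n) * entropy(with_kw) \
        + (len(without_kw) / n) * entropy(without_kw)
    return entropy(labels) - remainder


def rank_keywords(docs, labels, candidates):
    """Sort candidate keywords from most to least discriminative."""
    return sorted(candidates,
                  key=lambda kw: information_gain(docs, labels, kw),
                  reverse=True)


# Toy corpus (hypothetical): label 1 = violent page, 0 = non-violent page.
toy_docs = [{"kill", "news"}, {"kill", "attack"},
            {"news", "weather"}, {"sport", "news"}]
toy_labels = [1, 1, 0, 0]
ranked = rank_keywords(toy_docs, toy_labels, ["news", "kill", "sport"])
```

On this toy corpus "kill" separates the two classes perfectly and is ranked first, while "news" appears in both classes and carries much less information.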
Citation
Hammami, M., Guermazi, R. and Ben Hamadou, A. (2008), "Automatic violent content web filtering approach based on the KDD process", International Journal of Web Information Systems, Vol. 4 No. 4, pp. 441-464. https://doi.org/10.1108/17440080810919486
Publisher
Emerald Group Publishing Limited
Copyright © 2008, Emerald Group Publishing Limited