Abstract
Web content filtering is a means to make end-users aware of the ‘quality’ of Web resources by evaluating their contents and/or characteristics against users’ preferences. Although they can be used for a variety of purposes, Web content filtering tools are mainly deployed as a service for parental control, and for regulating access to Web content by users connected to the networks of enterprises, libraries, schools, etc. Current Web filtering tools are based on well-established techniques, such as data mining and firewall blocking, and they typically cater to the filtering requirements of very specific end-user categories. What is lacking, therefore, is a unified filtering framework that supports all possible application domains and enables interoperability among the different filtering approaches and the systems based on them. In this paper, a multi-strategy approach is described, which integrates the available techniques and focuses on the use of metadata for rating and filtering Web information. The approach consists of a filtering meta-model, referred to as MFM (Multi-strategy Filtering Model), which provides a general representation of the Web content filtering domain, independently of its possible applications, and of two prototype implementations. The prototypes were partially carried out in the framework of the EU projects EUFORBIA and QUATRO, and were designed for different application domains: user protection and Web quality assurance, respectively.
About this article
Cite this article
Bertino, E., Ferrari, E. & Perego, A. A General Framework for Web Content Filtering. World Wide Web 13, 215–249 (2010). https://doi.org/10.1007/s11280-009-0073-5
DOI: https://doi.org/10.1007/s11280-009-0073-5