
A General Framework for Web Content Filtering

World Wide Web

Abstract

Web content filtering is a means to make end-users aware of the ‘quality’ of Web resources by evaluating their contents and/or characteristics against users’ preferences. Although they can be used for a variety of purposes, Web content filtering tools are mainly deployed as a service for parental control, and for regulating access to Web content by users connected to the networks of enterprises, libraries, schools, etc. Current Web filtering tools are based on well-established techniques, such as data mining and firewall blocking, and they typically cater to the filtering requirements of very specific end-user categories. What is lacking, therefore, is a unified filtering framework that supports all the possible application domains and makes it possible to enforce interoperability among the different filtering approaches and the systems based on them. In this paper, a multi-strategy approach is described, which integrates the available techniques and focuses on the use of metadata for rating and filtering Web information. The approach consists of a filtering meta-model, referred to as MFM (Multi-strategy Filtering Model), which provides a general representation of the Web content filtering domain, independently of its possible applications, and of two prototype implementations, partially carried out in the framework of the EU projects EUFORBIA and QUATRO and designed for different application domains: user protection and Web quality assurance, respectively.

Author information

Corresponding author

Correspondence to Andrea Perego.

About this article

Cite this article

Bertino, E., Ferrari, E. & Perego, A. A General Framework for Web Content Filtering. World Wide Web 13, 215–249 (2010). https://doi.org/10.1007/s11280-009-0073-5

  • DOI: https://doi.org/10.1007/s11280-009-0073-5
