ABSTRACT
Children spend significant amounts of time on the Internet. Recent studies have shown that during these periods they are often not under adult supervision. This work presents an automatic approach to identifying suitable web pages for children based on topical and non-topical web page aspects. We discuss the characteristics of children's web sites with respect to recent findings in children's psychology and cognitive sciences. Finally, we evaluate our approach in a large-scale user study, finding that it compares favourably to state-of-the-art methods while approximating human performance.
Index Terms
- Web page classification on child suitability