Abstract
Annotated Web content, digital libraries, news and media portals, e-commerce web sites, online catalogs, RDF/OWL knowledge bases and online encyclopedias can be considered containers of named entities such as organizations, persons, and locations. Entities are mostly mentioned implicitly in texts or multimedia content, but they are increasingly made explicit in structured annotations such as the ones provided by the Semantic Web. Today, as a result of different research projects and commercial initiatives, systems deal with massive amounts of data that are either explicitly or implicitly related to entities, and this data has to be managed in an efficient way. This paper contributes to Web Science by attempting to measure and interpret trends of entity popularity on the WWW. It takes into consideration the occurrence of named entities in a large news corpus and correlates these findings with an analysis of how entities are searched for, based on a large search engine query log. The study shows that entity popularity follows well-known trends, which can be of interest for several aspects of the development of services and applications on the WWW that deal with large amounts of data about (named) entities.
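The kind of comparison the abstract describes can be illustrated with a minimal sketch. It counts how often each entity appears in a set of news texts and in a set of search queries, then compares the two popularity rankings. Everything here is an assumption made for illustration: the toy texts, the entity list, the naive substring matching, the helper names `count_mentions`, `ranks` and `spearman`, and the choice of Spearman rank correlation, which the abstract does not name as the measure used in the paper.

```python
from collections import Counter


def count_mentions(texts, entities):
    """Count how many texts mention each entity (naive, case-insensitive substring match)."""
    counts = Counter({e: 0 for e in entities})
    for text in texts:
        lowered = text.lower()
        for e in entities:
            if e.lower() in lowered:
                counts[e] += 1
    return counts


def ranks(values):
    """Return ranks starting at 1 for the smallest value; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation, computed as the Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical toy data, not taken from the ANSA or AOL datasets.
entities = ["Rome", "Fiat", "Obama"]
news = ["Obama visits Rome", "Fiat opens plant", "Obama speech",
        "Rome traffic", "Obama budget"]
queries = ["obama news", "rome hotels", "obama biography",
           "fiat 500 price", "cheap flights rome", "obama"]

news_counts = count_mentions(news, entities)
query_counts = count_mentions(queries, entities)
rho = spearman([news_counts[e] for e in entities],
               [query_counts[e] for e in entities])
print(news_counts, query_counts, rho)
```

A rank-based measure is used in this sketch because popularity counts of this kind tend to be heavily skewed, so comparing orderings is less sensitive to a few very frequent entities than comparing raw counts. A real study would also need proper named entity recognition instead of substring matching.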
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fogarolli, A., Giannakopoulos, G., Stoermer, H. (2010). Entity Popularity on the Web: Correlating ANSA News and AOL Search. In: Dicheva, D., Dochev, D. (eds) Artificial Intelligence: Methodology, Systems, and Applications. AIMSA 2010. Lecture Notes in Computer Science, vol 6304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15431-7_15
DOI: https://doi.org/10.1007/978-3-642-15431-7_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15430-0
Online ISBN: 978-3-642-15431-7
eBook Packages: Computer Science, Computer Science (R0)