Entity Popularity on the Web: Correlating ANSA News and AOL Search

  • Conference paper
Artificial Intelligence: Methodology, Systems, and Applications (AIMSA 2010)

Abstract

Annotated Web content, digital libraries, news and media portals, e-commerce web sites, online catalogs, RDF/OWL knowledge bases and online encyclopedias can be considered containers of named entities such as organizations, persons, and locations. Entities are mostly mentioned implicitly in texts or multimedia content, but they are increasingly made explicit in structured annotations such as the ones provided by the Semantic Web. Today, as a result of different research projects and commercial initiatives, systems deal with massive amounts of data that are either explicitly or implicitly related to entities, which have to be managed in an efficient way. This paper contributes to Web Science by attempting to measure and interpret trends of entity popularity on the WWW, taking into consideration the occurrence of named entities in a large news corpus, and correlating these findings with analysis results on how entities are searched for, based on a large search engine query log. The study shows that entity popularity follows well-known trends, which can be of interest for several aspects of the development of services and applications on the WWW that deal with large amounts of data about (named) entities.
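The abstract does not state which statistic is used to correlate entity occurrences in news with entity occurrences in search queries. As a minimal sketch of what such a comparison could look like, the Python below counts mentions of a fixed entity list in two small text collections and computes a Spearman rank correlation between the two count vectors. The function names, the substring matching, the choice of Spearman correlation, and the toy sentences are all assumptions made for illustration; none of them is taken from the paper.

```python
from collections import Counter
from math import sqrt


def count_mentions(texts, entities):
    """Count how many texts mention each entity (case-insensitive substring match).

    Substring matching is a simplifying assumption, not the paper's method.
    """
    counts = Counter({entity: 0 for entity in entities})
    for text in texts:
        lowered = text.lower()
        for entity in entities:
            if entity.lower() in lowered:
                counts[entity] += 1
    return counts


def average_ranks(values):
    """Rank values in ascending order starting at 1; ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: the Pearson correlation of the average ranks."""
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mean_x, mean_y = sum(rank_x) / n, sum(rank_y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    var_x = sum((a - mean_x) ** 2 for a in rank_x)
    var_y = sum((b - mean_y) ** 2 for b in rank_y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)


# Invented toy data standing in for a news corpus and a query log.
entities = ["Rome", "Fiat", "Obama"]
news_items = ["Rome hosts summit", "Fiat opens plant near Rome", "Obama visits Rome"]
search_queries = ["rome hotels", "obama speech", "rome weather", "fiat 500 price"]

news_counts = count_mentions(news_items, entities)
query_counts = count_mentions(search_queries, entities)
rho = spearman(
    [news_counts[e] for e in entities],
    [query_counts[e] for e in entities],
)
print(f"Spearman correlation between news and search popularity: {rho:.3f}")
```

A rank-based measure is used here because popularity counts tend to be heavily skewed, so a few very frequent entities would otherwise dominate a linear correlation. Real data would also need proper entity recognition and name normalization rather than substring matching.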

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fogarolli, A., Giannakopoulos, G., Stoermer, H. (2010). Entity Popularity on the Web: Correlating ANSA News and AOL Search. In: Dicheva, D., Dochev, D. (eds) Artificial Intelligence: Methodology, Systems, and Applications. AIMSA 2010. Lecture Notes in Computer Science, vol 6304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15431-7_15

  • DOI: https://doi.org/10.1007/978-3-642-15431-7_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15430-0

  • Online ISBN: 978-3-642-15431-7

  • eBook Packages: Computer Science, Computer Science (R0)