DOI: 10.1145/2567948.2579048
research-article

Wikipedia as a time machine

Published: 07 April 2014

ABSTRACT

Wikipedia encyclopaedia projects, which consist of vast collections of user-edited articles covering a wide range of topics, are among the most popular websites on the internet. With so many users working collaboratively, mainstream events are often reflected very quickly, both by authors editing content and by users reading articles. With temporal signals such as changing article content, page viewing activity and the link graph readily available, Wikipedia has gained attention in recent years as a source of temporal event information. This paper gives an overview of the characteristics and past work that support the use of Wikipedia (English, in this case) for time-aware information retrieval research. Furthermore, we discuss the main content and metadata temporal signals available, along with illustrative analysis. We briefly discuss the source and nature of each signal, and any issues that may complicate extraction and use. To encourage further temporal research based on Wikipedia, we have released all the distilled datasets referred to in this paper.


    • Published in

      WWW '14 Companion: Proceedings of the 23rd International Conference on World Wide Web
      April 2014
      1396 pages
      ISBN: 9781450327459
      DOI: 10.1145/2567948

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
