DOI: 10.1145/2207676.2208553
research-article

Omnipedia: Bridging the Wikipedia Language Gap

Published: 05 May 2012

ABSTRACT

We present Omnipedia, a system that allows Wikipedia readers to gain insight from up to 25 language editions of Wikipedia simultaneously. Omnipedia highlights the similarities and differences that exist among Wikipedia language editions, and makes salient information that is unique to each language as well as that which is shared more widely. We detail solutions to numerous front-end and algorithmic challenges inherent to providing users with a multilingual Wikipedia experience. These include visualizing content in a language-neutral way and aligning data in the face of diverse information organization strategies. We present a study of Omnipedia that characterizes how people interact with information using a multilingual lens. We found that users actively sought information exclusive to unfamiliar language editions and strategically compared how language editions defined concepts. Finally, we briefly discuss how Omnipedia generalizes to other domains facing language barriers.


Supplemental Material

paperfile282-3.mov (47.4 MB)


    • Published in

      CHI '12: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
      May 2012
      3276 pages
      ISBN:9781450310154
      DOI:10.1145/2207676

      Copyright © 2012 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%
