research-article

Supporting collaboration in Wikipedia between language communities

Authors:
Ranjitha Gurunath Kulkarni (Carnegie Mellon University, Pittsburgh, Pennsylvania, USA)
Gaurav Trivedi (National Institute of Technology Karnataka, India)
Tushar Suresh (National Institute of Technology Karnataka, India)
Miaomiao Wen (Carnegie Mellon University, Pittsburgh, Pennsylvania, USA)
Zeyu Zheng (Carnegie Mellon University, Pittsburgh, Pennsylvania, USA)
Carolyn Rose (Carnegie Mellon University, Pittsburgh, Pennsylvania, USA)

ICIC '12: Proceedings of the 4th international conference on Intercultural Collaboration, March 2012, Pages 47–56
https://doi.org/10.1145/2160881.2160890

Published: 21 March 2012

ABSTRACT

This paper describes an application of machine translation technology to support collaboration in Wikipedia. Wikipedia hosts separate editions in hundreds of languages. While some content is specific to a single language edition, some topics have pages in several of them. Similarly, while some users participate in only one Wikipedia, others play a bridging role between these sub-communities and help maintain corresponding pages in different Wikipedias. Because these bridging users are a minority, a support tool that stretches their effort further by indicating where it is needed could be of great benefit to the community. An evaluation of the proposed approach suggests that such a tool could substantially reduce the effort involved in playing this bridging role on Wikipedia.

Index Terms

Supporting collaboration in Wikipedia between language communities
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
2. Human-centered computing
  1. Human computer interaction (HCI)

Published in
ICIC '12: Proceedings of the 4th international conference on Intercultural Collaboration
March 2012
170 pages
ISBN: 9781450308182
DOI: 10.1145/2160881
General Chairs:
Ravi Vatrapu (Copenhagen Business School, Denmark)
Vanessa Evers (University of Twente, The Netherlands)
K. B. Akhilesh (Indian Institute of Science, India)
Program Chairs:
Bonnie Nardi (University of California Irvine, USA)
Martha Maznevski (International Institute for Management Development (IMD), Switzerland)
Copyright © 2012 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
computer supported cooperative work
cross-lingual document similarity
wikipedia
Acceptance Rates
Overall Acceptance Rate: 47 of 77 submissions, 61%