Abstract
Many Wikipedia articles are incomplete because contributors do not always supply complete information. However, Wikipedia has many language versions, and they have been developed independently of one another. Therefore, if complementary information is supplied from other language versions, users can obtain more information about a topic than any single language version of an article provides. In this study, we focus on multilingual Wikipedia and propose a method of extracting good-quality complementary information from Wikipedia in other languages. Specifically, we compare a Wikipedia article that contains less information with articles that contain more information. From Wikipedia articles that share the same theme but are written in different languages, we extract the information that differs as complementary information. We select the comparison target articles based on a link graph, because information contained in one article may be spread across multiple pages in another language. Furthermore, some low-quality information may be extracted as complementary information, because Wikipedia articles are written not only by good editors but also by bad editors such as vandals. We therefore propose a method of calculating the quality of information based on its editors, and we use it to extract good-quality complementary information.
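The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each article is already split into sections whose terms have been translated into a common language. It treats a section as complementary when no section of the target article is similar to it, and it scores quality as a contribution-weighted average of editor scores. The names `Section`, `jaccard`, `section_quality`, and `complementary_sections`, the Jaccard measure, the thresholds, and the editor scores are all hypothetical. The link-graph selection of comparison articles is omitted.

```python
from dataclasses import dataclass


@dataclass
class Section:
    title: str
    terms: frozenset       # terms already mapped to a common language (assumption)
    contributions: dict    # editor name -> number of characters contributed


def jaccard(a, b):
    """Jaccard similarity of two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def section_quality(section, editor_score, default=0.5):
    """Average editor score, weighted by how much text each editor wrote."""
    total = sum(section.contributions.values())
    if total == 0:
        return default
    weighted = sum(editor_score.get(editor, default) * chars
                   for editor, chars in section.contributions.items())
    return weighted / total


def complementary_sections(target, others, editor_score,
                           sim_threshold=0.3, quality_threshold=0.6):
    """Sections in `others` that `target` lacks and that pass the quality bar."""
    result = []
    for sec in others:
        # Highest similarity between this section and any target section.
        best = max((jaccard(sec.terms, t.terms) for t in target), default=0.0)
        if best < sim_threshold and section_quality(sec, editor_score) >= quality_threshold:
            result.append(sec)
    return result


# Toy data: one target-language article and sections from another language version.
target = [Section("History", frozenset({"founded", "1900", "city"}), {"alice": 100})]
others = [
    Section("History", frozenset({"founded", "1900", "city", "mayor"}), {"bob": 80}),
    Section("Cuisine", frozenset({"noodle", "festival"}), {"bob": 90, "vandal": 10}),
    Section("Rumors", frozenset({"alien", "secret"}), {"vandal": 50}),
]
scores = {"alice": 0.9, "bob": 0.8, "vandal": 0.1}  # hypothetical editor scores

picked = complementary_sections(target, others, scores)
print([s.title for s in picked])  # ['Cuisine']
```

In this toy run, "History" is rejected because the target already covers it, and "Rumors" is rejected because its only editor has a low score. How editor scores would be derived is left open here; the abstract states only that quality is calculated based on the editors.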
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Suzuki, Y., Fujiwara, Y., Konishi, Y., Nadamoto, A. (2012). Good Quality Complementary Information for Multilingual Wikipedia. In: Wang, X.S., Cruz, I., Delis, A., Huang, G. (eds) Web Information Systems Engineering - WISE 2012. WISE 2012. Lecture Notes in Computer Science, vol 7651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35063-4_14
DOI: https://doi.org/10.1007/978-3-642-35063-4_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35062-7
Online ISBN: 978-3-642-35063-4
eBook Packages: Computer Science, Computer Science (R0)