Abstract
Wikipedia, a killer application in Web 2.0, has embraced the power of collaborative editing to harness collective intelligence. It features many attractive characteristics, such as an entity-based link graph, abundant categorization, and a semi-structured layout, and can serve as an ideal source from which to extract high-quality, well-structured data. In this chapter, we first propose several solutions to extract knowledge from Wikipedia. We consider not only the relational summaries of articles (infoboxes) but also semi-automatically extract information from the article text using the available structured content. Because this setting differs from information extraction from the Web, new problems must be tackled, such as the lack of redundancy in Wikipedia, which we address by extending traditional machine learning algorithms to work with little labeled data. Furthermore, we exploit the widespread categories as a complementary way to discover additional knowledge. Benefiting from both structured and textual information, we additionally provide a suggestion service for Wikipedia authoring. To facilitate semantic reuse, our proposal provides users with link, category, and infobox content suggestions. The proposed enhancements can be applied to attract more contributors and lighten the burden on professional editors. Finally, we developed an enhanced search system that eases the process of exploiting Wikipedia. To provide a user-friendly interface, it extends the faceted search interface with relation navigation and lets users express complex information needs interactively. To achieve efficient query answering, it extends scalable IR engines to index and search both the textual and structured information with integrated ranking support.
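To make the last point concrete, the following is a minimal sketch of one way an inverted index can cover both article text and infobox-style attribute values, so that keyword and facet constraints are answered by intersecting posting sets. It is not the chapter's system: the class `HybridIndex`, its methods, and the three sample articles are hypothetical, and ranking is omitted entirely.

```python
import re
from collections import defaultdict


class HybridIndex:
    """Toy inverted index over text terms and (attribute, value) pairs.

    Hypothetical illustration only; names and data are assumptions.
    """

    def __init__(self):
        # Posting lists: key -> set of article titles containing that key.
        self.postings = defaultdict(set)

    def add(self, title, text, infobox):
        # Index every word of the article text as a ("term", word) key.
        for term in re.findall(r"\w+", text.lower()):
            self.postings[("term", term)].add(title)
        # Index every infobox entry as an ("attr", attribute, value) key.
        for attribute, value in infobox.items():
            self.postings[("attr", attribute, value.lower())].add(title)

    def search(self, keywords=(), facets=None):
        # Conjunctive query: intersect the posting sets of all constraints.
        keys = [("term", k.lower()) for k in keywords]
        keys += [("attr", a, v.lower()) for a, v in (facets or {}).items()]
        if not keys:
            return set()
        result = set(self.postings.get(keys[0], set()))
        for key in keys[1:]:
            result &= self.postings.get(key, set())
        return result

    def facet_counts(self, hits, attribute):
        # Count, per value of `attribute`, how many hits carry that value.
        counts = {}
        for key, titles in self.postings.items():
            if key[0] == "attr" and key[1] == attribute:
                n = len(titles & hits)
                if n:
                    counts[key[2]] = n
        return counts


# Hypothetical sample data, not taken from Wikipedia dumps.
index = HybridIndex()
index.add("Shanghai", "Shanghai is a city in China",
          {"country": "China", "type": "city"})
index.add("Lyon", "Lyon is a city in France",
          {"country": "France", "type": "city"})
index.add("Yangtze", "The Yangtze is a river in China",
          {"country": "China", "type": "river"})

city_hits = index.search(["city"])
river_hits = index.search(["china"], {"type": "river"})
country_facets = index.facet_counts(city_hits, "country")
```

Here `city_hits` holds the articles whose text mentions "city", `river_hits` combines a keyword with a structured constraint, and `country_facets` gives the per-value counts a faceted interface could display next to the results for further narrowing.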
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Wang, H., Penin, T., Fu, L., Liu, Q., Xue, G., Yu, Y. (2009). Semantic Services for Wikipedia. In: King, I., Baeza-Yates, R. (eds) Weaving Services and People on the World Wide Web. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00570-1_2
DOI: https://doi.org/10.1007/978-3-642-00570-1_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00569-5
Online ISBN: 978-3-642-00570-1
eBook Packages: Computer Science (R0)