Wikipedia as a Source of Ontological Knowledge: State of the Art and Application

Fogarolli, Angela

doi:10.1007/978-3-642-16793-5_1

Part of the book series: Studies in Computational Intelligence (SCI, volume 329)

Abstract

This chapter argues that Wikipedia can serve as a source of knowledge for building semantically enabled applications. It consists of two parts. First, we give an overview of the research fields that attempt to extract the knowledge humans have encoded in Wikipedia. The extracted knowledge can then be used to create a new generation of intelligent applications that build on the collaborative character of Wikipedia rather than on domain ontologies, which require the intervention of knowledge engineers and domain experts. Second, as a proof of concept, we describe an application whose intelligent behavior is achieved by using Wikipedia knowledge for the automatic annotation and representation of multimedia presentations.

Author information

Authors and Affiliations

University of Trento, Via Sommarive 14, 38123, Trento, Italy
Angela Fogarolli

Editor information

Editors and Affiliations

Department of Computer Sciences Multimedia and Telecommunications, Open University of Catalonia, Rbla. Poblenou. 156, 08018, Barcelona, Spain
Santi Caballé
Department of Languages and Informatics Systems, Polytechnic University of Catalonia, Campus Nord, Ed. Omega, C/Jordi Girona 1-3, 08034, Barcelona, Spain
Fatos Xhafa
Scientific Network for Innovation and Research Excellence, Machine Intelligence Research Labs (MIR Labs), P.O. Box 2259, 98071-2259, Auburn, Washington, USA
Ajith Abraham

About this chapter

Cite this chapter

Fogarolli, A. (2010). Wikipedia as a Source of Ontological Knowledge: State of the Art and Application. In: Caballé, S., Xhafa, F., Abraham, A. (eds) Intelligent Networking, Collaborative Systems and Applications. Studies in Computational Intelligence, vol 329. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16793-5_1

DOI: https://doi.org/10.1007/978-3-642-16793-5_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16792-8
Online ISBN: 978-3-642-16793-5
eBook Packages: Engineering, Engineering (R0)
