A Method for Automatically Extracting Domain Semantic Networks from Wikipedia

Xavier, Clarissa Castellã; de Lima, Vera Lúcia Strube

doi:10.1007/978-3-642-28885-2_10

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7243)

Included in the following conference series:

International Conference on Computational Processing of the Portuguese Language

Abstract

This paper describes a method for automatically extracting, from Wikipedia, domain semantic networks of concepts connected by non-specific relations. We propose an approach based on category and link structure analysis. The method consists of two main tasks: concept extraction and relation acquisition. For each task we developed two different implementation strategies. To identify which strategies perform best, we ran extractions for two domains and analyzed the results. Based on this evaluation, we discuss the best approach to implementing the extraction method.
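
The two tasks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and reproduces neither of their strategies. It assumes that concepts are the articles reachable from a domain root category within a fixed depth, and that a non-specific relation holds between two concepts whenever one article links to the other. The toy data, the "Tourism" domain, and the names `extract_concepts`, `acquire_relations` and `max_depth` are all hypothetical.

```python
from collections import deque

# Hypothetical toy data standing in for Wikipedia's category and link structure.
SUBCATEGORIES = {
    "Tourism": ["Hotels", "Travel"],
    "Hotels": [],
    "Travel": ["Airlines"],
    "Airlines": [],
}
CATEGORY_ARTICLES = {
    "Tourism": ["Tourism"],
    "Hotels": ["Hotel", "Hostel"],
    "Travel": ["Travel agency"],
    "Airlines": ["Airline"],
}
ARTICLE_LINKS = {
    "Tourism": ["Hotel", "Travel agency", "Economy"],
    "Hotel": ["Hostel", "Tourism"],
    "Hostel": ["Hotel"],
    "Travel agency": ["Airline", "Hotel"],
    "Airline": ["Airport"],
}


def extract_concepts(root, max_depth=2):
    """Task 1 (assumed form): collect the articles of every category
    reachable from the root category within max_depth levels."""
    concepts = set()
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        category, depth = queue.popleft()
        concepts.update(CATEGORY_ARTICLES.get(category, []))
        if depth < max_depth:
            for sub in SUBCATEGORIES.get(category, []):
                if sub not in seen:  # guard against category cycles
                    seen.add(sub)
                    queue.append((sub, depth + 1))
    return concepts


def acquire_relations(concepts):
    """Task 2 (assumed form): relate two concepts, without naming the
    relation, when one article links to the other."""
    relations = set()
    for source in concepts:
        for target in ARTICLE_LINKS.get(source, []):
            if target in concepts and target != source:
                # frozenset makes the relation undirected and deduplicated
                relations.add(frozenset((source, target)))
    return relations


concepts = extract_concepts("Tourism")
relations = acquire_relations(concepts)
```

In this sketch, links to articles outside the extracted concept set (here "Economy" and "Airport") are discarded, so the resulting network stays within the domain. Lowering `max_depth` narrows the domain, which is one plausible way to trade coverage for precision.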


Author information

Authors and Affiliations

PPGCC – PUCRS, Av. Ipiranga, 6681 – Prédio 32, Porto Alegre, Brazil
Clarissa Castellã Xavier & Vera Lúcia Strube de Lima


Editor information

Editors and Affiliations

UFSCAR, Rod. Washington Luís, 13565-905, São Carlos, Brazil
Helena Caseli
UFRGS, Av. Bento Gonçalves, 9500, 91501-970, Porto Alegre, Brazil
Aline Villavicencio
DETI/IEETA, Universidade de Aveiro, Campus Universitário de Santiago, 3810-193, Aveiro, Portugal
António Teixeira
UC/ IT, DEEC, Universidade de Coimbra, Polo 2, 3030-290, Coimbra, Portugal
Fernando Perdigão

About this paper

Cite this paper

Xavier, C.C., de Lima, V.L.S. (2012). A Method for Automatically Extracting Domain Semantic Networks from Wikipedia. In: Caseli, H., Villavicencio, A., Teixeira, A., Perdigão, F. (eds) Computational Processing of the Portuguese Language. PROPOR 2012. Lecture Notes in Computer Science, vol 7243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28885-2_10

DOI: https://doi.org/10.1007/978-3-642-28885-2_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28884-5
Online ISBN: 978-3-642-28885-2
eBook Packages: Computer Science, Computer Science (R0)
