Information Extraction from Concise Passages of Natural Language Sources

Pohorec, Sandi; Verlič, Mateja; Zorman, Milan

doi:10.1007/978-3-642-15576-5_35

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6295)

Included in the following conference series:

East European Conference on Advances in Databases and Information Systems

Abstract

This paper presents a semi-automated approach to information extraction for ontology construction. The sources are short news extracts syndicated online, chosen because their short passages convey information concisely and precisely. The brevity of each passage significantly reduces the problems of word sense disambiguation. The main goal of the knowledge extraction is semi-automated ontology construction.
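
As a rough illustration of the kind of input the abstract describes, the sketch below pulls short passages out of a syndicated news feed. It assumes the feed is an RSS 2.0 document, which the abstract does not state; the sample feed, the function name `extract_passages`, and the choice of the `title` and `description` fields are hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed with two short news extracts (illustrative data only).
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example news</title>
    <item>
      <title>Bank raises interest rates</title>
      <description>The central bank raised its key rate on Monday.</description>
    </item>
    <item>
      <title>River bank floods</title>
      <description>Heavy rain flooded the river bank near the town.</description>
    </item>
  </channel>
</rss>"""


def extract_passages(feed_xml):
    """Return a (title, description) pair for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    passages = []
    for item in root.iter("item"):
        # findtext returns None when the child element is missing.
        title = (item.findtext("title") or "").strip()
        description = (item.findtext("description") or "").strip()
        passages.append((title, description))
    return passages


if __name__ == "__main__":
    for title, description in extract_passages(SAMPLE_FEED):
        print(title, "|", description)
```

Each pair is a one- or two-sentence passage, which is the unit a later extraction step would work on. In the sample, the surrounding words ("interest rates", "river") are close enough to "bank" to indicate its sense, which is the effect of passage brevity that the abstract points to.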

Author information

Authors and Affiliations

Faculty of Electrical Engineering and Computer Science, University of Maribor, Smetanova ulica 17, 2000, Maribor, Slovenia
Sandi Pohorec, Mateja Verlič & Milan Zorman

Editor information

Editors and Affiliations

Department of Computer and Information Science, University of Genova, Via Dodecaneso, 35, 16146, Genova, Italy
Barbara Catania
Faculty of Science, Department of Mathematics and Informatics, University of Novi Sad, Trg Dositeja Obradovica 4, 21000, Novi Sad, Serbia
Mirjana Ivanović
Institute of Computer Science and Applied Mathematics, Christian-Albrechts-University of Kiel, Olshausenstr. 40, 24098, Kiel, Germany
Bernhard Thalheim

Cite this paper

Pohorec, S., Verlič, M., Zorman, M. (2010). Information Extraction from Concise Passages of Natural Language Sources. In: Catania, B., Ivanović, M., Thalheim, B. (eds) Advances in Databases and Information Systems. ADBIS 2010. Lecture Notes in Computer Science, vol 6295. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15576-5_35

DOI: https://doi.org/10.1007/978-3-642-15576-5_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15575-8
Online ISBN: 978-3-642-15576-5
eBook Packages: Computer Science, Computer Science (R0)
