
From Open Information Extraction to Semantic Web: A Context Rule-Based Strategy

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11308)

Abstract

The Web is a valuable source of information, most of which is presented as unstructured text. Extracting structured information from such sources is an important challenge for the Semantic Web and Information Extraction areas, where elements representing real-world objects (named entities) and their relations must be extracted from text and formally represented as RDF triples. Extracting this information manually is infeasible because of the Web's scale and the heterogeneity of its domains. Open Information Extraction (OIE) is a domain-independent task that relies on patterns to extract any kind of relation between named entities. A natural next step is to transform such relations into RDF triples. This paper proposes a method to represent relations obtained by an OIE approach as RDF triples. The method extracts the named entities, their relation, and contextual information from an input sentence, and applies a set of defined rules that map the extracted elements to resources from a Knowledge Base of the Semantic Web. The evaluation shows promising results for both the extraction and the representation of information.
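
The mapping from an OIE relation to an RDF triple can be illustrated with a minimal sketch. It is not the authors' implementation: the rule table `CONTEXT_RULES`, the helper names `entity_uri` and `to_ntriple`, and the naive entity linking (replacing spaces with underscores) are all assumptions made for illustration. The only elements taken from the page are the DBpedia resource namespace and the idea of rules that map a relation phrase to a Knowledge Base property.

```python
from typing import Optional

# DBpedia resource namespace (listed in the paper's notes) and the
# DBpedia ontology namespace.
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"

# Hypothetical context rules: a relation phrase found between two named
# entities is mapped to a Knowledge Base property. The paper's actual
# rules are not shown on this page.
CONTEXT_RULES = {
    "was born in": DBO + "birthPlace",
    "works for": DBO + "employer",
}


def entity_uri(mention: str) -> str:
    """Naive stand-in for entity linking: build a DBpedia-style URI."""
    return DBR + mention.strip().replace(" ", "_")


def to_ntriple(subject: str, relation: str, obj: str) -> Optional[str]:
    """Map a binary OIE relation to one N-Triples line, or None if no rule fires."""
    prop = CONTEXT_RULES.get(relation.strip().lower())
    if prop is None:
        return None
    return f"<{entity_uri(subject)}> <{prop}> <{entity_uri(obj)}> ."


if __name__ == "__main__":
    print(to_ntriple("Alan Turing", "was born in", "London"))
```

A relation phrase with no matching rule yields `None` instead of a guessed property, which keeps unmapped extractions out of the resulting RDF.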


Notes

  1. A named entity (NE) mainly refers to the name of a person, company, or geographical place. However, the types of NEs vary according to the domain.

  2. In this work, the term context refers to the words between two NEs in a sentence.

  3. In this work we focus on binary relations, but the OIE output could also be represented as n-ary relations.

  4. We use URI prefixes (namespaces) in accordance with the service hosted at http://prefix.cc.

  5. https://premon.fbk.eu/.

  6. https://stanfordnlp.github.io/CoreNLP/index.html.

  7. https://www.dbpedia-spotlight.org/.

  8. http://www.cs.cmu.edu/~ark/SEMAFOR/.

  9. https://framenet.icsi.berkeley.edu/fndrupal/.

  10. http://dbpedia.org/resource/.

  11. This explanation is used for the next tables.
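
Notes 2 and 4 can be made concrete with a short sketch. It assumes whitespace tokenization and entity spans given as token indices, neither of which is specified on this page. The function names `context_between` and `expand` and the two-entry `PREFIXES` table are illustrative, not part of the paper.

```python
from typing import Dict, List, Tuple

# Illustrative prefix table in the style of prefix.cc. Only these two
# namespaces are assumed here.
PREFIXES: Dict[str, str] = {
    "dbr": "http://dbpedia.org/resource/",
    "dbo": "http://dbpedia.org/ontology/",
}

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def context_between(tokens: List[str], first: Span, second: Span) -> List[str]:
    """Return the words lying between two named-entity spans (note 2)."""
    left, right = sorted([first, second])
    return tokens[left[1]:right[0]]


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'dbr:London' into a full URI (note 4)."""
    prefix, _, local = curie.partition(":")
    return PREFIXES[prefix] + local


if __name__ == "__main__":
    tokens = "Alan Turing was born in London".split()
    print(context_between(tokens, (0, 2), (5, 6)))
    print(expand("dbr:London"))
```

Sorting the two spans makes the context independent of the order in which the entities are passed.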


Author information


Correspondence to Julio Hernandez.


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Hernandez, J., Lopez-Arevalo, I., Martinez-Rodriguez, J.L., Aldana-Bobadilla, E. (2018). From Open Information Extraction to Semantic Web: A Context Rule-Based Strategy. In: Groza, A., Prasath, R. (eds) Mining Intelligence and Knowledge Exploration. MIKE 2018. Lecture Notes in Computer Science, vol 11308. Springer, Cham. https://doi.org/10.1007/978-3-030-05918-7_4


  • DOI: https://doi.org/10.1007/978-3-030-05918-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05917-0

  • Online ISBN: 978-3-030-05918-7

  • eBook Packages: Computer Science, Computer Science (R0)
