Pragmatic Information Extraction in Brazilian Portuguese Documents

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11122)


Abstract

The volume of data published on the Web keeps increasing, and a large share of it is available as natural language text. Manually analyzing each document is time-consuming and tedious. Open Information Extraction (Open IE) addresses this problem by extracting semantic relationships from large collections of natural language texts across different domains. Semantic analysis alone does not guarantee complete accuracy in extracting relations, so pragmatic analysis becomes important in Open IE to identify additional, unstated meanings that go beyond the semantics of a text. We developed an Open IE method that extracts relations from texts written in Portuguese at a first pragmatic level, which we define as the level that deals with inferential, contextual, and intentional aspects. We evaluated our approach, and our results outperform the most relevant related work on the accuracy and minimality measures.

Notes

  1. Available: https://pt.wikipedia.org/. Accessed: 08/05/2018.

  2. Available: http://www.linguateca.pt/cetenfolha/. Accessed: 08/05/2018.

Acknowledgement

The authors would like to thank FAPESB BOL3288/2015 for financial support.

Author information

Corresponding author

Correspondence to Daniela Barreiro Claro.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Sena, C.F.L., Claro, D.B. (2018). Pragmatic Information Extraction in Brazilian Portuguese Documents. In: Villavicencio, A., et al. Computational Processing of the Portuguese Language. PROPOR 2018. Lecture Notes in Computer Science (LNAI), vol 11122. Springer, Cham. https://doi.org/10.1007/978-3-319-99722-3_5

  • DOI: https://doi.org/10.1007/978-3-319-99722-3_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99721-6

  • Online ISBN: 978-3-319-99722-3

  • eBook Packages: Computer Science, Computer Science (R0)
