Online Reasoning for Semantic Error Detection in Text

  • Original Article
Journal on Data Semantics

Abstract

Identifying incorrect content (i.e., semantic errors) in text is difficult because written natural language is ambiguous and many factors can make a statement semantically erroneous. Current methods identify semantic errors in a sentence by determining whether it contradicts the domain to which the sentence belongs. However, because these methods are built on expected logical contradictions, they cannot handle new or unexpected semantic errors. In this paper, we propose a new method for detecting semantic errors that is based on logical reasoning. Our method converts text into logic clauses, which an automatic reasoner then checks for consistency against a domain ontology. This approach can provide a complete analysis of the text, since it can analyze a single sentence or a set of multiple sentences. When there are multiple sentences to analyze, reasoning over a large set of logic clauses is highly complex; to avoid this, we propose rules that use the logical relationships between sentences to reduce the set of sentences to analyze. In our evaluation, we found that the proposed method identifies a significant percentage of semantic errors and, in the case of multiple sentences, does so without significant computational cost. We also found that both the quality of the information extraction output and the modeling elements of the ontology (i.e., property domain and range) affect the ability to detect errors.
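
To make the role of property domain and range concrete, the following is a minimal Python sketch of a consistency check between extracted statements and a domain ontology. It is not the method or the reasoner used in the article: the `ToyOntology` class, the `find_errors` function, and the rabbit/grass facts are all hypothetical, and a hand-written type check stands in for a full automatic reasoner.

```python
from dataclasses import dataclass, field


@dataclass
class ToyOntology:
    """Hypothetical stand-in for a domain ontology (not an OWL model)."""
    # property name -> (domain class, range class)
    signatures: dict = field(default_factory=dict)
    # individual name -> asserted class
    types: dict = field(default_factory=dict)
    # unordered pairs of classes that cannot share members
    disjoint: set = field(default_factory=set)

    def are_disjoint(self, a, b):
        return frozenset((a, b)) in self.disjoint


def find_errors(ontology, triples):
    """Return triples whose subject or object type clashes with the
    domain or range of the property (a toy consistency check)."""
    errors = []
    for subj, prop, obj in triples:
        if prop not in ontology.signatures:
            continue  # unknown property: nothing to check against
        domain, rng = ontology.signatures[prop]
        subj_type = ontology.types.get(subj)
        obj_type = ontology.types.get(obj)
        subj_clash = subj_type is not None and ontology.are_disjoint(subj_type, domain)
        obj_clash = obj_type is not None and ontology.are_disjoint(obj_type, rng)
        if subj_clash or obj_clash:
            errors.append((subj, prop, obj))
    return errors


# Assumed toy domain: animals eat plants, and the two classes are disjoint.
ontology = ToyOntology(
    signatures={"eats": ("Animal", "Plant")},
    types={"rabbit": "Animal", "grass": "Plant"},
    disjoint={frozenset(("Animal", "Plant"))},
)

# Statements as an information extraction step might emit them (invented here).
triples = [("rabbit", "eats", "grass"), ("grass", "eats", "rabbit")]

errors = find_errors(ontology, triples)
print(errors)
```

In this sketch the second statement is flagged because its subject is typed as a class disjoint from the property's domain. If the ontology declared no domain or range for `eats`, the clash would go undetected, which is one way to read the abstract's remark that these modeling elements affect error detection.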

Acknowledgements

This research is partially supported by the National Science Foundation Grant IIS-1118050 and Grant IIS-1013054. This research is also partially supported by the Fondo Nacional De Ciencia Y Tecnologia (FONDECYT), Chile, Grant No. 3170971. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the NSF or FONDECYT.

Author information

Corresponding author

Correspondence to Dejing Dou.

About this article

Cite this article

Gutierrez, F., Dou, D., de Silva, N. et al. Online Reasoning for Semantic Error Detection in Text. J Data Semant 6, 139–153 (2017). https://doi.org/10.1007/s13740-017-0079-6

  • DOI: https://doi.org/10.1007/s13740-017-0079-6