Merging Structural and Taxonomic Similarity for Text Retrieval Using Relational Descriptions

  • Conference paper
Digital Libraries (IRCDL 2010)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 91)

Abstract

Information retrieval effectiveness has become a crucial issue with the enormous growth of available digital documents and the spread of Digital Libraries. Search and retrieval are mostly carried out on the textual content of documents, and traditionally only at the lexical level. However, pure term-based queries are very limited, because most of the information in natural language is carried by the syntactic and logical structure of sentences. To take such structure into account, powerful relational languages, such as first-order logic, must be exploited. However, the constituents of logic formulæ are typically uninterpreted (they are treated as purely syntactic entities), whereas words in natural language express underlying concepts that involve several implicit relationships, such as those expressed in a taxonomy. This problem can be tackled by providing the logic interpreter with suitable taxonomic knowledge.

This work proposes the exploitation of a similarity framework that includes both structural and taxonomic features to assess the similarity between First-Order Logic (Horn clause) descriptions of texts in natural language, in order to support more sophisticated information retrieval approaches than simple term-based queries. Evaluation on a sample case shows the viability of the solution, although further work is still needed to study the framework more deeply and to further refine it.
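
As a rough illustration of why combining the two kinds of evidence can help, the following minimal sketch scores relational descriptions using both literal structure and a taxonomy. Everything in it is an assumption made for illustration: the toy `TAXONOMY`, the function names, the use of constants instead of variables as arguments, and the Jaccard overlap of ancestor sets, which stands in for the similarity formulæ of the actual framework and is not taken from the paper.

```python
# Hypothetical toy taxonomy (child -> parent); not taken from the paper.
TAXONOMY = {
    "dog": "canine", "wolf": "canine", "canine": "animal",
    "cat": "feline", "feline": "animal", "animal": "entity",
    "bone": "object", "object": "entity",
}


def ancestors(term):
    """Return the term together with all of its ancestors in TAXONOMY."""
    chain = {term}
    while term in TAXONOMY:
        term = TAXONOMY[term]
        chain.add(term)
    return chain


def taxonomic_sim(a, b):
    """Jaccard overlap of the two ancestor sets (a stand-in measure)."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def literal_sim(lit_a, lit_b):
    """Compare literals given as (predicate, arg, ...) tuples.

    Structural part: predicate name and arity must match, else 0.
    Taxonomic part: average similarity of the aligned arguments.
    """
    if lit_a[0] != lit_b[0] or len(lit_a) != len(lit_b):
        return 0.0
    pairs = list(zip(lit_a[1:], lit_b[1:]))
    if not pairs:  # zero-arity literals with the same predicate
        return 1.0
    return sum(taxonomic_sim(x, y) for x, y in pairs) / len(pairs)


def description_sim(desc_a, desc_b):
    """Average best-match score of every literal against the other description."""
    best_a = [max(literal_sim(a, b) for b in desc_b) for a in desc_a]
    best_b = [max(literal_sim(a, b) for a in desc_a) for b in desc_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


# "A dog eats a bone" as a query, against two candidate sentences.
query = [("eat", "dog", "bone")]
doc_wolf = [("eat", "wolf", "bone")]
doc_cat = [("eat", "cat", "bone")]

score_wolf = description_sim(query, doc_wolf)
score_cat = description_sim(query, doc_cat)
print(score_wolf, score_cat)
```

A purely lexical comparison would treat both candidates alike, since each differs from the query in exactly one term. With the toy taxonomy, the description mentioning `wolf` scores higher than the one mentioning `cat`, because `dog` and `wolf` share the closer ancestor `canine`.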

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ferilli, S., Biba, M., Di Mauro, N., Basile, T.M.A., Esposito, F. (2010). Merging Structural and Taxonomic Similarity for Text Retrieval Using Relational Descriptions. In: Agosti, M., Esposito, F., Thanos, C. (eds) Digital Libraries. IRCDL 2010. Communications in Computer and Information Science, vol 91. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15850-6_15

  • DOI: https://doi.org/10.1007/978-3-642-15850-6_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15849-0

  • Online ISBN: 978-3-642-15850-6

  • eBook Packages: Computer Science, Computer Science (R0)
