Using the Web to Validate Lexico-Semantic Relations

  • Conference paper
Progress in Artificial Intelligence (EPIA 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7026)

Abstract

The evaluation of semantic relations acquired automatically from text is a challenging task, which generally ends up being done by humans. Although less prone to errors, manual evaluation is hardly repeatable, time-consuming and sometimes subjective. In this paper, we evaluate relational triples automatically, exploiting popular similarity measures on the Web. After using these measures to score triples according to the co-occurrence of their arguments and of textual patterns denoting their relation, some scores proved to be highly correlated with the rate of correct triples. The measures were also used to select the correct triples in a set, with best F1 scores around 96%.
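As a rough illustration of the idea, the sketch below scores a triple with pointwise mutual information (PMI) computed from Web page counts, then selects triples above a threshold and measures the selection with F1. PMI is used here only as one widely known co-occurrence measure; the abstract does not say which measures the paper uses. The function names, the hit-count arguments and the threshold are hypothetical, and the sketch makes no call to any real search engine.

```python
import math


def pmi(hits_xy, hits_x, hits_y, total_pages):
    """PMI of two queries from page counts (all counts are assumed inputs).

    hits_xy: pages matching both queries, e.g. the two arguments of a triple
             joined by a textual pattern that denotes the relation.
    Returns -inf when there is no co-occurrence evidence at all.
    """
    if hits_xy == 0 or hits_x == 0 or hits_y == 0:
        return float("-inf")
    return math.log2((hits_xy * total_pages) / (hits_x * hits_y))


def select_triples(scores, threshold):
    """Keep the triples whose score reaches the (hypothetical) threshold."""
    return {triple for triple, score in scores.items() if score >= threshold}


def f1_score(selected, correct):
    """F1 of a selected set against a manually validated set of correct triples."""
    true_positives = len(selected & correct)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(selected)
    recall = true_positives / len(correct)
    return 2 * precision * recall / (precision + recall)
```

In such a setup, the threshold would be tuned on triples that humans have already judged, and the resulting F1 indicates how well the automatic score can stand in for manual evaluation.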

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Costa, H.P., Gonçalo Oliveira, H., Gomes, P. (2011). Using the Web to Validate Lexico-Semantic Relations. In: Antunes, L., Pinto, H.S. (eds) Progress in Artificial Intelligence. EPIA 2011. Lecture Notes in Computer Science, vol 7026. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24769-9_43

  • DOI: https://doi.org/10.1007/978-3-642-24769-9_43

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24768-2

  • Online ISBN: 978-3-642-24769-9

  • eBook Packages: Computer Science, Computer Science (R0)
