Finding Instance Names and Alternative Glosses on the Web: WordNet Reloaded

Paşca, Marius

doi:10.1007/978-3-540-30586-6_31

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3406)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

This paper presents an approach to extending existing lexical resources with instance names and alternative definitions acquired from textual documents. The experiments involve WordNet and approximately 300 million Web documents, but the method is more generally applicable. We leverage formally-structured, human-validated resources, on one hand, and data-driven instance names and definitions on the other, which opens the path to new applications of the reloaded resources.

Author information

Authors and Affiliations

Google Inc., 1600 Amphitheatre Parkway, Mountain View, California, 94043
Marius Paşca

Editor information

Editors and Affiliations

National Polytechnic Institute, Center for Computing Research, 07738, Mexico City, México
Alexander Gelbukh

About this paper

Cite this paper

Paşca, M. (2005). Finding Instance Names and Alternative Glosses on the Web: WordNet Reloaded. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2005. Lecture Notes in Computer Science, vol 3406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30586-6_31

DOI: https://doi.org/10.1007/978-3-540-30586-6_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24523-0
Online ISBN: 978-3-540-30586-6
eBook Packages: Computer Science, Computer Science (R0)
