Semi-supervised Relation Extraction from Monolingual Dictionary for Russian WordNet

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10761)

Abstract

Monolingual dictionaries are a voluminous, loosely structured source of lexical and ontological information. Numerous attempts have been made to extract WordNet or ontology relations from monolingual dictionaries, with varying success. Most such attempts are based on morphosyntactic rules. The difficulty of the information extraction task depends greatly on the discipline of the dictionary editors. Although this discipline is frequently excellent for the human reader, it is rarely strict enough to allow effortless data mining on dictionaries.

Here an improvement to the rule-based approach to relation extraction is put forward. The improved approach is to automatically cluster similar definitions and then manually create one or two relation extraction rules per cluster. This helps to reduce the amount of annotator work, to increase the quality of rule application, and to pay more attention to some rare cases. To group definitions with similar structure, mixed n-gram features were employed; their usefulness is discussed.
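The idea of mixed n-gram features can be illustrated with a minimal Python sketch. The abstract does not define the features precisely, so the sketch assumes that each position of an n-gram holds either the word itself or its part-of-speech tag. The function names `mixed_ngrams` and `jaccard`, the English glosses, and the tags are hypothetical and serve only as an illustration; they are not taken from the dictionary or from the paper's code.

```python
from itertools import product


def mixed_ngrams(tokens, n=2):
    """Return the set of mixed n-grams for a list of (word, pos) pairs.

    Assumption: every position of an n-gram is filled with either the
    word or its POS tag, so each window of n tokens yields 2**n features.
    """
    features = set()
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        # product picks one view (word or tag) for every token in the window
        for choice in product(*window):
            features.add(" ".join(choice))
    return features


def jaccard(a, b):
    """Jaccard similarity of two feature sets (1.0 for two empty sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical tagged definitions (English glosses for readability).
hawk = [("small", "ADJ"), ("bird", "NOUN"), ("of", "ADP"), ("prey", "NOUN")]
pike = [("large", "ADJ"), ("fish", "NOUN"), ("of", "ADP"), ("rivers", "NOUN")]
run = [("to", "PART"), ("move", "VERB"), ("quickly", "ADV")]

f_hawk, f_pike, f_run = (mixed_ngrams(d) for d in (hawk, pike, run))

# The two noun definitions share no word bigram, but they do share
# mixed features such as "ADJ NOUN" and "NOUN of".
print(sorted(f_hawk & f_pike))
print(jaccard(f_hawk, f_pike), jaccard(f_hawk, f_run))
```

Under these assumptions, definitions with the same syntactic skeleton overlap even when they have no words in common, which is the property a clustering step needs in order to place them in one group. Any standard clustering algorithm could consume such feature sets; the choice of algorithm is left open here.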

The work is performed on the Big Explanatory Dictionary of the Russian Language. Definitions are grouped into 100 clusters, annotated, and assessed for correctness. The average accuracy is 86% for hypernym extraction, which is high for works of the same scope.

Notes

  1. Available at https://bitbucket.org/dendik/russian-wordnet-rules, http://www.cicling.org/2016/data/311.

Author information

Corresponding author

Correspondence to Daniil Alexeyevsky.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Alexeyevsky, D. (2018). Semi-supervised Relation Extraction from Monolingual Dictionary for Russian WordNet. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2017. Lecture Notes in Computer Science, vol 10761. Springer, Cham. https://doi.org/10.1007/978-3-319-77113-7_38

  • DOI: https://doi.org/10.1007/978-3-319-77113-7_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77112-0

  • Online ISBN: 978-3-319-77113-7

  • eBook Packages: Computer Science, Computer Science (R0)
