Programmatic Link Grammar Induction for Unsupervised Language Learning

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11654)

Abstract

Although natural (i.e. human) languages do not seem to follow a strictly formal grammar, a formal grammar can approximate the analysis and generation of their structure. Such a grammar is an important tool for programmatic language understanding. Because natural languages and their variations are so numerous, processing tools that rely on human intervention are available only for the most popular ones. We explore the problem of inducing a formal grammar for any language without supervision, using the Link Grammar paradigm, from unannotated parses that are themselves obtained without supervision from an input corpus. We describe our state-of-the-art grammar induction technology and its evaluation techniques, along with preliminary results of applying it to both synthetic and real-world text corpora.
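
To make the idea of inducing a Link Grammar lexicon from unannotated parses concrete, the following is a minimal sketch and not the pipeline described in the paper. It assumes each parse is given as a token list plus a set of undirected, unlabeled links between token positions. The function names (extract_disjuncts, induce_lexicon), the use of neighbouring words as connector names, and the toy corpus are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def extract_disjuncts(tokens, links):
    """Return one (word, disjunct) pair per token position.

    A disjunct here is a tuple of connectors: "w-" for a link to a word w
    on the left, "w+" for a link to a word w on the right. Using the linked
    word itself as the connector name is an assumption of this sketch.
    """
    left = defaultdict(list)
    right = defaultdict(list)
    for i, j in links:
        lo, hi = min(i, j), max(i, j)
        right[lo].append(hi)  # the earlier token links rightward
        left[hi].append(lo)   # the later token links leftward
    result = []
    for pos, word in enumerate(tokens):
        # Connector order (nearest neighbour first) is a convention chosen here.
        lefts = tuple(tokens[k] + "-" for k in sorted(left[pos], reverse=True))
        rights = tuple(tokens[k] + "+" for k in sorted(right[pos]))
        result.append((word, lefts + rights))
    return result


def induce_lexicon(parses):
    """Collect, for every word, the set of disjuncts observed across all parses."""
    lexicon = defaultdict(set)
    for tokens, links in parses:
        for word, disjunct in extract_disjuncts(tokens, links):
            lexicon[word].add(disjunct)
    return lexicon


# Toy unannotated parses: (tokens, links between token positions).
parses = [
    (["the", "cat", "sleeps"], [(0, 1), (1, 2)]),
    (["the", "dog", "sleeps"], [(0, 1), (1, 2)]),
]
lexicon = induce_lexicon(parses)
for word in sorted(lexicon):
    print(word, sorted(lexicon[word]))
```

In this toy corpus "cat" and "dog" end up with identical disjunct sets, which hints at how words might later be grouped into grammatical categories; any such grouping step is outside the scope of this sketch.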

Acknowledgements

We appreciate contributions by Linas Vepstas, including insightful discussions and critique of our research. We thank Amir Plivatsky for valuable feedback and for maintaining and incrementally improving the LG parser technology used in our work.

Author information

Corresponding author

Correspondence to Anton Kolonin.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Glushchenko, A., Suarez, A., Kolonin, A., Goertzel, B., Baskov, O. (2019). Programmatic Link Grammar Induction for Unsupervised Language Learning. In: Hammer, P., Agrawal, P., Goertzel, B., Iklé, M. (eds) Artificial General Intelligence. AGI 2019. Lecture Notes in Computer Science, vol 11654. Springer, Cham. https://doi.org/10.1007/978-3-030-27005-6_11

  • DOI: https://doi.org/10.1007/978-3-030-27005-6_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-27004-9

  • Online ISBN: 978-3-030-27005-6

  • eBook Packages: Computer Science, Computer Science (R0)
