Published online by Cambridge University Press: 27 October 2020
This study systematically reviews existing approaches to unsupervised grammar induction in terms of their theoretical underpinnings, practical implementations and evaluation. Our motivation is to identify the influence of functional-cognitive schools of grammar on language processing models in computational linguistics. This is an effort to fill any gap between the theoretical school and the computational processing models of grammar induction. Specifically, the review aims to answer the following research questions: Which types of grammar theories have been the subjects of grammar induction? Which methods have been employed to support grammar induction? Which features have been used by these methods for learning? How were these methods evaluated? Finally, in terms of performance, how do these methods compare to one another? Forty-three studies were identified for systematic review, of which 33 described original implementations of grammar induction, three provided surveys and seven focused on theories and experiments related to the acquisition and processing of grammar in humans. The data extracted from the 33 implementations were stratified into seven aspects of analysis: theory of grammar; output representation; how grammatical productivity is processed; how grammatical productivity is represented; features used for learning; evaluation strategy; and implementation methodology. In most of the implementations considered, grammar was treated as a generative-formal system, autonomous and independent of meaning. Parser decoding was done in a non-incremental, head-driven fashion, assuming that all words are available to the parsing model, and the output representation of the learnt grammar was hierarchical, typically a dependency or constituency tree.
However, the theoretical and experimental studies considered suggest that a usage-based, incremental, sequential system of grammar is more appropriate than the formal, non-incremental, hierarchical view of grammar. This gap between the theoretical and experimental studies on the one hand and the computational implementations on the other should be addressed to enable further progress in computational grammar induction research.