From Cues to Categories: A Computational Study of Children’s Early Word Categorization

Abstract

Young children exhibit knowledge of abstract syntactic categories of words, such as noun and verb. A key research question concerns the type of information that children might use to form such categories. We use a computational model to provide insights into the (differential and cooperative) role of various information sources (namely, distributional, morphological, phonological, and semantic properties of words) in children’s early word categorization. Specifically, we use an unsupervised incremental clustering algorithm to learn categories of words from different combinations of these information sources, and determine the role of each type of cue by evaluating the quality of the resulting categories. We conduct two types of experiments. First, we compare the categories learned by our model to a set of gold-standard part-of-speech (PoS) tags, such as verb and noun. Second, we perform an experiment that simulates a language task similar to one performed by children, as reported in a psycholinguistic study by Brown (J Abnor Soc Psychol 55(1):1–5, 1957). Our results suggest that different categories of words may be recognized by relying on different types of cues. The results also indicate the importance of knowledge of word meanings for syntactic categorization, and vice versa: adding semantic information leads to categories that better match the gold-standard parts of speech, while our model (like children) can predict the semantic class of a word (e.g., action or object) by drawing on its learned knowledge of the word’s syntactic category.
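The general idea of incremental, threshold-based clustering evaluated against gold-standard tags can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's actual algorithm: the similarity measure, the feature names (`prev`, `suffix`, `sem`), the threshold value, and the toy data are all hypothetical, and missing feature values are simply skipped.

```python
from collections import Counter

MISSING = None  # hypothetical stand-in for a missing ("Null") feature value


def similarity(feats, cluster):
    """Average, over the word's non-missing features, of the share of the
    cluster's observed values for that feature that match the word's value.
    Features the cluster has never observed are skipped. (Assumed measure.)"""
    scores = []
    for name, value in feats.items():
        counts = cluster["counts"].get(name)
        if value is MISSING or not counts:
            continue
        scores.append(counts[value] / sum(counts.values()))
    return sum(scores) / len(scores) if scores else 0.0


def cluster_words(stream, threshold):
    """Process words one at a time: join the most similar existing cluster
    if its similarity reaches the threshold, otherwise open a new cluster."""
    clusters, assignments = [], []
    for word, feats in stream:
        best, best_sim = None, 0.0
        for idx, cluster in enumerate(clusters):
            sim = similarity(feats, cluster)
            if sim > best_sim:
                best, best_sim = idx, sim
        if best is None or best_sim < threshold:
            clusters.append({"words": [], "counts": {}})
            best = len(clusters) - 1
        target = clusters[best]
        target["words"].append(word)
        for name, value in feats.items():
            if value is not MISSING:
                target["counts"].setdefault(name, Counter())[value] += 1
        assignments.append(best)
    return clusters, assignments


def majority_tag_accuracy(assignments, gold_tags):
    """Label each cluster with its most frequent gold tag and return the
    fraction of words whose gold tag matches their cluster's label."""
    by_cluster = {}
    for idx, tag in zip(assignments, gold_tags):
        by_cluster.setdefault(idx, Counter())[tag] += 1
    correct = sum(c.most_common(1)[0][1] for c in by_cluster.values())
    return correct / len(gold_tags)


# Toy input: (word, features) pairs mixing distributional, morphological,
# and semantic cues. All values are invented for illustration.
toy_stream = [
    ("dog", {"prev": "the", "suffix": MISSING, "sem": "object"}),
    ("cat", {"prev": "the", "suffix": MISSING, "sem": "object"}),
    ("running", {"prev": "is", "suffix": "ing", "sem": "action"}),
    ("jumping", {"prev": "is", "suffix": "ing", "sem": "action"}),
    ("ball", {"prev": "a", "suffix": MISSING, "sem": "object"}),
]
toy_gold = ["N", "N", "V", "V", "N"]

clusters, assignments = cluster_words(toy_stream, threshold=0.5)
accuracy = majority_tag_accuracy(assignments, toy_gold)
```

In this sketch a higher threshold yields more, smaller clusters and a lower one yields fewer, larger clusters, so the threshold indirectly controls the number of categories. Dropping a key such as `sem` from the feature dictionaries is one way to compare cue combinations.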


Notes

  1. In earlier experiments, we also included the first phoneme (beginning) of a word, a feature also considered by Onnis and Christiansen [22]. In our initial evaluations, we found that including this feature did not affect performance, and hence excluded it from further consideration.

  2. The authors are grateful to Christopher Parisien for providing them with a preprocessed version of this corpus.

  3. The “Null” value is treated as a missing value for a feature.

  4. http://www.psych.rl.ac.uk/

  5. We have performed similar experiments with different ranges of cluster numbers, and found that the general patterns in the results are similar. In Appendix A, we report the results of experiments in which we set the number of clusters within the range 346–500 (< 500). In general, we prefer fewer clusters (fewer than our vocabulary size) to allow for generalization. We expect the generalization ability of the model with 247–288 (< 300) clusters to be reasonably good, since more than 55% of these clusters contain three or more word types in all conditions.

  6. In both the training and test data, less than 6% of the vocabulary consists of adjectives.

  7. Note that although the results show that using semantic features substantially improves the prediction accuracies for adjectives and determiners, this effect is due to the nature of the semantic features for these words (taken from Harm [14]) and should be interpreted with caution.

  8. Results of the novel word categorization experiment are reported in more detail in the Appendix.

References

  1. Alishahi, A., & Chrupała, G. (2009). Lexical category acquisition as an incremental process. In CogSci-2009 Workshop on Psychocomputational Models of Human Language Acquisition, Amsterdam.

  2. Asr, F. T., Fazly, A., & Azimifar, Z. (2010). The effect of word-internal properties on syntactic categorization: A computational modeling approach. In Proceedings of the 32nd Annual Conference of the Cognitive Science Society, Portland, USA.

  3. Berko, G. J. (1958). The child’s learning of English morphology. Word, 14, 150–177.

  4. Brown, R. (1957). Linguistic determinism and the part of speech. Journal of Abnormal and Social Psychology, 55(1), 1–5.

  5. Cartwright, T., & Brent, M. (1997). Syntactic categorization in early language acquisition: Formalizing the role of distributional analysis. Cognition, 63(2), 121–170.

  6. Chang, F., Lieven, E., & Tomasello, M. (2008). Automatic evaluation of syntactic learners in typologically-different languages. Cognitive Systems Research, 9(3), 198–213.

  7. Chrupała, G., & Alishahi, A. (2010). Online entropy-based model of lexical category acquisition. In Proceedings of 14th Conference on Computational Natural Language Learning (CoNLL) (pp. 182–191), Uppsala, Sweden.

  8. Clark, A. (2000). Inducing syntactic categories by context distribution clustering. In Proceedings of the 2nd Workshop on Learning Language in Logic and the 4th Conference on Computational Natural Language Learning (Vol. 7, pp. 91–94). Morristown: Association for Computational Linguistics.

  9. Fazly, A., Alishahi, A., & Stevenson, S. (2008). A probabilistic incremental model of word learning in the presence of referential uncertainty. In Proceedings of the 30th Annual Conference of the Cognitive Science Society, Washington, DC.

  10. Fellbaum, C. (1998). WordNet: An electronic lexical database. Cambridge: The MIT Press. ISBN 026206197X.

  11. Gelman, S., & Taylor, M. (1984). How two-year-old children interpret proper and common names for unfamiliar objects. Child Development, 55(4), 1535–1540.

  12. Gerken, L., Wilson, R., & Lewis, W. (2005). Infants can use distributional cues to form syntactic categories. Journal of Child Language, 32(02), 249–268.

  13. Goldwater, S., Griffiths, T. L., & Johnson, M. (2009). A Bayesian framework for word segmentation: Exploring the effects of context. Cognition, 112(1), 21–54.

  14. Harm, M. (2002). Building large scale distributed semantic feature sets with WordNet (Tech. Rep. No. PDP. CNS. 02.01). Carnegie Mellon University, Center for the Neural Basis of Cognition, Pittsburgh, PA.

  15. Kaplan, F., Oudeyer, P., & Bergen, B. (2008). Computational models in the debate over language learnability. Infant and Child Development, 17(1), 55–80.

  16. Kemp, N., Lieven, E., & Tomasello, M. (2005). Young children’s knowledge of the “determiner” and “adjective” categories. Journal of Speech, Language, and Hearing Research, 48(3), 592–602.

  17. Kipper-Schuler, K. (2005). VerbNet: A broad-coverage, comprehensive verb lexicon. Ph.D. thesis, University of Pennsylvania, Philadelphia.

  18. MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk, volume 2: The database (3rd ed.). Mahwah: Lawrence Erlbaum Associates.

  19. Mintz, T. (2003). Frequent frames as a cue for grammatical categories in child directed speech. Cognition, 90(1), 91–117.

  20. Monaghan, P., Christiansen, M., & Chater, N. (2007). The phonological-distributional coherence hypothesis: Cross-linguistic evidence in language acquisition. Cognitive Psychology, 55(4), 259–305.

  21. Naigles, L. (1990). Children use syntax to learn verb meanings. Journal of Child Language, 17, 357–374.

  22. Onnis, L., & Christiansen, M. (2008). Lexical categories at the edge of the word. Cognitive Science, 32(1), 184–221.

  23. Parisien, C., Fazly, A., & Stevenson, S. (2008). An incremental Bayesian model for learning syntactic categories. In Proceedings of the Twelfth Conference on Computational Natural Language Learning (pp. 89–96). New York: Association for Computational Linguistics.

  24. Pearl, L. (2009). Using computational modeling in language acquisition research. Experimental Methods in Language Acquisition Research, 163–184.

  25. Redington, M., Chater, N., & Finch, S. (1998). Distributional information: A powerful cue for acquiring syntactic categories. Cognitive Science, 22(4), 425–469.

  26. Samuelson, L., & Smith, L. (1999). Early noun vocabularies: Do ontology, category structure and syntax correspond? Cognition, 73(1), 1–33.

  27. Schütze, H. (1995). Distributional part-of-speech tagging. In Proceedings of the Seventh Conference on European Chapter of the Association for Computational Linguistics (pp. 141–148). San Francisco: Morgan Kaufmann Publishers Inc.

  28. Theakston, A. L., Lieven, E. V., Pine, J. M., & Rowland, C. F. (2001). The role of performance limitations in the acquisition of verb–argument structure: An alternative account. Journal of Child Language, 28, 127–152.

  29. Wilson, M. (1988). MRC psycholinguistic database: Machine-usable dictionary, version 2.00. Behavior Research Methods, 20(1), 6–10.

Author information

Corresponding author

Correspondence to Fatemeh Torabi Asr.

Appendix

Table 3 Accuracy (%) of novel word categorization in five conditions (similarity threshold set for < 300 clusters)
Table 4 Accuracy (%) of novel word categorization in five conditions (similarity threshold set for < 500 clusters)

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Asr, F.T., Fazly, A., Azimifar, Z. (2012). From Cues to Categories: A Computational Study of Children’s Early Word Categorization. In: Villavicencio, A., Poibeau, T., Korhonen, A., Alishahi, A. (eds) Cognitive Aspects of Computational Language Acquisition. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31863-4_4

  • DOI: https://doi.org/10.1007/978-3-642-31863-4_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31862-7

  • Online ISBN: 978-3-642-31863-4

  • eBook Packages: Computer Science, Computer Science (R0)
