Conditions for Cognitive Plausibility of Computational Models of Category Induction

Hromada, Daniel Devatman

doi:10.1007/978-3-319-08855-6_11

Part of the book series: Communications in Computer and Information Science (CCIS, volume 443)

Included in the following conference series:

International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems

Abstract

We present two axiomatic and three conjectural conditions that a model inducing natural language categories should satisfy if it is to be considered “cognitively plausible”. The first axiomatic condition is that the model should involve a bootstrapping component. The second axiomatic condition is that it should be data-driven. The first conjectural condition demands that the model integrate surface features (related to prosody, phonology and morphology) somewhat more intensively than existing Markov-inspired models do. The second conjectural condition demands that, aside from integrating symbolic and connectionist aspects, the model in question should exploit the global geometric and topological properties of the vector spaces upon which it operates. Finally, we argue that the model should facilitate qualitative evaluation, for example in the form of a POS-i oriented Turing Test. To support our claims, we present a POS-induction model based on trivial k-way clustering of vectors representing suffixal and co-occurrence information present in parts of the Multext-East corpus. Even at a very early stage of its development, the model outperforms some more complex probabilistic POS-induction models at a lower computational cost.
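
The idea of clustering vectors that combine suffixal and co-occurrence information can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not specify the feature set, the vector construction, or the clustering toolkit. The sketch assumes a two-character suffix feature, immediate left and right neighbours as co-occurrence features, and plain k-means with cosine similarity. The names `build_vectors`, `normalize`, `cosine` and `kmeans`, the `suffix_len` parameter, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_vectors(sentences, suffix_len=2):
    """Return {word: Counter of suffix and neighbour features} for tokenised sentences."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        padded = ["<s>"] + list(sentence) + ["</s>"]
        for i in range(1, len(padded) - 1):
            word = padded[i]
            vectors[word]["suf=" + word[-suffix_len:]] += 1   # suffixal information
            vectors[word]["left=" + padded[i - 1]] += 1       # co-occurrence information
            vectors[word]["right=" + padded[i + 1]] += 1
    return dict(vectors)


def normalize(vec):
    """Scale a sparse vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {f: v / norm for f, v in vec.items()} if norm else dict(vec)


def cosine(a, b):
    """Dot product of two sparse unit vectors, i.e. their cosine similarity."""
    return sum(v * b.get(f, 0.0) for f, v in a.items())


def kmeans(vectors, k, iterations=20):
    """Partition word types into k clusters; returns {word: cluster index}."""
    words = sorted(vectors)
    if not 1 <= k <= len(words):
        raise ValueError("k must be between 1 and the number of word types")
    unit = {w: normalize(vectors[w]) for w in words}
    # Assumption: seed with the first k word types so the run is deterministic.
    centroids = [unit[w] for w in words[:k]]
    assignment = {}
    for _ in range(iterations):
        updated = {}
        for w in words:
            scores = [cosine(unit[w], c) for c in centroids]
            updated[w] = scores.index(max(scores))
        if updated == assignment:  # converged
            break
        assignment = updated
        for c in range(k):
            total = defaultdict(float)
            for w in words:
                if assignment[w] == c:
                    for f, v in unit[w].items():
                        total[f] += v
            if total:  # keep the previous centroid if a cluster is empty
                centroids[c] = normalize(total)
    return assignment


if __name__ == "__main__":
    # Toy corpus, invented for illustration only.
    corpus = [
        ["dogs", "jumped", "high"],
        ["dogs", "bumped", "high"],
        ["the", "cat", "sleeps"],
    ]
    print(kmeans(build_vectors(corpus), k=3))
```

In this sketch, word types that share a suffix and appear in the same contexts receive identical vectors and therefore always fall into the same cluster. A realistic experiment would need a real corpus, a principled choice of k, and a better seeding strategy.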

Author information

Authors and Affiliations

Fakulta Elektrotechniky a Informatiky, Slovenská Technická Univerzita, Ilkovičova 3, 812 19, Bratislava 1, Slovakia
Daniel Devatman Hromada

Editor information

Editors and Affiliations

University Montpellier 2, LIRMM - CNRS UMR 5506, 161, Rue Ada, 34392, Montpellier Cedex 5, France
Anne Laurent
LIRMM, UMR CNRS/Universite Montpellier II, 161 rue Ada, 34392, Montpellier cedex 5, France
Olivier Strauss
UPMC Univ. Paris 06, CNRS UMR 7606, LIP6, F-75005, Paris, France
Bernadette Bouchon-Meunier
Dept. of Information Systems, Iona College, 710 North Ave, 10801, New Rochelle, NY, USA
Ronald R. Yager

Cite this paper

Hromada, D.D. (2014). Conditions for Cognitive Plausibility of Computational Models of Category Induction. In: Laurent, A., Strauss, O., Bouchon-Meunier, B., Yager, R.R. (eds) Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2014. Communications in Computer and Information Science, vol 443. Springer, Cham. https://doi.org/10.1007/978-3-319-08855-6_11

DOI: https://doi.org/10.1007/978-3-319-08855-6_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08854-9
Online ISBN: 978-3-319-08855-6
eBook Packages: Computer Science, Computer Science (R0)
