Using Extra Output Learning to Insert a Symbolic Theory into a Connectionist Network

Minds and Machines

Abstract

This paper examines whether a classical model could be translated into a PDP network using a standard connectionist training technique called extra output learning. In Study 1, standard machine learning techniques were used to create a decision tree that could be used to classify 8124 different mushrooms as being edible or poisonous on the basis of 21 different features (Schlimmer, 1987). In Study 2, extra output learning was used to insert this decision tree into a PDP network being trained on the identical problem. An interpretation of the trained network revealed a perfect mapping from its internal structure to the decision tree, representing a precise translation of the classical theory to the connectionist model. In Study 3, a second network was trained on the mushroom problem without using extra output learning. An interpretation of this second network revealed a different algorithm for solving the mushroom problem, demonstrating that the Study 2 network was indeed a proper theory translation.
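
As a rough illustration of the idea behind extra output learning, the sketch below builds a training target that pairs the edible/poisonous label with additional output units indicating which decision-tree leaf classifies the example. This is only one plausible reading of the technique. The rule set `TOY_LEAVES`, the feature names `odor` and `spore_color`, and the helper functions are hypothetical and are not taken from the paper's actual tree or network.

```python
# Hypothetical toy rules standing in for decision-tree leaves; the paper's
# actual tree and mushroom features are NOT reproduced here.
TOY_LEAVES = [
    # (leaf name, test on a feature dict, class label: 1 = poisonous, 0 = edible)
    ("foul_odor", lambda m: m["odor"] == "foul", 1),
    ("green_spores",
     lambda m: m["odor"] != "foul" and m["spore_color"] == "green", 1),
    ("default_edible",
     lambda m: m["odor"] != "foul" and m["spore_color"] != "green", 0),
]


def leaf_index(mushroom):
    """Return the index of the first toy leaf whose test the example satisfies."""
    for i, (_name, test, _label) in enumerate(TOY_LEAVES):
        if test(mushroom):
            return i
    raise ValueError("no leaf matched")


def extra_output_target(mushroom):
    """Build a target vector: [class label] followed by a one-hot leaf code.

    The first unit is the ordinary classification output; the remaining
    units are the 'extra outputs' that expose the symbolic rule to the
    network during training (an assumed encoding, for illustration only).
    """
    i = leaf_index(mushroom)
    label = TOY_LEAVES[i][2]
    one_hot = [1 if j == i else 0 for j in range(len(TOY_LEAVES))]
    return [label] + one_hot


example = {"odor": "none", "spore_color": "green"}
print(extra_output_target(example))  # [1, 0, 1, 0]
```

A network trained against such targets is pushed to compute the intermediate rule as well as the final class, which is how the extra outputs can act as hints. A network trained without them, as in Study 3, would be trained on the first element alone.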

References

  • Abu-Mostafa, Y. S. (1990), 'Learning from hints in neural networks', Journal of Complexity 6, pp. 192-198.

  • Aldenderfer, M. S. and Blashfield, R. K. (1984), Cluster Analysis. (Vol. 07-044), Beverly Hills, CA: Sage Publications.

  • Andrews, R., Diederich, J. and Tickle, A. B. (1995), 'A survey and critique of techniques for extracting rules from trained artificial neural networks', Knowledge-Based Systems, 8, pp. 373-389.

  • Bechtel, W., and Abrahamsen, A. (1991), Connectionism and the Mind, Cambridge, MA: Basil Blackwell.

  • Berkeley, I. S. N., Dawson, M. R. W., Medler, D. A., Schopflocher, D. P. and Hornsby, L. (1995), 'Density plots of hidden value unit activations reveal interpretable bands', Connection Science, pp. 167-186.

  • Born, R. (1987), Artificial intelligence: The case against, London: Croom Helm.

  • Broadbent, D. (1985), 'A question of levels: Comment on McClelland and Rumelhart', Journal of Experimental Psychology: General 114, pp. 189-192.

  • Caruana, R. and de Sa, V. R. (1997), 'Promoting poor features to supervisors: Some inputs work better as outputs', in M. C. Mozer, M. I. Jordan and T. Petsche (eds.), Advances in Neural Information Processing Systems 9, Cambridge, MA: MIT Press.

  • Churchland, P. M. (1985), 'Reduction, qualia, and the direct introspection of brain states', The Journal of Philosophy LXXXII, pp. 8-28.

  • Churchland, P. M. (1988), Matter and consciousness. Revised edition, Cambridge, MA: MIT Press.

  • Churchland, P. M. (1995), The engine of reason, the seat of the soul, Cambridge, MA: MIT Press.

  • Churchland, P. S., Koch, C. and Sejnowski, T. J. (1990), 'What is computational neuroscience?', in E. L. Schwartz (ed.), Computational Neuroscience, Cambridge, MA: MIT Press.

  • Churchland, P. S. and Sejnowski, T. J. (1989), 'Neural representation and neural computation', in L. Nadel, L. A. Cooper, P. Culicover, and R. M. Harnish (eds.), Neural Connections, Mental Computation, Cambridge, MA: MIT Press, pp. 15-48.

  • Churchland, P. S. and Sejnowski, T. J. (1992), The computational brain, Cambridge, MA: MIT Press.

  • Clark, A. (1989), Microcognition, Cambridge, MA: MIT Press.

  • Clark, A. (1993), Associative engines, Cambridge, MA: MIT Press.

  • Crick, F. and Asanuma, C. (1986), 'Certain aspects of the anatomy and physiology of the cerebral cortex', in J. McClelland and D. E. Rumelhart (eds.), Parallel Distributed Processing (Vol. 2), Cambridge, MA: MIT Press.

  • Dawson, M. R. W. (1990), 'Training networks of value units: Learning in PDP systems with nonmonotonic activation functions', Canadian Psychology 31(4), p. 391.

  • Dawson, M. R. W. (1991), 'The how and why of what went where in apparent motion: Modeling solutions to the motion correspondence process', Psychological Review 98, pp. 569-603.

  • Dawson, M. R. W. (1998), Understanding Cognitive Science. Oxford, UK: Blackwell.

  • Dawson, M. R. W., Medler, D. A. and Berkeley, I. S. N. (1997), 'PDP networks can provide models that are not mere implementations of classical theories', Philosophical Psychology 10, pp. 25-40.

  • Dawson, M. R. W. and Schopflocher, D. P. (1992a), 'Autonomous processing in PDP networks', Philosophical Psychology 5, pp. 199-219.

  • Dawson, M. R. W. and Schopflocher, D. P. (1992b), 'Modifying the generalized delta rule to train networks of nonmonotonic processors for pattern classification', Connection Science 4, pp. 19-31.

  • Dawson, M. R. W. and Shamanski, K. S. (1994), 'Connectionism, confusion and cognitive science', Journal of Intelligent Systems 4, pp. 215-262.

  • Dawson, M. R. W., Shamanski, K. S. and Medler, D. A. (1993), From connectionism to cognitive science. Paper presented at the Fifth University of New Brunswick Symposium on Artificial Intelligence, Fredericton, NB.

  • Douglas, R. J. and Martin, K. A. C. (1991), 'Opening the grey box', Trends in Neurosciences 14, pp. 286-293.

  • Dreyfus, H. L. and Dreyfus, S. E. (1988), 'Making a mind versus modeling the brain: Artificial intelligence back at the branchpoint', in S. Graubard (ed.), The Artificial Intelligence Debate, Cambridge, MA: MIT Press.

  • Elman, J. (1990), 'Finding structure in time', Cognitive Science 14, pp. 179-211.

  • Everitt, B. (1980), Cluster Analysis, New York: Halsted.

  • Fodor, J. A. and McLaughlin, B. P. (1990), 'Connectionism and the problem of systematicity: Why Smolensky's solution doesn't work', Cognition 35, pp. 183-204.

  • Fodor, J. A. and Pylyshyn, Z. W. (1988), 'Connectionism and cognitive architecture', Cognition 28, pp. 3-71.

  • Gallant, S. I. (1993), Neural network learning and expert systems, Cambridge, MA: MIT Press.

  • Gällmo, O. and Carlström, J. (1995), 'Some experiments using extra output learning to hint multilayer perceptrons', in L. F. Niklasson and M. B. Boden (eds.), Current Trends in Connectionism: Proceedings of the 1995 Swedish Conference on Connectionism, Hillsdale, NJ: Lawrence Erlbaum Associates, pp. 179-190.

  • Garson, J. W. (1994), 'No representations without rules: The prospects for a compromise between paradigms in cognitive science', Mind and Language 9, pp. 25-37.

  • Graubard, S. (1988), The artificial intelligence debate, Cambridge, MA: MIT Press.

  • Hadley, R. F. (1994a), 'Systematicity in connectionist language learning', Minds and Machines 3, pp. 183-200.

  • Hadley, R. F. (1994b), 'Systematicity revisited: Reply to Christiansen and Chater and Niklasson and van Gelder', Mind and Language 9, pp. 431-444.

  • Hadley, R. F. (1997), 'Cognition, systematicity, and nomic necessity', Mind and Language 12, pp. 137-153.

  • Hadley, R. F. and Hayward, M. B. (1997), 'Strong semantic systematicity from Hebbian connectionist learning', Minds and Machines 7, pp. 1-37.

  • Hanson, S. J. and Burr, D. J. (1990), 'What connectionist models learn: Learning and representation in connectionist networks', Behavioral and Brain Sciences 13, pp. 471-518.

  • Haugeland, J. (1985), Artificial intelligence: The very idea, Cambridge, MA: MIT Press.

  • Hecht-Nielsen, R. (1987), Neurocomputing, Reading, MA: Addison-Wesley.

  • Hinton, G. E. (1986), Learning distributed representations of concepts. Paper presented at the 8th Annual Meeting of the Cognitive Science Society, Ann Arbor, MI.

  • Hooker, C. A. (1979), 'Critical notice: R.M. Yoshida's Reduction in the Physical Sciences', Dialogue 18, pp. 81-99.

  • Hooker, C. A. (1981), 'Towards a general theory of reduction', Dialogue 20, pp. 38-59, 201-236, 496-529.

  • Hopcroft, J. E. and Ullman, J. D. (1979), Introduction to Automata Theory, Languages, and Computation, Reading, MA: Addison-Wesley.

  • Horgan, T. and Tienson, J. (1996), Connectionism and the philosophy of psychology, Cambridge, MA: MIT Press.

  • Kilian, J. and Siegelmann, H. T. (1993), On the power of sigmoid neural networks. Paper presented at the Proceedings of the Sixth ACM Workshop on Computational Learning Theory.

  • Kremer, S. C. (1995), 'On the computational powers of Elman-style recurrent networks', IEEE Transactions on Neural Networks 6, pp. 1000-1004.

  • Lachter, J. and Bever, T. G. (1988), 'The relation between linguistic structure and associative theories of language learning-A constructive critique of some connectionist learning models', Cognition 28, pp. 195-247.

  • Lincoff, G. H. (1981), National Audubon Society field guide to North American mushrooms, New York: Alfred A. Knopf Publishers.

  • Marr, D. (1982), Vision, San Francisco, CA: W.H. Freeman.

  • McCaughan, D. B. (1997, June 9-12), On the properties of periodic perceptrons. Paper presented at the IEEE/INNS International Conference on Neural Networks (ICNN'97), Houston, TX.

  • McClelland, J. (1992), 'Can connectionist models discover the structure of natural language?', in R. Morelli, W. M. Brown, D. Anselmi, K. Haberlandt, and D. Lloyd (eds.), Minds, Brains, and Computers: Perspectives in Cognitive Science and Artificial Intelligence, Norwood, NJ: Ablex.

  • McClelland, J. L., Rumelhart, D. E. and Hinton, G. E. (1986), 'The appeal of parallel distributed processing', in D. Rumelhart and J. McClelland (eds.), Parallel Distributed Processing (Vol. 1), Cambridge, MA: MIT Press.

  • McCloskey, M. (1991), 'Networks and theories: The place of connectionism in cognitive science', Psychological Science 2, pp. 387-395.

  • McCulloch, W. S. and Pitts, W. (1943), 'A logical calculus of the ideas immanent in nervous activity', Bulletin of Mathematical Biophysics 5, pp. 115-133.

  • Medler, D. A. (1998), The crossroads of connectionism: Where do we go from here? Unpublished Doctoral dissertation, University of Alberta, Edmonton, AB.

  • Michie, D., Spiegelhalter, D. J. and Taylor, C. C. (1994), Machine learning, neural and statistical classification, New York, NY: Ellis Horwood.

  • Milligan, G. W. and Cooper, M. C. (1985), 'An examination of procedures for determining the number of clusters in a data set', Psychometrika 50, pp. 159-179.

  • Minsky, M. (1972), Computation: finite and infinite machines. London: Prentice-Hall International.

  • Mozer, M. C. and Smolensky, P. (1989), 'Using relevance to reduce network size automatically', Connection Science 1, pp. 3-16.

  • Omlin, C. W. and Giles, C. L. (1996), 'Extraction of rules from discrete-time recurrent neural networks', Neural Networks 9, pp. 41-52.

  • Pinker, S. and Prince, A. (1988), 'On language and connectionism: Analysis of a parallel distributed processing model of language acquisition', Cognition 28, pp. 73-193.

  • Pylyshyn, Z. W. (1984), Computation and cognition, Cambridge, MA: MIT Press.

  • Pylyshyn, Z. W. (1991), 'The role of cognitive architectures in theories of cognition', in K. VanLehn (ed.), Architectures For Intelligence, Hillsdale, NJ: Lawrence Erlbaum Associates, pp. 189-223.

  • Quinlan, J. R. (1986), 'Induction of decision trees', Machine Learning 1, pp. 81-106.

  • Ramsey, W., Stich, S. P. and Rumelhart, D. E. (1991), Philosophy and connectionist theory, Hillsdale, NJ: Lawrence Erlbaum Associates.

  • Ripley, B. D. (1996), Pattern recognition and neural networks. Cambridge, UK: Cambridge University Press.

  • Rumelhart, D. E., Hinton, G. E. and Williams, R. J. (1986), 'Learning representations by back-propagating errors', Nature 323, pp. 533-536.

  • Rumelhart, D. E. and McClelland, J. L. (1985), 'Levels indeed! A response to Broadbent', Journal of Experimental Psychology: General 114, pp. 193-197.

  • Schlimmer, J. S. (1987), Concept acquisition through representational adjustment. Unpublished Doctoral dissertation, University of California Irvine, Irvine, CA.

  • Schneider, W. (1987), 'Connectionism: Is it a paradigm shift for psychology?', Behavior Research Methods, Instruments and Computers 19, pp. 73-83.

  • Seidenberg, M. (1993), 'Connectionist models and cognitive theory', Psychological Science 4, pp. 228-235.

  • Siegelmann, H. T. and Sontag, E. D. (1991), 'Turing computability with neural nets', Applied Mathematics Letters 4, pp. 77-80.

  • Siegelmann, H. T. and Sontag, E. D. (1995), 'On the computational power of neural nets', Journal of Computer and System Sciences 50, pp. 132-150.

  • Siegelmann, H. T. (1999), Neural Networks and Analog Computation: Beyond the Turing Limit, Boston, MA: Birkhäuser.

  • Smith, B. C. (1996), On the Origin of Objects, Cambridge, MA: MIT Press.

  • Smolensky, P. (1988), 'On the proper treatment of connectionism', Behavioral and Brain Sciences 11, pp. 1-74.

  • Stork, D. G. (1997), 'Scientist on the set: An interview with Marvin Minsky', in D. G. Stork (ed.), HAL's Legacy: 2001's Computer as Dream and Reality, Cambridge, MA: MIT Press, pp. 15-32.

  • Suddarth, S. C. and Kergosien, Y. L. (1990), 'Rule-injection hints as a means of improving network performance and learning time', in L. B. Almeida and C. J. Wellekens (eds.), Neural Networks. Lecture Notes in Computer Science (Vol. 412), Berlin: Springer Verlag, pp. 120-129.

  • Suddarth, S. C., Sutton, S. A. and Holden, A. D. C. (1988), A symbolic-neural method for solving control problems, Paper presented at the IEEE International Conference on Neural Networks, San Diego, CA.

  • VanLehn, K. (1991), Architectures for intelligence, Hillsdale, NJ: Lawrence Erlbaum Associates.

  • Von Eckardt, B. (1993), What is cognitive science?, Cambridge, MA: MIT Press.

  • Williams, R. and Zipser, D. (1989), 'A learning algorithm for continually running fully recurrent neural networks', Neural Computation 1, pp. 270-280.

  • Yu, Y.-H. and Simmons, R. F. (1990), 'Extra output based learning', Proceedings of the International Joint Conference on Neural Networks (IJCNN-90) 3, pp. 161-166.

Cite this article

Dawson, M., Medler, D., McCaughan, D. et al. Using Extra Output Learning to Insert a Symbolic Theory into a Connectionist Network. Minds and Machines 10, 171–201 (2000). https://doi.org/10.1023/A:1008313828824
