On the Need for a Neural Abstract Machine

Chapter in Sequence Learning

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1828)

Abstract

Learning tasks for sequences vary widely in complexity, and the number of different neural network models proposed for sequence learning is correspondingly large. Moreover, beyond architectural details and the peculiarities of individual training algorithms, other relevant factors add complexity to the management of a neural network for the adaptive processing of sequences. For example, training heuristics such as adaptive learning rates, regularization, and pruning are very important, as is the insertion of a priori domain knowledge. All of these issues must be considered and matched against the complexity of the application domain at hand. This means that successfully applying a neural network to a real-world domain requires answering several questions about the type of architecture, the training algorithm, the training heuristics, and knowledge insertion, according to the complexity of the problem.
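
The chapter itself contains no code; as a minimal, purely illustrative sketch of the training heuristics named above, the following Python fragment combines all three (an adaptive learning rate, L2 regularization, and pruning) in one gradient-descent loop on a toy convex loss. The "bold driver" rate rule, the penalty strength lam, and the pruning threshold are hypothetical choices for illustration, not anything prescribed by the authors.

    import numpy as np

    # Illustrative only: a gradient-descent loop combining three common
    # training heuristics -- an adaptive learning rate ("bold driver"),
    # L2 regularization, and magnitude-based pruning -- on a toy loss.
    rng = np.random.default_rng(0)
    w = rng.normal(size=10)            # hypothetical parameter vector
    A = rng.normal(size=(10, 10))
    A = A @ A.T + np.eye(10)           # positive definite => convex toy loss
    lam = 1e-2                         # L2 regularization strength

    def loss(w):
        return 0.5 * w @ A @ w + 0.5 * lam * w @ w

    def grad(w):
        return A @ w + lam * w

    lr, prev = 0.01, loss(w)
    for step in range(200):
        w_new = w - lr * grad(w)
        cur = loss(w_new)
        if cur < prev:                 # bold driver: grow the rate while
            lr *= 1.05                 # the loss keeps improving ...
            w, prev = w_new, cur
        else:                          # ... shrink it and retry otherwise
            lr *= 0.5

    mask = np.abs(w) > 1e-3            # pruning: zero out tiny weights
    w = w * mask
    print(f"loss {loss(w):.6f}, kept {int(mask.sum())} of {mask.size} weights")

In a real sequence-learning setting the toy loss would be replaced by a recurrent network's error over training sequences, but the heuristic machinery (rate adaptation, the penalty term, the pruning pass) stays the same; it is precisely this kind of recurring structure that a neural abstract machine would have to describe.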






Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Sona, D., Sperduti, A. (2000). On the Need for a Neural Abstract Machine. In: Sun, R., Giles, C.L. (eds) Sequence Learning. Lecture Notes in Computer Science (LNAI), vol 1828. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44565-X_7

  • DOI: https://doi.org/10.1007/3-540-44565-X_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41597-8

  • Online ISBN: 978-3-540-44565-4
