On the Need for a Neural Abstract Machine

Chapter in Sequence Learning

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1828)

Abstract

Learning tasks for sequences vary widely in complexity, and the number of different neural network models proposed for sequence learning is correspondingly large. Moreover, beyond architectural details and the peculiarities of individual training algorithms, other relevant factors add complexity to the management of a neural network for the adaptive processing of sequences. For example, training heuristics such as adaptive learning rates, regularization, and pruning are very important, as is the insertion of a priori domain knowledge. All of these issues must be considered and matched against the complexity of the application domain at hand. This means that successfully applying a neural network to a real-world domain requires answering several questions about the type of architecture, the training algorithm, the training heuristics, and knowledge insertion, according to the complexity of the problem.
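
The chapter itself contains no code; as a minimal, purely illustrative sketch of the training heuristics named above, the following Python fragment combines all three (an adaptive learning rate, L2 regularization, and pruning) in one gradient-descent loop on a toy convex loss. The "bold driver" rate rule, the penalty strength lam, and the pruning threshold are hypothetical choices for illustration, not anything prescribed by the authors.

    import numpy as np

    # Illustrative only: a gradient-descent loop combining three common
    # training heuristics -- an adaptive learning rate ("bold driver"),
    # L2 regularization, and magnitude-based pruning -- on a toy loss.
    rng = np.random.default_rng(0)
    w = rng.normal(size=10)            # hypothetical parameter vector
    A = rng.normal(size=(10, 10))
    A = A @ A.T + np.eye(10)           # positive definite => convex toy loss
    lam = 1e-2                         # L2 regularization strength

    def loss(w):
        return 0.5 * w @ A @ w + 0.5 * lam * w @ w

    def grad(w):
        return A @ w + lam * w

    lr, prev = 0.01, loss(w)
    for step in range(200):
        w_new = w - lr * grad(w)
        cur = loss(w_new)
        if cur < prev:                 # bold driver: grow the rate while
            lr *= 1.05                 # the loss keeps improving ...
            w, prev = w_new, cur
        else:                          # ... shrink it and retry otherwise
            lr *= 0.5

    mask = np.abs(w) > 1e-3            # pruning: zero out tiny weights
    w = w * mask
    print(f"loss {loss(w):.6f}, kept {int(mask.sum())} of {mask.size} weights")

In a real sequence-learning setting the toy loss would be replaced by a recurrent network's error over training sequences, but the heuristic machinery (rate adaptation, the penalty term, the pruning pass) stays the same; it is precisely this kind of recurring structure that a neural abstract machine would have to describe.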






Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Sona, D., Sperduti, A. (2000). On the Need for a Neural Abstract Machine. In: Sun, R., Giles, C.L. (eds) Sequence Learning. Lecture Notes in Computer Science (LNAI), vol 1828. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44565-X_7

  • DOI: https://doi.org/10.1007/3-540-44565-X_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41597-8

  • Online ISBN: 978-3-540-44565-4
