Abstract
Whereas basic machine-learning research has mostly viewed input data as an unordered random sample from a population, researchers have also studied learning from data whose inputs arrive in a regular sequence. Doing so requires that we regard the input data as a stream and identify regularities in the data values as they occur. In this brief survey I review three sequential-learning problems, examine some new, and not-so-new, algorithms for learning from sequences, and give applications for these methods. The three generic problems I discuss are:
- Predicting sequences of discrete symbols generated by stochastic processes.
- Learning streams by extrapolation from a general rule.
- Learning to predict time series.
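To make the first problem concrete, here is a minimal sketch of a fixed-order Markov predictor: it predicts the symbol that most often followed the current context in the stream so far. This is only an illustration of the problem setting, not the (more sophisticated, variable-context) prediction algorithms the survey discusses; the class and parameter names are my own.

```python
from collections import Counter, defaultdict

class MarkovPredictor:
    """Order-k Markov predictor over a discrete symbol stream: predicts
    the symbol that most often followed the current k-symbol context."""

    def __init__(self, k=1):
        self.k = k
        self.counts = defaultdict(Counter)  # context -> successor counts
        self.context = ()

    def predict(self):
        # Most frequent successor of the current context, if any seen yet.
        successors = self.counts.get(self.context)
        return successors.most_common(1)[0][0] if successors else None

    def update(self, symbol):
        # Record the observed successor, then slide the context window.
        self.counts[self.context][symbol] += 1
        self.context = (self.context + (symbol,))[-self.k:]

# On a periodic stream the predictor is wrong only during warm-up.
predictor = MarkovPredictor(k=2)
hits = 0
for s in "abcabcabcabc":
    if predictor.predict() == s:
        hits += 1
    predictor.update(s)
# hits == 7: the first five symbols are misses, the remaining seven hit.
```

After the stream ends the current context is ('b', 'c'), so the next prediction is 'a', continuing the period.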
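For the second problem, a classical special case is extrapolating a sequence generated by a polynomial rule via repeated finite differencing. The sketch below handles only that case and is far simpler than general rule-based extrapolation; the function name is an illustrative assumption.

```python
def extrapolate(seq):
    """Extrapolate the next term of a sequence generated by a polynomial
    rule: difference repeatedly until some level is constant, then the
    next term is the sum of the last element at every level."""
    levels = [list(seq)]
    while len(levels[-1]) > 1 and len(set(levels[-1])) > 1:
        prev = levels[-1]
        levels.append([b - a for a, b in zip(prev, prev[1:])])
    return sum(level[-1] for level in levels)

extrapolate([1, 4, 9, 16, 25])  # squares: next term is 36
```

For the squares, the first differences are 3, 5, 7, 9 and the second differences are constant at 2, so the next term is 25 + 9 + 2 = 36. Sequences not generated by a polynomial rule will still yield an answer here, just not a sensible one; a real extrapolator must also search over richer rule classes.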
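For the third problem, one well-known family of methods reconstructs the system's state from the series itself using delay coordinates and forecasts locally from the nearest past state. The fragment below is a minimal nearest-neighbor sketch of that idea; the embedding dimension, lag, and function name are illustrative choices, not a prescription.

```python
import math

def delay_predict(series, dim=3, lag=1):
    """Local forecast in a delay-coordinate embedding: embed the series
    as vectors (x[t-(dim-1)*lag], ..., x[t]), find the past vector
    closest to the most recent one, and predict the value that
    followed it."""
    window = (dim - 1) * lag

    def embed(t):
        # Delay vector ending at index t.
        return [series[t - window + i * lag] for i in range(dim)]

    query = embed(len(series) - 1)
    best_val, best_dist = None, math.inf
    # Scan every earlier delay vector that has a known successor.
    for t in range(window, len(series) - 1):
        d = math.dist(embed(t), query)
        if d < best_dist:
            best_val, best_dist = series[t + 1], d
    return best_val

delay_predict([0, 1, 2, 0, 1, 2, 0, 1])  # periodic series: predicts 2
```

On the periodic example the query vector (2, 0, 1) matches an earlier occurrence exactly, so the predictor returns the value that followed it. Practical versions average over several neighbors and fit local linear maps rather than copying a single successor.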
© 1993 Springer-Verlag Berlin Heidelberg
Laird, P. (1993). Identifying and using patterns in sequential data. In: Jantke, K.P., Kobayashi, S., Tomita, E., Yokomori, T. (eds) Algorithmic Learning Theory. ALT 1993. Lecture Notes in Computer Science, vol 744. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57370-4_33
Print ISBN: 978-3-540-57370-8
Online ISBN: 978-3-540-48096-9