Abstract
Advances in technology and in active vision research allow and encourage sequential visual information acquisition. Hidden Markov models (HMMs) can represent probabilistic sequences and probabilistic graph structures; here we explore their use in controlling the acquisition of visual information. We include a brief tutorial with two examples: (1) using input sequences to derive an aspect graph and (2) similarly deriving a finite state machine for the control of visual processing.
The first main topic is the use of HMMs in both their learning and generative modes, and their augmentation to allow inputs sensed during generation to modify the generated outputs temporarily or permanently. We propose these augmented HMMs as a theory of adaptive skill acquisition and generation. The second main topic builds on the first: the augmented HMMs can be used for knowledge fusion. We give an example, the what-where-AHMM, which creates a hybrid skill from separate skills based on object location and object identity. Insofar as low-level skills can be learned from the output of high-level cognitive processes, AHMMs can provide a link between high-level and low-level vision.
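As a rough illustration of the generative mode mentioned above, the following minimal Python sketch samples an output sequence from a small discrete HMM. It is not taken from the article: the function name `sample_hmm`, the toy probability tables, and the `adjust` hook (a hypothetical stand-in for sensed input that temporarily alters a generated output) are all assumptions made for this example.

```python
import random


def sample_hmm(start, trans, emit, steps, rng, adjust=None):
    """Generate `steps` output symbols from a discrete HMM.

    start:  start[i]    = P(first state is i)
    trans:  trans[i][j] = P(next state is j | current state is i)
    emit:   emit[i][k]  = P(output symbol k | current state is i)
    adjust: optional hypothetical hook (state, symbol) -> symbol, standing in
            for sensed input that temporarily changes a generated output
    """
    def draw(probs):
        # Draw one index with probability proportional to its weight.
        return rng.choices(range(len(probs)), weights=probs, k=1)[0]

    outputs = []
    state = draw(start)
    for _ in range(steps):
        symbol = draw(emit[state])
        if adjust is not None:
            # The model's own tables are left unchanged; only this output is altered.
            symbol = adjust(state, symbol)
        outputs.append(symbol)
        state = draw(trans[state])
    return outputs


# Toy model (assumed values): two hidden states, three output symbols,
# e.g. three candidate fixation targets.
START = [0.6, 0.4]
TRANS = [[0.7, 0.3],
         [0.2, 0.8]]
EMIT = [[0.5, 0.4, 0.1],
        [0.1, 0.3, 0.6]]

if __name__ == "__main__":
    print(sample_hmm(START, TRANS, EMIT, steps=10, rng=random.Random(0)))
```

Because the hook changes only the emitted symbol, it models a temporary modification. A permanent modification would instead update the `trans` or `emit` tables, which this sketch does not attempt.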
Cite this article
Rimey, R.D., Brown, C.M. Controlling eye movements with hidden Markov models. Int J Comput Vision 7, 47–65 (1991). https://doi.org/10.1007/BF00130489
DOI: https://doi.org/10.1007/BF00130489