Abstract
Many vision problems involve detecting the boundary of an object, such as a hand, or tracking a one-dimensional structure, such as a road in an aerial photograph. These problems can be formulated in terms of Bayesian probability theory and hence expressed as optimization problems on trees or graphs. The twenty questions, or minimum entropy, algorithm was recently developed by Geman and Jedynak (1994) as a highly effective and intuitive tree search algorithm for road tracking. In this paper we analyse this algorithm to understand how it compares to existing algorithms used for vision and related optimization problems. First, we show that it is a special case of the focus of attention planning strategy used on causal graphs, or Bayes nets [18]. We then show its relations to standard methods already successfully applied to vision optimization problems, such as dynamic programming, decision trees, the A* algorithm used in artificial intelligence [22], and the closely related Dijkstra algorithm of computer science [4]. These comparisons show that twenty questions is often equivalent to an algorithm, which we call A+, that tries to explore the most probable paths first. We show that A+ is a greedy, and suboptimal, variant of A*. This suggests that A+ and twenty questions will be faster than A* and Dijkstra for certain problems, but that they may occasionally converge to the wrong answer. However, because A+ and twenty questions maintain a probabilistic estimate of how well they are doing, they may give warning of faulty convergence and also allow intelligent pruning to speed up the search.
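As a rough illustration of what "exploring the most probable paths first" can look like on a tree, the Python sketch below runs a best-first search keyed on the negative log probability of each partial path. It is not the A+ or twenty questions algorithm analysed in the paper: the function name, the toy tree, and the branch probabilities are all hypothetical, and with non-negative costs this ordering simply coincides with a uniform-cost (Dijkstra-style) search.

```python
import heapq
import math


def most_probable_path(children, root, is_goal):
    """Best-first search that always expands the most probable partial path.

    children: dict mapping a node to a list of (child, probability) pairs.
    Returns (probability, path) for the first goal node popped, or None.
    """
    # The priority is the negative log probability of the partial path, so
    # the smallest key on the heap is the most probable path found so far.
    frontier = [(0.0, [root])]
    while frontier:
        cost, path = heapq.heappop(frontier)
        node = path[-1]
        if is_goal(node):
            return math.exp(-cost), path
        for child, p in children.get(node, []):
            if p > 0.0:  # zero-probability branches can never be chosen
                heapq.heappush(frontier, (cost - math.log(p), path + [child]))
    return None


# Hypothetical toy tree: every edge carries a made-up branch probability.
toy_tree = {
    "start": [("a", 0.6), ("b", 0.4)],
    "a": [("a1", 0.5), ("a2", 0.5)],
    "b": [("b1", 0.9), ("b2", 0.1)],
}
leaves = {"a1", "a2", "b1", "b2"}
prob, path = most_probable_path(toy_tree, "start", lambda n: n in leaves)
```

On this toy tree the search returns the path start, b, b1 with probability 0.36, even though the first branch towards a is locally more probable: the partial path through a is expanded first, but its continuations (0.3 each) rank below the continuation through b1.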
References
[1] M. Barzohar and D. B. Cooper, “Automatic Finding of Main Roads in Aerial Images by Using Geometric-Stochastic Models and Estimation,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 459–464, 1993.
[2] R. Basri, L. Costa, D. Geiger, and D. Jacobs. “Determining shape similarity.” In IEEE Workshop on Physics Based Vision, Boston, June 1995.
[3] R. E. Bellman, Applied Dynamic Programming, Princeton University Press, 1962.
[4] D. Bertsekas. Dynamic Programming and Optimal Control. Vol. 1. (2nd Ed.) Athena Scientific Press. 1995.
[5] T. F. Cootes and C. J. Taylor, “Active Shape Models: 'Smart Snakes',” British Machine Vision Conference, pp. 266–275, Leeds, UK, September 1992.
[6] J. Coughlan. “Global Optimization of a Deformable Hand Template Using Dynamic Programming.” Harvard Robotics Laboratory. Technical report. 95-1. 1995.
[7] T.M. Cover and J.A. Thomas. Elements of Information Theory. Wiley Interscience Press. New York. 1991.
[8] M.A. Fischler and R.A. Elschlager. “The Representation and Matching of Pictorial Structures”. IEEE Trans. Computers. C-22. 1973.
[9] D. Geiger, A. Gupta, L.A. Costa, and J. Vlontzos. “Dynamic programming for detecting, tracking and matching elastic contours.” IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-17, March 1995.
[10] D. Geiger and T-L Liu. “Top-Down Recognition and Bottom-Up Integration for Recognizing Articulated Objects”. Preprint. Courant Institute. New York University. 1996.
[11] D. Geman and B. Jedynak. “An active testing model for tracking roads in satellite images”. Preprint. Dept. Mathematics and Statistics. University of Massachusetts. Amherst. 1994.
[12] U. Grenander, Y. Chow and D. M. Keenan, Hands: a Pattern Theoretic Study of Biological Shapes, Springer-Verlag, 1991.
[13] U. Grenander and M.I. Miller. “Representation of Knowledge in Complex Systems”. Journal of the Royal Statistical Society, Vol. 56, No. 4, pp 569–603. 1994.
[14] N. Khaneja, M.I. Miller, and U. Grenander. “Dynamic Programming Generation of Geodesics and Sulci on Brain Surfaces”. Submitted to PAMI. 1997.
[15] L. Kontsevich. Private Communication. 1996.
[16] S.L. Lauritzen and D.J. Spiegelhalter. “Local Computations with Probabilities on Graphical Structures and their Application to Expert Systems”. Journal of the Royal Statistical Society. B. Vol. 50. No. 2., pp 157–224. 1988.
[17] U. Montanari. “On the optimal detection of curves in noisy pictures.” Communications of the ACM, pages 335–345, 1971.
[18] J. Pearl. Probabilistic Reasoning in Intelligent Systems. San Mateo, CA: Morgan Kaufmann. 1988.
[19] W. Richards and A. Bobick. “Playing twenty questions with nature.” In: Computational Processes in Human Vision: An Interdisciplinary Perspective. Z. Pylyshyn, Ed; Ablex, Norwood, NJ. 1988.
[20] B. D. Ripley. “Classification and Clustering in Spatial and Image Data”. In Analyzing and Modeling Data and Knowledge. Eds. M. Schader. Springer-Verlag. Berlin. 1992.
[21] L.H. Staib and J.S. Duncan. “Parametrically deformable contour models”. Proceedings of Computer Vision and Pattern Recognition, pp 98–103. San Diego, CA. 1989.
[22] P.H. Winston. Artificial Intelligence. Addison-Wesley Publishing Company. Reading, Massachusetts. 1984.
[23] L. Wiskott and C. von der Malsburg. “A Neural System for the Recognition of Partially Occluded Objects in Cluttered Scenes”. Neural Computation. 7(4):935–948. 1993.
[24] A.L. Yuille. “Deformable Templates for Face Recognition”. Journal of Cognitive Neuroscience. Vol 3, Number 1. 1991.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yuille, A.L., Coughlan, J. (1997). Twenty questions, focus of attention, and A*: A theoretical comparison of optimization strategies. In: Pelillo, M., Hancock, E.R. (eds) Energy Minimization Methods in Computer Vision and Pattern Recognition. EMMCVPR 1997. Lecture Notes in Computer Science, vol 1223. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-62909-2_81
DOI: https://doi.org/10.1007/3-540-62909-2_81
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-62909-2
Online ISBN: 978-3-540-69042-9
eBook Packages: Springer Book Archive