Abstract
We extend the classical notion of computational visual saliency to multi-image data collected using a stationary pan-tilt-zoom (PTZ) camera by introducing the concept of consistency: the requirement that every generated saliency map assign the same saliency value to any region of the environment that appears in more than one image. We show that processing each image independently often fails to provide a consistent measure of saliency, and that using an image mosaic to quantify saliency suffers from several drawbacks. We then propose ray saliency and an immediate extension, approximate ray saliency, as a mosaic-free approach to calculating a consistent measure of bottom-up saliency. We present experimental results that demonstrate the effectiveness of the proposed approach.
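The consistency requirement can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the environment is a one-dimensional array of intensities, each "image" is a window onto it, and `toy_saliency` is a simple deviation-from-the-mean contrast measure. It is not ray saliency or any other method from the article; it only shows how a saliency measure computed independently per image can assign different values to the same region of the environment.

```python
import numpy as np

def toy_saliency(view):
    """Toy contrast measure (an assumption): absolute deviation from the view's own mean."""
    return np.abs(view - view.mean())

def max_inconsistency(env, windows, saliency_fn):
    """Largest disagreement in saliency over environment cells seen by more than one view."""
    values = {}
    for start, stop in windows:
        sal = saliency_fn(env[start:stop])
        for offset, s in enumerate(sal):
            values.setdefault(start + offset, []).append(float(s))
    gaps = [max(v) - min(v) for v in values.values() if len(v) > 1]
    return max(gaps) if gaps else 0.0

# Hypothetical 1-D environment and two overlapping views (cells 2 and 3 are shared).
env = np.array([0.0, 0.0, 1.0, 1.0, 5.0, 5.0])
windows = [(0, 4), (2, 6)]

# Each view processed on its own: the shared cells receive different values.
independent = max_inconsistency(env, windows, toy_saliency)

# Saliency computed once over the whole environment, then read out per view:
# the shared cells agree by construction.
global_sal = toy_saliency(env)
shared = max_inconsistency(global_sal, windows, lambda v: v)

print(independent, shared)
```

In this toy setting the independently processed views disagree on the overlapping cells, while values read from a single shared computation agree. The second variant resembles computing saliency on a mosaic, which the abstract notes has drawbacks of its own.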
Acknowledgments
Funding for this work was provided by U.S. Army Research Office (ARO) MURI Grant W911NF-09-1-0383.
Additional information
Communicated by Yasuyuki Matsushita.
About this article
Cite this article
Warnell, G., David, P. & Chellappa, R. Ray Saliency: Bottom-Up Visual Saliency for a Rotating and Zooming Camera. Int J Comput Vis 116, 174–189 (2016). https://doi.org/10.1007/s11263-015-0842-9
DOI: https://doi.org/10.1007/s11263-015-0842-9