
A saliency-driven robotic head with bio-inspired saccadic behaviors for social robotics

Autonomous Robots 36, 225–240 (2014)

Abstract

This paper presents a robotic head that enables social robots to attend to scene saliency with bio-inspired saccadic behaviors. Scene saliency is determined by measuring low-level static scene information, motion, and object prior knowledge. With the proposed control scheme, the robotic head shifts its gaze toward the extracted saliency spots in a saccadic manner while obeying eye–head coordination laws. Results from a simulation study and real-world applications demonstrate the effectiveness of the proposed method in discovering scene saliency and producing human-like head motion. The proposed techniques could be applied to social robots to improve their social awareness and the user experience in human–robot interaction.

References

  • Asfour, T., Welke, K., Azad, P., Ude, A., & Dillmann, R. (2008). The Karlsruhe humanoid head. In Proceedings of IEEE-RAS international conference on humanoid robots (pp. 447–453).

  • Breazeal, C. (2000). Sociable machines: Expressive social exchange between humans and robots. Ph.D. thesis, Massachusetts Institute of Technology.

  • Butko, N., Zhang, L., Cottrell, G., & Movellan, J. (2008). Visual saliency model for robot cameras. In IEEE international conference on robotics and automation (pp. 2398–2403).

  • Choi, S.-B., Ban, S.-W., & Lee, M. (2004). Biologically motivated visual attention system using bottom-up saliency map and top-down inhibition. Neural Information Processing—Letters and Reviews, 2(1), 19–25.

  • Corbetta, M., & Shulman, G. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3, 201–215.

  • Crawford, J., Martinez-Trujillo, J., & Klier, E. (2003). Neural control of three-dimensional eye and head movements. Current Opinion in Neurobiology, 13(6), 655–662.

  • Crawford, J., & Vilis, T. (1991). Axes of eye rotation and Listing’s law during rotations of the head. Journal of Neurophysiology, 65(3), 407–423.

  • Cui, R., Gao, B., & Guo, J. (2012). Pareto-optimal coordination of multiple robots with safety guarantees. Autonomous Robots, 1–17. doi:10.1007/s10514-012-9302-3.

  • Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In IEEE Computer Society conference on computer vision and pattern recognition.

  • Donders, F. (1848). Beitrag zur Lehre von den Bewegungen des menschlichen Auges [Contribution to the theory of the movements of the human eye]. Holland Beitr Anat Physiol Wiss, 1(104), 384.

  • Doretto, G., Chiuso, A., Wu, Y., & Soatto, S. (2003). Dynamic textures. International Journal of Computer Vision, 51(2), 91–109.

  • Gao, D., & Vasconcelos, N. (2007). Bottom-up saliency is a discriminant process. In IEEE 11th international conference on computer vision, ICCV 2007 (pp. 1–6).

  • Ge, S., He, H., & Zhang, Z. (2011). Bottom-up saliency detection for attention determination. Machine Vision and Applications, 24, 1–14.

  • Glenn, B., & Vilis, T. (1992). Violations of Listing’s law after large eye and head gaze shifts. Journal of Neurophysiology, 68(1), 309–318.

  • Goossens, H., & Opstal, A. (1997). Human eye–head coordination in two dimensions under different sensorimotor conditions. Experimental Brain Research, 114(3), 542–560.

  • Guitton, D., & Volle, M. (1987). Gaze control in humans: Eye–head coordination during orienting movements to targets within and beyond the oculomotor range. Journal of Neurophysiology, 58(3), 427–459.

  • Hartley, R., & Zisserman, A. (2000). Multiple view geometry in computer vision (Vol. 2). New York: Cambridge University Press.

  • He, H., Ge, S., & Zhang, Z. (2011). Visual attention prediction using saliency determination of scene understanding for social robots. International Journal of Social Robotics (Special issue on towards an effective design of social robots), 3, 457–468.

  • He, H., Zhang, Z., & Ge, S. (2010). Attention determination for social robots using salient region detection. In International conference on social robotics (pp. 295–304). Heidelberg: Springer.

  • Heuring, J., & Murray, D. (1999). Modeling and copying human head movements. IEEE Transactions on Robotics and Automation, 15(6), 1095–1108.

  • Hwang, A. D., Higgins, E. C., & Pomplun, M. (2009). A model of top-down attentional control during visual search in complex scenes. Journal of Vision, 9(5), 25.1–25.18.

  • Itti, L. (2003). Realistic avatar eye and head animation using a neurobiological model of visual attention. Technical report, Defense Technical Information Center.

  • Itti, L. (2005). Models of bottom-up attention and saliency. In L. Itti, G. Rees, & J. K. Tsotsos (Eds.), Neurobiology of attention (pp. 576–582). San Diego, CA: Elsevier.

  • Judd, T., Ehinger, K., Durand, F., & Torralba, A. (2010). Learning to predict where humans look. In International conference on computer vision.

  • Kanan, C., Tong, M. H., Zhang, L., & Cottrell, G. W. (2009). SUN: Top-down saliency using natural statistics. Visual Cognition, 17(6–7), 979–1003.

  • Laschi, C., Asuni, G., Guglielmelli, E., Teti, G., Johansson, R., Konosu, H., et al. (2008). A bio-inspired predictive sensory-motor coordination scheme for robot reaching and preshaping. Autonomous Robots, 25(1), 85–101.

  • Le Meur, O., Le Callet, P., Barba, D., & Thoreau, D. (2006). A coherent computational approach to model bottom-up visual attention. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(5), 802–817.

  • Lopes, M., Bernardino, A., Santos-Victor, J., Rosander, K., & von Hofsten, C. (2009). Biomimetic eye-neck coordination. In Proceedings of IEEE international conference on development and learning (pp. 1–8).

  • Maini, E., Manfredi, L., Laschi, C., & Dario, P. (2008). Bioinspired velocity control of fast gaze shifts on a robotic anthropomorphic head. Autonomous Robots, 25(1), 37–58.

  • Medendorp, W., Van Gisbergen, J., Horstink, M., & Gielen, C. (1999). Donders’ law in torticollis. Journal of Neurophysiology, 82(5), 2833.

  • Milanese, R., Wechsler, H., Gill, S., Bost, J.-M., & Pun, T. (1994). Integration of bottom-up and top-down cues for visual attention using non-linear relaxation. In IEEE Computer Society conference on computer vision and pattern recognition, Proceedings CVPR’94 (pp. 781–785).

  • Morel, J., & Yu, G. (2009). ASIFT: A new framework for fully affine invariant image comparison. SIAM Journal on Imaging Sciences, 2(2), 438–469.

  • Nagai, Y., Hosoda, K., Morita, A., & Asada, M. (2003). A constructive model for the development of joint attention. Connection Science, 15(4), 211–229.

  • Navalpakkam, V., & Itti, L. (2006). An integrated model of top-down and bottom-up attention for optimizing detection speed. In 2006 IEEE Computer Society conference on computer vision and pattern recognition (Vol. 2, pp. 2049–2056).

  • Oliva, A., Torralba, A., Castelhano, M. S., & Henderson, J. M. (2003). Top-down control of visual attention in object detection. In Proceedings of 2003 IEEE international conference on image processing, ICIP 2003 (Vol. 1, pp. I-253–I-256).

  • Pagel, M., Maël, E., & Von Der Malsburg, C. (1998). Self calibration of the fixation movement of a stereo camera head. Autonomous Robots, 5(3), 355–367.

  • Raphan, T. (1998). Modeling control of eye orientation in three dimensions. I. Role of muscle pulleys in determining saccadic trajectory. Journal of Neurophysiology, 79(5), 2653.

  • Seo, H. J., & Milanfar, P. (2009). Nonparametric bottom-up saliency detection by self-resemblance. In IEEE computer society conference on computer vision and pattern recognition workshops. CVPR Workshops 2009 (pp. 45–52).

  • Smith, R. (2007). An overview of the Tesseract OCR engine. In Proceedings of the ninth international conference on document analysis and recognition.

  • Tsagarakis, N., Metta, G., Sandini, G., Vernon, D., Beira, R., Becchi, F., et al. (2007). iCub: The design and realization of an open humanoid platform for cognitive and neuroscience research. Advanced Robotics, 21(10), 1151–1175.

  • Tweed, D. (1997). Three-dimensional model of the human eye–head saccadic system. Journal of Neurophysiology, 77(2), 654.

  • Viola, P., & Jones, M. (2004). Robust real-time face detection. International Journal of Computer Vision, 57(2), 137–154.

  • Westheimer, G. (1957). Kinematics of the eye. Journal of the Optical Society of America, 47, 967–974.

Acknowledgments

This research was partially funded by the Singapore National Research Foundation, Interactive Digital Media R&D Program, under research grant R-705-000-017-279, and by the National Basic Research Program of China (973 Program) under Grant 2011CB707005.

Author information

Correspondence to Shuzhi Sam Ge.

Appendix

1.1 Computation of linear projections using corresponding points

Given one pair of corresponding points, the projective transformation map is

$$\begin{aligned} \left[ \begin{array}{c} r_{k}^{i}\\ c_{k}^{i}\\ 1 \end{array}\right] =\left[ \begin{array}{ccc} P_{11}^{ij} & P_{12}^{ij} & P_{13}^{ij}\\ P_{21}^{ij} & P_{22}^{ij} & P_{23}^{ij}\\ P_{31}^{ij} & P_{32}^{ij} & P_{33}^{ij} \end{array}\right] \left[ \begin{array}{c} r_{k}^{j}\\ c_{k}^{j}\\ 1 \end{array}\right] \end{aligned}$$
(28)

To solve for the optimal linear projection, (28) can be rewritten in matrix form (Hartley and Zisserman 2000),

$$\begin{aligned} (\mathbf{v}_{k}^{x})^{T}\mathbf{p}^{ij}&= 0\end{aligned}$$
(29)
$$\begin{aligned} (\mathbf{v}_{k}^{y})^{T}\mathbf{p}^{ij}&= 0 \end{aligned}$$
(30)

where

$$\begin{aligned} \mathbf{p}^{ij}&= [P_{11}^{ij},P_{12}^{ij},P_{13}^{ij},P_{21}^{ij},P_{22}^{ij},P_{23}^{ij},P_{31}^{ij},P_{32}^{ij},P_{33}^{ij}]^{T}\end{aligned}$$
(31)
$$\begin{aligned} \mathbf{v}_{k}^{x}&= [-r_{k}^{i},-c_{k}^{i},-1,0,0,0,r_{k}^{j}r_{k}^{i},r_{k}^{j}c_{k}^{i},r_{k}^{j}]^{T}\end{aligned}$$
(32)
$$\begin{aligned} \mathbf{v}_{k}^{y}&= [0,0,0,-r_{k}^{i},-c_{k}^{i},-1,c_{k}^{j}r_{k}^{i},c_{k}^{j}c_{k}^{i},c_{k}^{j}]^{T}. \end{aligned}$$
(33)

Assuming that corresponding points can be identified in both images and that the correspondence is well approximated by a linear map for small camera movements, the projection matrix can be computed from

$$\begin{aligned} V_{ij}\mathbf{p}^{ij}=0 \end{aligned}$$
(34)

where \(V_{ij}=[\mathbf{v}_{1}^{x},\mathbf{v}_{1}^{y}, \mathbf{v}_{2}^{x},\mathbf{v}_{2}^{y},\ldots , \mathbf{v}_{k}^{x},\mathbf{v}_{k}^{y}]^{T}\) stacks the constraint rows obtained from the \(k\) corresponding points between the two images.
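
In practice, the nontrivial solution of (34) is found in the least-squares sense: it is the right singular vector of \(V_{ij}\) associated with the smallest singular value (Hartley and Zisserman 2000). The following is a minimal NumPy sketch of this step; the function name and the \((r, c)\) array layout are illustrative assumptions rather than details from the paper.

```python
import numpy as np

def estimate_projection(pts_i, pts_j):
    """Solve V p = 0 (Eq. 34) for the projection vector p^{ij}.

    pts_i, pts_j: (k, 2) arrays holding the (r, c) coordinates of
    k >= 4 corresponding points in images i and j.
    Returns the 3x3 projection matrix, defined up to scale.
    """
    rows = []
    for (ri, ci), (rj, cj) in zip(pts_i, pts_j):
        # Constraint rows v_k^x and v_k^y from Eqs. (32)-(33).
        rows.append([-ri, -ci, -1, 0, 0, 0, rj * ri, rj * ci, rj])
        rows.append([0, 0, 0, -ri, -ci, -1, cj * ri, cj * ci, cj])
    V = np.asarray(rows, dtype=float)
    # The least-squares null vector of V is the right singular
    # vector associated with the smallest singular value.
    _, _, vt = np.linalg.svd(V)
    return vt[-1].reshape(3, 3)
```

Because \(\mathbf{p}^{ij}\) is defined only up to scale, the unit-norm singular vector is a valid representative; four point pairs determine the system minimally, and additional pairs are reconciled in the least-squares sense.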

1.2 Quaternion element projection

Property 1

(Quaternion element projection) Let a quaternion be \(x=o\mathsf{1}+a\mathsf{i}+b\mathsf{j}+c\mathsf{k}\) with \(\mathsf{1}\), \(\mathsf{i}\), \(\mathsf{j}\), and \(\mathsf{k}\) as the element basis. If \(a=0\), then \(x*\mathsf{i}=\mathsf{i}*x^{+}\), where \(x^{+}\) is the conjugate of \(x\). The corresponding identities for \(b=0\) and \(c=0\) follow by analogy.

Proof

The proof is straightforward: expand both sides of \(x*\mathsf{i}=\mathsf{i}*x^{+}\) using the quaternion product rules. \(\square \)
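
For concreteness, with \(a=0\) the two sides expand as follows (using the products \(\mathsf{j}\mathsf{i}=-\mathsf{k}\), \(\mathsf{k}\mathsf{i}=\mathsf{j}\), \(\mathsf{i}\mathsf{j}=\mathsf{k}\), and \(\mathsf{i}\mathsf{k}=-\mathsf{j}\)):

$$\begin{aligned} x*\mathsf{i}&= (o\mathsf{1}+b\mathsf{j}+c\mathsf{k})*\mathsf{i} = o\mathsf{i}+b(\mathsf{j}\mathsf{i})+c(\mathsf{k}\mathsf{i}) = o\mathsf{i}-b\mathsf{k}+c\mathsf{j},\\ \mathsf{i}*x^{+}&= \mathsf{i}*(o\mathsf{1}-b\mathsf{j}-c\mathsf{k}) = o\mathsf{i}-b(\mathsf{i}\mathsf{j})-c(\mathsf{i}\mathsf{k}) = o\mathsf{i}-b\mathsf{k}+c\mathsf{j}, \end{aligned}$$

so both sides coincide.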

Cite this article

He, H., Ge, S.S. & Zhang, Z. A saliency-driven robotic head with bio-inspired saccadic behaviors for social robotics. Auton Robot 36, 225–240 (2014). https://doi.org/10.1007/s10514-013-9346-z
