
Brain programming as a new strategy to create visual routines for object tracking

Towards automation of video tracking design

Published in: Multimedia Tools and Applications

Abstract

This work describes the use of brain programming to automate the design of video tracking systems. The challenge is to create visual programs that learn to detect a toy dinosaur from a database and are then tested in a visual-tracking scenario. When designing an object tracking system, two sub-tasks must be addressed: detecting moving objects in each frame, and correctly associating detections with the same object over time. Visual attention is a skill performed by the brain whose function is to perceive salient visual features. We apply the automatic design of visual attention programs, through an optimization paradigm, to detection-based tracking of objects in video from a moving camera. A system based on the acquisition and integration steps of the natural dorsal stream was engineered to emulate its selectivity and goal-driven behavior, both useful for tracking objects. This is a challenging problem, since many difficulties can arise from abrupt object motion, changing appearance of both the object and the scene, nonrigid structures, object-to-object and object-to-scene occlusions, as well as camera motion, models, and parameters. Tracking relies on the quality of the detection process, so automatically designing this stage could significantly improve tracking methods. Experimental results confirm the validity of our approach on three different kinds of robotic systems. Moreover, a comparison with the method of regions with convolutional neural networks (R-CNN) is provided to illustrate the benefits of the approach.
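The tracking-by-detection decomposition named in the abstract (per-frame detection, then cross-frame association) can be sketched as follows. This is a generic illustration, not the paper's brain-programming method: the detector is assumed to be supplied externally (e.g., an evolved visual-attention program or an R-CNN), and the greedy IoU matching shown here is one common, simple choice for the association step.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter) if inter else 0.0

def associate(tracks, detections, threshold=0.3):
    """Greedily match existing tracks to this frame's detections by best IoU.

    tracks:     dict mapping track id -> last known box
    detections: list of boxes produced by the (external) detector
    Returns (matches, unmatched): matches maps track id -> detection index;
    unmatched detection indices would seed new tracks.
    """
    matches, unmatched = {}, list(range(len(detections)))
    for tid, box in tracks.items():
        best, best_j = threshold, None
        for j in unmatched:
            score = iou(box, detections[j])
            if score > best:
                best, best_j = score, j
        if best_j is not None:
            matches[tid] = best_j
            unmatched.remove(best_j)
    return matches, unmatched
```

For example, a track at (10, 10, 50, 50) overlaps heavily with a detection at (12, 11, 52, 49) and is matched to it, while a distant detection at (100, 100, 140, 140) is left unmatched and would start a new track.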



Acknowledgements

This research was funded by CICESE through Project 634-128, "Programación cerebral aplicada al estudio del pensamiento y la visión" (brain programming applied to the study of thought and vision). The authors also thank the anonymous reviewers, the Editor of Multimedia Tools and Applications, and the International Editorial Board for their valuable comments.

Author information

Corresponding author

Correspondence to Gustavo Olague.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Olague, G., Hernández, D.E., Llamas, P. et al. Brain programming as a new strategy to create visual routines for object tracking. Multimed Tools Appl 78, 5881–5918 (2019). https://doi.org/10.1007/s11042-018-6634-9
