Brain programming as a new strategy to create visual routines for object tracking

Olague, Gustavo; Hernández, Daniel E.; Llamas, Paul; Clemente, Eddie; Briseño, José L.

doi:10.1007/s11042-018-6634-9

Towards automation of video tracking design

Published: 11 September 2018

Multimedia Tools and Applications, volume 78, pages 5881–5918 (2019)

Gustavo Olague ORCID: orcid.org/0000-0001-5773-9517¹,
Daniel E. Hernández²,
Paul Llamas¹,
Eddie Clemente³ &
José L. Briseño¹

614 Accesses
19 Citations

Abstract

This work describes the use of brain programming to automate the design of video tracking systems. The challenge is to create visual programs that learn to detect a toy dinosaur from a database and are then tested in a visual-tracking scenario. Designing an object tracking system involves two sub-tasks: detecting moving objects in each frame and correctly associating detections with the same object over time. Visual attention is a skill performed by the brain whose function is to perceive salient visual features. Visual attention programs, designed automatically through an optimization paradigm, are applied to the detection-based tracking of objects in video from a moving camera. A system based on the acquisition and integration steps of the natural dorsal stream was engineered to emulate its selectivity and goal-driven behavior, which are useful for tracking objects. Tracking is a challenging problem because difficulties arise from abrupt object motion, changing appearance patterns of both the object and the scene, nonrigid structures, object-to-object and object-to-scene occlusions, and camera motion, models, and parameters. Tracking relies on the quality of the detection process, and automatically designing this stage could significantly improve tracking methods. Experimental results with three different kinds of robotic systems confirm the validity of the approach. Moreover, a comparison with regions with convolutional neural networks (R-CNN) illustrates the benefit of the approach.
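
The two sub-tasks named in the abstract, per-frame detection and association of detections over time, can be illustrated with a minimal sketch. This is not the authors' method: it assumes each frame has already been reduced to a saliency map (a small grid of numbers), whereas the paper obtains saliency from evolved visual attention programs. The function names `detect` and `track`, the `threshold` and `gate` parameters, and the toy frames are all hypothetical.

```python
from math import hypot

def detect(saliency, threshold=0.5):
    """Return (row, col) of the most salient cell, or None if nothing exceeds threshold."""
    best, best_pos = threshold, None
    for r, row in enumerate(saliency):
        for c, value in enumerate(row):
            if value > best:
                best, best_pos = value, (r, c)
    return best_pos

def track(saliency_frames, gate=2.0, threshold=0.5):
    """Associate per-frame detections with a single object over time.

    A detection is accepted only if it lies within `gate` cells of the
    last known position; otherwise the previous position is kept.
    """
    trajectory = []
    last = None
    for saliency in saliency_frames:
        pos = detect(saliency, threshold)
        close_enough = (
            pos is not None
            and (last is None or hypot(pos[0] - last[0], pos[1] - last[1]) <= gate)
        )
        if close_enough:
            last = pos
        trajectory.append(last)
    return trajectory

# Toy saliency maps (assumed inputs, not data from the paper).
frames = [
    [[0.1, 0.9, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1]],  # object at (0, 1)
    [[0.1, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.1]],  # object moves to (1, 1)
    [[0.1, 0.1, 0.1], [0.1, 0.2, 0.1], [0.1, 0.1, 0.1]],  # weak response, e.g. occlusion
]
print(track(frames))  # [(0, 1), (1, 1), (1, 1)]
```

In this sketch the tracker can do no better than its detector: when the saliency response drops in the third frame, the track simply holds its last position. That dependence is the reason the abstract gives for automating the design of the detection stage.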

Acknowledgements

This research was funded by CICESE through Project 634-128 – “Programación cerebral aplicada al estudio del pensamiento y la visión”. In addition, the authors acknowledge the valuable comments of the anonymous reviewers, the Editor of Multimedia Tools and Applications, and the International Editorial Board whose enthusiasm is gladly appreciated.

Author information

Authors and Affiliations

CICESE, Applied Physics Division, Carretera Tijuana-Ensenada No. 3918, Zona Playitas, Ensenada, B.C., México
Gustavo Olague, Paul Llamas & José L. Briseño
TecNM - Instituto Tecnológico de Tijuana, Calzada del Tecnológico S/N, Tomas Aquino, 22414, Tijuana, B.C., México
Daniel E. Hernández
TecNM - Instituto Tecnológico de Ensenada, Boulevard Tecnológico No. 150, Ex-ejido Chapultepec, 22780, Ensenada, B.C., México
Eddie Clemente

Corresponding author

Correspondence to Gustavo Olague.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Olague, G., Hernández, D.E., Llamas, P. et al. Brain programming as a new strategy to create visual routines for object tracking. Multimed Tools Appl 78, 5881–5918 (2019). https://doi.org/10.1007/s11042-018-6634-9

Received: 30 September 2017
Revised: 28 August 2018
Accepted: 31 August 2018
Published: 11 September 2018
Issue Date: March 2019
DOI: https://doi.org/10.1007/s11042-018-6634-9
