Abstract
Object tracking, and human tracking in particular, is a challenging research problem in computer vision. Although performance has improved in recent years, there is still room for improvement. In this paper, we introduce an approach to human detection and tracking that combines a Convolutional Neural Network (CNN) with the Hungarian Algorithm (HA). The CNN localizes multiple people in each frame of a video stream; we use Faster R-CNN, a deep CNN that achieved state-of-the-art performance in object detection. In the tracking stage, we solve the data association problem with HA: each detected person is assigned to a tracklet based on the data distribution in the video frame. Experimental results show that our system can handle videos captured in different scenarios in near real time.
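The data association step can be illustrated with a minimal sketch. The abstract does not give the cost function, so the sketch assumes a cost of 1 - IoU between each tracklet's last bounding box and each new detection, and a gating threshold `max_cost`; both are assumptions, not the authors' stated method. The names `iou`, `hungarian`, and `associate` are hypothetical. The assignment itself uses the standard O(n^3) potentials formulation of the Hungarian algorithm.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def hungarian(cost):
    """Minimum-cost assignment for an n x m matrix with n <= m.

    Returns a list of (row, column) pairs, one per row.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (m + 1)   # column potentials
    p = [0] * (m + 1)     # p[j] = row currently matched to column j (1-based)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:   # reached a free column: augment
                break
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0]


def associate(tracks, detections, max_cost=0.7):
    """Match tracklet boxes to detection boxes.

    Assumption: cost is 1 - IoU, and pairs with cost above max_cost are rejected.
    Returns (matches, unmatched_track_indices, unmatched_detection_indices).
    """
    if not tracks or not detections:
        return [], list(range(len(tracks))), list(range(len(detections)))
    cost = [[1.0 - iou(t, d) for d in detections] for t in tracks]
    if len(tracks) <= len(detections):
        pairs = hungarian(cost)
    else:
        # The solver needs rows <= columns, so solve the transposed problem.
        transposed = [list(col) for col in zip(*cost)]
        pairs = [(t, d) for d, t in hungarian(transposed)]
    matches = sorted((t, d) for t, d in pairs if cost[t][d] <= max_cost)
    matched_t = {t for t, _ in matches}
    matched_d = {d for _, d in matches}
    unmatched_t = [t for t in range(len(tracks)) if t not in matched_t]
    unmatched_d = [d for d in range(len(detections)) if d not in matched_d]
    return matches, unmatched_t, unmatched_d
```

In a tracker built this way, unmatched detections would typically start new tracklets and tracklets that stay unmatched for several frames would be dropped; that policy is likewise an assumption here rather than something the abstract specifies.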
Acknowledgements
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (NRF-2018R1A1A1A05022526), (NRF-2017R1A4A1015559) and (NRF-2015R1D1A3A01019642).
Cite this article
Nguyen, H.D., Na, I.S., Kim, S.H. et al. Multiple human tracking in drone image. Multimed Tools Appl 78, 4563–4577 (2019). https://doi.org/10.1007/s11042-018-6141-z
DOI: https://doi.org/10.1007/s11042-018-6141-z