Abstract
Multi-target tracking in complex scenes is challenging because target appearance frequently undergoes partial or significant variation. To address this problem, we propose a multi-target tracking method with hierarchical data association that uses main-parts and spatial-temporal feature models. In our tracking framework, target feature models and tracklets are initialized when new targets appear. The main-parts feature model represents targets with partial or no appearance variation. It is built by partitioning a target template into several parts and modeling the appearance variation densities of these parts. For a target with significant appearance variation, the tracker learns a global spatial-temporal feature model by integrating appearance features with histogram of optical flow features. During tracking, tracklet confidence is used to implement hierarchical data association: depending on the tracklet confidence value, either main-parts or global data association is performed, using the main-parts or the spatial-temporal feature model, respectively. The Hungarian algorithm is then used to obtain the optimal association pairs between target tracklets and detections. Finally, target feature models and tracklets are updated with the associated detections for subsequent tracking. Experiments on the CAVIAR, Parking Lot and MOT15 datasets verify the effectiveness and improved performance of our multi-target tracking method.
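The two-level association described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that affinities between tracklets and detections are already available as two matrices (`parts_aff` from a main-parts model, `global_aff` from a spatial-temporal model), that high-confidence tracklets are handled by main-parts association first and the remaining tracklets by global association, and that the thresholds `conf_thresh` and `min_affinity` take the illustrative values shown. All function and parameter names are hypothetical.

```python
def hungarian(cost):
    """Hungarian (Kuhn-Munkres) min-cost assignment, O(n^2 m).

    `cost` is a list of rows with len(cost) <= len(cost[0]).
    Returns (row, column) pairs, one per row.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (m + 1)   # row / column potentials
    p, way = [0] * (m + 1), [0] * (m + 1)     # p[j]: row matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (m + 1), [False] * (m + 1)
        while True:                           # grow an augmenting path
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                             # flip matches along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j]]


def associate(affinity, tracklets, detections, min_affinity):
    """Match tracklet ids to detection ids, maximising total affinity."""
    if not tracklets or not detections:
        return {}
    # The solver needs rows <= columns, so swap roles when necessary.
    swap = len(tracklets) > len(detections)
    rows, cols = (detections, tracklets) if swap else (tracklets, detections)
    cost = [[1.0 - (affinity[c][r] if swap else affinity[r][c]) for c in cols]
            for r in rows]
    pairs = {}
    for i, j in hungarian(cost):
        t, d = (cols[j], rows[i]) if swap else (rows[i], cols[j])
        if affinity[t][d] >= min_affinity:    # reject weak matches
            pairs[t] = d
    return pairs


def hierarchical_association(confidence, parts_aff, global_aff,
                             conf_thresh=0.5, min_affinity=0.3):
    """Two-stage association gated by tracklet confidence (assumed scheme)."""
    n_det = len(parts_aff[0]) if parts_aff else 0
    high = [t for t, c in enumerate(confidence) if c >= conf_thresh]
    low = [t for t, c in enumerate(confidence) if c < conf_thresh]
    # Stage 1: high-confidence tracklets, main-parts affinities.
    matches = associate(parts_aff, high, list(range(n_det)), min_affinity)
    taken = set(matches.values())
    free = [d for d in range(n_det) if d not in taken]
    # Stage 2: remaining tracklets and detections, global affinities.
    matches.update(associate(global_aff, low, free, min_affinity))
    return matches


confidence = [0.9, 0.2, 0.8]
parts_aff = [[0.9, 0.1, 0.2], [0.5, 0.5, 0.5], [0.2, 0.1, 0.8]]
global_aff = [[0.0, 0.0, 0.0], [0.1, 0.7, 0.3], [0.0, 0.0, 0.0]]
print(hierarchical_association(confidence, parts_aff, global_aff))
```

In this toy input, tracklets 0 and 2 are matched in the first stage, and tracklet 1 competes only for the detection left over. Converting affinity to cost as `1 - affinity` is one common convention; the source does not specify how its affinities are turned into assignment costs.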









Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61673244 and 61703240).
Cite this article
Liu, H., Chang, F. & Liu, C. Multi-target tracking with hierarchical data association using main-parts and spatial-temporal feature models. Multimed Tools Appl 78, 29161–29181 (2019). https://doi.org/10.1007/s11042-018-6667-0
DOI: https://doi.org/10.1007/s11042-018-6667-0