Abstract
This work addresses the task of multi-person tracking in crowded street scenes, where long-term occlusions pose a major challenge. One popular way to address this challenge is to re-identify people before and after occlusions using Convolutional Neural Networks (CNNs). To achieve good performance, CNNs require a large amount of training data, which is not available for multi-person tracking scenarios. Instead of annotating large training sequences, we introduce a customized multi-person tracker that automatically adapts its person re-identification CNNs to capture the discriminative appearance patterns in a test sequence. We show that a few high-quality training examples that are automatically mined from the test sequence can be used to fine-tune pre-trained CNNs, thereby teaching them to recognize the uniqueness of people’s appearance in the test sequence. To that end, we introduce a hierarchical correlation clustering (HCC) framework, in which we utilize an existing robust correlation clustering tracking model, but with different graph structures to generate local, reliable tracklets as well as globally associated tracks. We deploy intuitive physical constraints on the local tracklets to generate the high-quality training examples for customizing the person re-identification CNNs. Our customized multi-person tracker achieves state-of-the-art performance on the challenging MOT16 tracking benchmark.
L. Ma and S. Tang contributed equally.
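The mining step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which physical constraints are used, so the two below are assumptions chosen for illustration: detections inside one reliable tracklet show the same person, and tracklets that share a frame show different people. The `Detection` type, its fields, and `mine_pairs` are hypothetical names, not part of the authors' code.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Detection:
    frame: int    # frame index in the test sequence
    box_id: int   # hypothetical identifier of the detection box


def mine_pairs(tracklets):
    """Return (positives, negatives) as lists of Detection pairs.

    tracklets: dict mapping a tracklet id to a list of Detection objects.
    """
    positives, negatives = [], []

    # Assumed constraint 1: detections within one reliable tracklet
    # show the same person.
    for dets in tracklets.values():
        positives.extend(combinations(dets, 2))

    # Assumed constraint 2: two tracklets that share a frame must be
    # different people, since one person cannot be in two places at once.
    for (_, dets_a), (_, dets_b) in combinations(tracklets.items(), 2):
        frames_a = {d.frame for d in dets_a}
        frames_b = {d.frame for d in dets_b}
        if frames_a & frames_b:
            negatives.extend((a, b) for a in dets_a for b in dets_b)

    return positives, negatives


# Toy input: tracklets 0 and 1 overlap in frame 1, tracklet 2 is isolated.
tracklets = {
    0: [Detection(0, 10), Detection(1, 11)],
    1: [Detection(1, 20), Detection(2, 21)],
    2: [Detection(5, 30)],
}
positives, negatives = mine_pairs(tracklets)
print(len(positives), len(negatives))
```

Pairs mined this way could serve as same-identity and different-identity examples when fine-tuning a pre-trained re-identification network on the test sequence. How the network consumes them (for example, a Siamese verification loss) is left open here.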
Acknowledgments
This research was supported in part by Toyota Motors Europe and the German Research Foundation (DFG CRC 1233).
Disclosure: MJB has received research gift funds from Intel, Nvidia, Adobe, Facebook, and Amazon. While MJB is a part-time employee of Amazon, his research was performed solely at, and funded solely by, MPI.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ma, L., Tang, S., Black, M.J., Van Gool, L. (2019). Customized Multi-person Tracker. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol 11362. Springer, Cham. https://doi.org/10.1007/978-3-030-20890-5_39
DOI: https://doi.org/10.1007/978-3-030-20890-5_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20889-9
Online ISBN: 978-3-030-20890-5
eBook Packages: Computer Science, Computer Science (R0)