Abstract
Vehicle tracking in UAV videos remains under-explored with deep learning methods because well-labeled datasets are scarce. The main challenge is that the UAV view covers much wider and more changeable landscapes, which hinders the labeling task. In this paper, we propose a meta transfer learning method for adaptive vehicle tracking in UAV videos (MTAVT), which transfers the features common across landscapes so that it avoids over-fitting on the limited amount of data. MTAVT consists of two critical components: a meta learner and a transfer learner. Specifically, the meta learner adaptively learns models that extract the features shared between ground and drone views. The transfer learner learns the domain-shifted features from ground-view to drone-view datasets by optimizing the ground-view models. We further incorporate an exemplar-memory curriculum into meta learning by leveraging the memorized models, which serve as training guidance for sequential sampling. This curriculum forces the meta learner to adapt to new sequences in the drone-view datasets without losing previously learned knowledge. Meanwhile, we simplify and stabilize the higher-order gradient training criteria for meta learning by exploring curriculum learning in multiple stages with various domains. We conduct extensive experiments and ablation studies on four public benchmarks and an evaluation dataset from YouTube (to be released soon). The experiments demonstrate that MTAVT outperforms state-of-the-art methods in terms of accuracy, robustness, and versatility.
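The inner/outer structure that a meta learner relies on can be illustrated with a small sketch. This is not the MTAVT implementation, which the abstract does not specify. It assumes a first-order approximation of model-agnostic meta-learning (dropping the higher-order gradient terms), and a toy linear regression stands in for the tracker. The names `adapt`, `fomaml_step`, and `make_task` and all hyperparameters are hypothetical.

```python
import numpy as np

def mse(w, x, y):
    """Mean squared error of the linear model y_hat = x @ w."""
    return float(np.mean((x @ w - y) ** 2))

def mse_grad(w, x, y):
    """Gradient of the mean squared error with respect to w."""
    return 2.0 * x.T @ (x @ w - y) / len(y)

def adapt(w, x, y, inner_lr=0.1, steps=1):
    """Inner loop: a few gradient steps on one task's support data."""
    for _ in range(steps):
        w = w - inner_lr * mse_grad(w, x, y)
    return w

def fomaml_step(w, tasks, inner_lr=0.1, outer_lr=0.05):
    """Outer loop (first-order): average the query-set gradients,
    each evaluated at the task-adapted weights, and step the shared weights."""
    meta_grad = np.zeros_like(w)
    for xs, ys, xq, yq in tasks:
        w_task = adapt(w, xs, ys, inner_lr)
        meta_grad += mse_grad(w_task, xq, yq)
    return w - outer_lr * meta_grad / len(tasks)

def make_task(rng, n=20, dim=3):
    """Hypothetical toy task: a random linear map, split into support and query sets."""
    true_w = rng.normal(size=dim)
    x = rng.normal(size=(2 * n, dim))
    y = x @ true_w
    return x[:n], y[:n], x[n:], y[n:]

rng = np.random.default_rng(0)
w = np.zeros(3)  # shared initialization; a pretrained model could be used instead
for _ in range(200):
    w = fomaml_step(w, [make_task(rng) for _ in range(4)])
```

In this toy setting, calling `adapt(w, xs, ys)` on the support data of an unseen task plays the role of quickly adapting the shared model to a new sequence.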
Notes
1. The subscripts g/u denote the parameters from the ground/drone view.
References
Arandjelović, O.: Automatic vehicle tracking and recognition from aerial image sequences. In: AVSS, pp. 1–6 (2015)
Bengio, Y., Louradour, J., Collobert, R., Weston, J.: Curriculum learning. In: ICML, pp. 41–48 (2009)
Castro, F.M., Marín-Jiménez, M.J., Guil, N., Schmid, C., Alahari, K.: End-to-end incremental learning. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11216, pp. 241–257. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01258-8_15
Chen, B., Wang, D., Li, P., Wang, S., Lu, H.: Real-time ‘Actor-Critic’ tracking. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 328–345. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2_20
Chen, Z., Zhuang, J., Liang, X., Lin, L.: Blending-target domain adaptation by adversarial meta-adaptation networks. In: CVPR, pp. 2248–2257 (2019)
Chen, Z., Fu, Y., Wang, Y.X., Ma, L., Liu, W., Hebert, M.: Image deformation meta-networks for one-shot learning. In: CVPR, pp. 8680–8689 (2019)
Du, D., et al.: The unmanned aerial vehicle benchmark: object detection and tracking. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11214, pp. 375–391. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01249-6_23
Finn, C., Abbeel, P., Levine, S.: Model-agnostic meta-learning for fast adaptation of deep networks. In: ICML, pp. 1126–1135 (2017)
Li, B., et al.: SiamRPN++: evolution of siamese visual tracking with very deep networks. In: CVPR, pp. 4282–4291 (2019)
Li, F., Tian, C., Zuo, W., Zhang, L., Yang, M.H.: Learning spatial-temporal regularized correlation filters for visual tracking. In: CVPR, pp. 4904–4913 (2018)
Li, S., Yeung, D.Y.: Visual object tracking for unmanned aerial vehicles: a benchmark and new motion models. In: AAAI, pp. 4140–4146 (2017)
Mueller, M., Smith, N., Ghanem, B.: A benchmark and simulator for UAV tracking. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 445–461. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_27
Nam, H., Han, B.: Learning multi-domain convolutional neural networks for visual tracking. In: CVPR, pp. 4293–4302 (2016)
Park, E., Berg, A.C.: Meta-tracker: fast and robust online adaptation for visual object trackers. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11207, pp. 587–604. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01219-9_35
Ran, N., Kong, L., Wang, Y., Liu, Q.: A robust multi-athlete tracking algorithm by exploiting discriminant features and long-term dependencies. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, W.H., Vrochidis, S. (eds.) MMM 2019. LNCS, vol. 11295, pp. 411–423. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-05710-7_34
Ravi, S., Larochelle, H.: Optimization as a model for few-shot learning. In: ICLR (2017)
Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. IJCV 115(3), 211–252 (2015)
Song, Y., et al.: Vital: visual tracking via adversarial learning. In: CVPR, pp. 8990–8999 (2018)
Sun, C., Wang, D., Lu, H., Yang, M.H.: Learning spatial-aware regressions for visual tracking. In: CVPR, pp. 8962–8970 (2018)
Sun, Q., Liu, Y., Chua, T.S., Schiele, B.: Meta-transfer learning for few-shot learning. In: CVPR, pp. 403–412 (2019)
Sung, F., Yang, Y., Zhang, L., Xiang, T., Torr, P.H.S., Hospedales, T.M.: Learning to compare: relation network for few-shot learning. In: CVPR, pp. 1199–1208 (2018)
Tang, M., Yu, B., Zhang, F., Wang, J.: High-speed tracking with multi-kernel correlation filters. In: CVPR, pp. 4874–4883 (2018)
Valmadre, J., Bertinetto, L., Henriques, J., Vedaldi, A., Torr, P.H.S.: End-to-end representation learning for correlation filter based tracking. In: CVPR, pp. 2805–2813 (2017)
Wang, N., Song, Y., Ma, C., Zhou, W., Liu, W., Li, H.: Unsupervised deep tracking. In: CVPR, pp. 1308–1317 (2019)
Wang, N., Zhou, W., Tian, Q., Hong, R., Wang, M., Li, H.: Multi-cue correlation filters for robust visual tracking. In: CVPR, pp. 4844–4853 (2018)
Weinshall, D., Cohen, G., Amir, D.: Curriculum learning by transfer learning: theory and experiments with deep networks. arXiv preprint arXiv:1802.03796 (2018)
Wu, P., Huang, D., Wang, Y.: REVT: robust and efficient visual tracking by region-convolutional regression network. In: Schoeffmann, K., et al. (eds.) MMM 2018. LNCS, vol. 10704, pp. 440–452. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-73603-7_36
Xiao, T., Li, S., Wang, B., Lin, L., Wang, X.: Joint detection and identification feature learning for person search. In: CVPR, pp. 3415–3424 (2017)
Yun, S., et al.: Action-decision networks for visual tracking with deep reinforcement learning. In: CVPR, pp. 2711–2720 (2017)
Zhu, P., Wen, L., Bian, X., Ling, H., Hu, Q.: Vision meets drones: a challenge. arXiv preprint arXiv:1804.07437 (2018)
Zhu, Z., Wang, Q., Li, B., Wu, W., Yan, J., Hu, W.: Distractor-aware siamese networks for visual object tracking. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11213, pp. 103–119. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01240-3_7
Acknowledgement
This research is supported in part by the National Key R&D Program of China (No. 2017YFB1002602 and 2017YFF0106407), the National Natural Science Foundation of China (No. 61672077 and 61532002), the Applied Basic Research Program of Qingdao (No. 161013xx), the National Science Foundation of USA (No. IIS-0949467, IIS-1047715, IIS-1715985, and IIS-1049448), Capital Health Research and Development of Special (No. 2016-1-4011), the Fundamental Research Funds for the Central Universities, and the Beijing Natural Science Foundation-Haidian Primitive Innovation Joint Fund (No. L182016).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Song, W. et al. (2020). Meta Transfer Learning for Adaptive Vehicle Tracking in UAV Videos. In: Ro, Y., et al. MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol 11961. Springer, Cham. https://doi.org/10.1007/978-3-030-37731-1_62
DOI: https://doi.org/10.1007/978-3-030-37731-1_62
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37730-4
Online ISBN: 978-3-030-37731-1
eBook Packages: Computer Science (R0)