Meta Transfer Learning for Adaptive Vehicle Tracking in UAV Videos

Song, Wenfeng; Li, Shuai; Guo, Yuting; Li, Shaoqi; Hao, Aimin; Qin, Hong; Zhao, Qinping

doi:10.1007/978-3-030-37731-1_62

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11961)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

Vehicle tracking in UAV videos remains under-explored by deep learning methods because well-labeled datasets are scarce. The main difficulty is that the UAV view covers much wider and more changeable landscapes, which hinders labeling. In this paper, we propose a meta transfer learning method for adaptive vehicle tracking in UAV videos (MTAVT), which transfers common features across landscapes and thereby avoids over-fitting on the limited-scale dataset. MTAVT consists of two critical components: a meta learner and a transfer learner. Specifically, the meta learner adaptively learns models that extract the features shared between ground and drone views. The transfer learner learns the domain-shifted features from ground-view to drone-view datasets by optimizing the ground-view models. We further incorporate an exemplar-memory curriculum into meta learning by leveraging the memorized models; the curriculum serves as training guidance for sequential sampling. Hence, this curriculum drives the meta learner to adapt to new sequences in the drone-view datasets without losing previously learned knowledge. Meanwhile, we simplify and stabilize the higher-order gradient training criteria for meta learning by exploring curriculum learning in multiple stages with various domains. We conduct extensive experiments and ablation studies on four public benchmarks and an evaluation dataset from YouTube (to be released soon). The experiments demonstrate that MTAVT is superior to state-of-the-art methods in terms of accuracy, robustness, and versatility.
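The abstract does not state the update rules, so the following is only a generic illustration of the meta-learning idea it refers to: learning an initialisation that adapts quickly to a new task. It is not the MTAVT algorithm. As an assumption for illustration, it swaps in a Reptile-style first-order update, which avoids higher-order gradients altogether, and uses a hypothetical toy problem (one-dimensional linear regression with different slopes) in place of tracking sequences. All function names and hyperparameters are made up for this sketch.

```python
import random

def loss_grad(w, data):
    """Gradient of the mean squared error for the model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)

def adapt(w, data, lr=0.1, steps=5):
    """Inner loop: a few gradient steps on one task, starting from w."""
    for _ in range(steps):
        w -= lr * loss_grad(w, data)
    return w

def make_task(slope, rng, n=10):
    """Toy task (assumption): noiseless samples of y = slope * x, x in [-1, 1]."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, slope * x) for x in xs]

def meta_train(slopes, meta_lr=0.5, epochs=200, seed=0):
    """Outer loop (Reptile-style): move the shared initialisation
    part of the way toward each task-adapted weight."""
    rng = random.Random(seed)
    w_meta = 0.0
    for _ in range(epochs):
        for slope in slopes:
            w_task = adapt(w_meta, make_task(slope, rng))
            w_meta += meta_lr * (w_task - w_meta)
    return w_meta

if __name__ == "__main__":
    # Hypothetical related tasks: slopes clustered around 2.0.
    w0 = meta_train([1.5, 2.0, 2.5])
    print(f"meta-learned initialisation: {w0:.3f}")
```

Starting from the meta-learned weight, the same five inner steps land closer to a new task's slope than they do from an uninformed start at zero, which is the sense in which a meta learner "adapts" quickly in this toy setting.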

Notes

1. The subscripts g and u denote parameters from the ground view and the drone view, respectively.

Acknowledgement

This research is supported in part by the National Key R&D Program of China (No. 2017YFB1002602 and No. 2017YFF0106407), the National Natural Science Foundation of China (No. 61672077 and 61532002), the Applied Basic Research Program of Qingdao (No. 161013xx), the National Science Foundation of USA (No. IIS-0949467, IIS-1047715, IIS-1715985, and IIS-1049448), the Capital Health Research and Development of Special (No. 2016-1-4011), the Fundamental Research Funds for the Central Universities, and the Beijing Natural Science Foundation-Haidian Primitive Innovation Joint Fund (No. L182016).

Author information

Authors and Affiliations

The State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
Wenfeng Song, Shuai Li, Yuting Guo, Shaoqi Li, Aimin Hao & Qinping Zhao
Qingdao Research Institute, Beihang University, Qingdao, China
Shuai Li & Yuting Guo
Department of Computer Science, State University of New York at Stony Brook, Stony Brook, USA
Hong Qin

Corresponding authors

Correspondence to Shuai Li or Hong Qin.

Editor information

Editors and Affiliations

Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Yong Man Ro
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Junmo Kim
National Cheng Kung University, Tainan City, Taiwan
Wei-Ta Chu
Tsinghua University, Beijing, China
Peng Cui
Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Jung-Woo Choi
National Tsing Hua University, Hsinchu, Taiwan
Min-Chun Hu
Ghent University, Ghent, Belgium
Wesley De Neve

About this paper

Cite this paper

Song, W. et al. (2020). Meta Transfer Learning for Adaptive Vehicle Tracking in UAV Videos. In: Ro, Y., et al. MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol 11961. Springer, Cham. https://doi.org/10.1007/978-3-030-37731-1_62

DOI: https://doi.org/10.1007/978-3-030-37731-1_62
Published: 24 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37730-4
Online ISBN: 978-3-030-37731-1
eBook Packages: Computer Science, Computer Science (R0)
