Abstract
The Vision Meets Drone (VisDrone2020) Single Object Tracking challenge is the third annual UAV tracking evaluation activity organized by the VisDrone team, in conjunction with the European Conference on Computer Vision (ECCV 2020). This paper presents and discusses in detail the results of the 13 algorithms participating in the VisDrone-SOT2020 Challenge. By using an ensemble of different trackers trained on several large-scale datasets, the top performer in VisDrone-SOT2020 achieves better results than its counterparts in VisDrone-SOT2018 and VisDrone-SOT2019. The challenge results, the collected videos, and the evaluation toolkit are made available at http://aiskyeye.com/. By holding the VisDrone-SOT2020 challenge, we hope to provide the community with a dedicated platform for developing and evaluating drone-based tracking approaches.
References
Ahn, N., Kang, B., Sohn, K.A.: Efficient deep neural network for photo-realistic image super-resolution. arXiv (2019)
Kristan, M., et al.: The sixth visual object tracking VOT2018 challenge results. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11129, pp. 3–53. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11009-3_1
Bertinetto, L., Valmadre, J., Henriques, J.F., Vedaldi, A., Torr, P.H.S.: Fully-convolutional Siamese networks for object tracking. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9914, pp. 850–865. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-48881-3_56
Bewley, A., Ge, Z., Ott, L., Ramos, F., Upcroft, B.: Simple online and realtime tracking. In: ICIP (2016)
Bhat, G., Danelljan, M., Gool, L.V., Timofte, R.: Learning discriminative model prediction for tracking. In: ICCV (2019)
Bolme, D.S., Beveridge, J.R., Draper, B.A., Lui, Y.M.: Visual object tracking using adaptive correlation filters. In: CVPR (2010)
Danelljan, M., Bhat, G., Khan, F.S., Felsberg, M.: ECO: efficient convolution operators for tracking. In: CVPR (2017)
Danelljan, M., Bhat, G., Khan, F.S., Felsberg, M.: ATOM: accurate tracking by overlap maximization. In: CVPR (2019)
Danelljan, M., Gool, L.V., Timofte, R.: Probabilistic regression for visual tracking. In: CVPR (2020)
Danelljan, M., Häger, G., Khan, F., Felsberg, M.: Accurate scale estimation for robust visual tracking. In: BMVC (2014)
Danelljan, M., Hager, G., Shahbaz Khan, F., Felsberg, M.: Learning spatially regularized correlation filters for visual tracking. In: ICCV (2015)
Danelljan, M., Robinson, A., Shahbaz Khan, F., Felsberg, M.: Beyond correlation filters: learning continuous convolution operators for visual tracking. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp. 472–488. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46454-1_29
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR (2009)
Du, D., et al.: The unmanned aerial vehicle benchmark: object detection and tracking. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11214, pp. 375–391. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01249-6_23
Du, D., Wen, L., Qi, H., Huang, Q., Tian, Q., Lyu, S.: Iterative graph seeking for object tracking. TIP 27(4), 1809–1821 (2018)
Du, D., et al.: VisDrone-SOT2019: the vision meets drone single object tracking challenge results. In: ICCVW (2019)
Fan, H., et al.: LaSOT: a high-quality benchmark for large-scale single object tracking. In: CVPR (2019)
Fan, H., Ling, H.: Parallel tracking and verifying: a framework for real-time and high accuracy visual tracking. In: ICCV (2017)
Fan, H., Ling, H.: SANet: structure-aware network for visual tracking. In: CVPRW (2017)
Fan, H., Ling, H.: Siamese cascaded region proposal networks for real-time visual tracking. In: CVPR (2019)
Galoogahi, H.K., Fagg, A., Huang, C., Ramanan, D., Lucey, S.: Need for speed: a benchmark for higher frame rate object tracking. In: ICCV (2017)
Galoogahi, H.K., Fagg, A., Lucey, S.: Learning background-aware correlation filters for visual tracking. In: ICCV (2017)
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: CVPR (2014)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
Henriques, J.F., Caseiro, R., Martins, P., Batista, J.: High-speed tracking with kernelized correlation filters. TPAMI 37(3), 583–596 (2015)
Huang, L., Zhao, X., Huang, K.: GOT-10k: a large high-diversity benchmark for generic object tracking in the wild. TPAMI (2019)
Jiang, B., Luo, R., Mao, J., Xiao, T., Jiang, Y.: Acquisition of localization confidence for accurate object detection. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11218, pp. 816–832. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01264-9_48
Jung, I., Son, J., Baek, M., Han, B.: Real-time MDNet. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11208, pp. 89–104. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01225-0_6
Kristan, M., et al.: A novel performance evaluation methodology for single-target trackers. TPAMI 38(11), 2137–2155 (2016)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: NIPS (2012)
Li, B., Wu, W., Wang, Q., Zhang, F., Xing, J., Yan, J.: SiamRPN++: evolution of Siamese visual tracking with very deep networks. In: CVPR (2019)
Li, B., Yan, J., Wu, W., Zhu, Z., Hu, X.: High performance visual tracking with Siamese region proposal network. In: CVPR (2018)
Li, F., Tian, C., Zuo, W., Zhang, L., Yang, M.H.: Learning spatial-temporal regularized correlation filters for visual tracking. In: CVPR (2018)
Li, S., Yeung, D.Y.: Visual object tracking for unmanned aerial vehicles: a benchmark and new motion models. In: AAAI (2017)
Li, Y., Zhu, J.: A scale adaptive kernel correlation filter tracker with feature integration. In: Agapito, L., Bronstein, M.M., Rother, C. (eds.) ECCV 2014. LNCS, vol. 8926, pp. 254–265. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-16181-5_18
Liang, P., Blasch, E., Ling, H.: Encoding color information for visual tracking: algorithms and benchmark. TIP 24(12), 5630–5644 (2015)
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Liu, T., Wang, G., Yang, Q.: Real-time part-based visual tracking via adaptive correlation filters. In: CVPR (2015)
Lukezic, A., et al.: CDTB: a color and depth visual object tracking dataset and benchmark. In: ICCV (2019)
Lv, F., Lu, F., Wu, J., Lim, C.: MBLLEN: low-light image/video enhancement using CNNs. In: BMVC (2018)
Ma, C., Huang, J.B., Yang, X., Yang, M.H.: Hierarchical convolutional features for visual tracking. In: ICCV (2015)
Marvasti-Zadeh, S.M., Khaghani, J., Ghanei-Yakhdan, H., Kasaei, S., Cheng, L.: COMET: context-aware IoU-guided network for small object tracking. arXiv (2020)
Mueller, M., Smith, N., Ghanem, B.: A benchmark and simulator for UAV tracking. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 445–461. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_27
Mueller, M., Smith, N., Ghanem, B.: Context-aware correlation filter tracking. In: CVPR (2017)
Müller, M., Bibi, A., Giancola, S., Alsubaihi, S., Ghanem, B.: TrackingNet: a large-scale dataset and benchmark for object tracking in the wild. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11205, pp. 310–327. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01246-5_19
Nam, H., Han, B.: Learning multi-domain convolutional neural networks for visual tracking. In: CVPR (2016)
Real, E., Shlens, J., Mazzocchi, S., Pan, X., Vanhoucke, V.: YouTube-BoundingBoxes: a large high-precision human-annotated data set for object detection in video. In: CVPR, pp. 7464–7473 (2017)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS (2015)
Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. IJCV 115(3), 211–252 (2015)
Smeulders, A.W., Chu, D.M., Cucchiara, R., Calderara, S., Dehghan, A., Shah, M.: Visual tracking: an experimental survey. TPAMI 36(7), 1442–1468 (2014)
Song, Y., et al.: VITAL: visual tracking via adversarial learning. In: CVPR (2018)
Tao, R., Gavves, E., Smeulders, A.W.: Siamese instance search for tracking. In: CVPR (2016)
Valmadre, J., et al.: Long-term tracking in the wild: a benchmark. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11207, pp. 692–707. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01219-9_41
Voigtlaender, P., Luiten, J., Torr, P.H., Leibe, B.: Siam R-CNN: visual tracking by re-detection. In: CVPR (2020)
Wang, G., Luo, C., Xiong, Z., Zeng, W.: SPM-Tracker: series-parallel matching for real-time visual object tracking. In: CVPR (2019)
Wang, Q., Zhang, L., Bertinetto, L., Hu, W., Torr, P.H.: Fast online object tracking and segmentation: a unifying approach. In: CVPR (2019)
Wen, L., et al.: VisDrone-SOT2018: the vision meets drone single-object tracking challenge results. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 469–495. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_28
Wu, Y., Lim, J., Yang, M.H.: Online object tracking: a benchmark. In: CVPR (2013)
Wu, Y., Lim, J., Yang, M.H.: Object tracking benchmark. TPAMI 37(9), 1834–1848 (2015)
Yan, B., Wang, D., Lu, H., Yang, X.: Alpha-Refine: boosting tracking performance by precise bounding box estimation. arXiv (2020)
Yang, G., Ramanan, D.: Volumetric correspondence networks for optical flow. In: NeurIPS (2019)
Ying, Z., Li, G., Ren, Y., Wang, R., Wang, W.: A new low-light image enhancement algorithm using camera response model. In: ICCVW (2017)
Yuan, D., Fan, N., He, Z.: Learning target-focusing convolutional regression model for visual object tracking. Knowl.-Based Syst. (2020)
Zhang, Y., Zhang, J., Guo, X.: Kindling the darkness: a practical low-light image enhancer. In: ACM MM (2019)
Zhou, J., Wang, P., Sun, H.: Discriminative and robust online learning for Siamese visual tracking. In: AAAI (2020)
Zhou, W., Wen, L., Zhang, L., Du, D., Luo, T., Wu, Y.: SiamMan: Siamese motion-aware network for visual tracking. CoRR abs/1912.05515 (2019)
Zhu, P., Wen, L., Du, D., Bian, X., Hu, Q., Ling, H.: Vision meets drones: past, present and future. CoRR abs/2001.06303 (2020)
Zhu, Z., Wang, Q., Li, B., Wu, W., Yan, J., Hu, W.: Distractor-aware Siamese networks for visual object tracking. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11213, pp. 103–119. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01240-3_7
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grant 61876127 and Grant 61732011, and in part by the Natural Science Foundation of Tianjin under Grant 17JCZDJC30800.
A Descriptions of Submitted Trackers
In this appendix, we summarize the 13 trackers submitted to the VisDrone-SOT2020 Challenge, ordered by the submission time of their final results.
A.1 Strategy and Motion Integrated Long-Term Experts-Version 2 (SMILEv2)
Yuxuan Li, Zhongjian Huang and Biao Wang
liyuxuan_xidian@126.com, huangzj@stu.xidian.edu.cn, biaowang@webank.com
SMILEv2 combines three kinds of base trackers within our IPIU-tracking framework. In this new framework, we can select different trackers in different situations in a semi-automatic way. As shown in Fig. 9, the framework has three parts: a prediction module, a tracking module, and a fix module. For the prediction module, we introduce a Kalman filter and the optical flow method of VCN [61] to capture object motion information and camera motion information, respectively. For the tracking module, we use three trackers: DiMP [5], SiamMask [56], and SORT [4]. For the fix module, we first obtain the outputs of the prediction and tracking modules and then determine the final result.
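The paper does not release the fix module's decision rule; the following is a minimal sketch of one plausible realization, in which the tracker output that best agrees with the motion prediction wins. All names and the agreement threshold are our assumptions, not the authors' IPIU-tracking API.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two [x, y, w, h] boxes."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def fix_module(predicted_box, tracker_boxes, min_agreement=0.3):
    """Pick the tracker output that best agrees with the motion prediction;
    fall back to the prediction itself when every tracker drifts away."""
    scores = [iou(predicted_box, b) for b in tracker_boxes]
    best = int(np.argmax(scores))
    if scores[best] < min_agreement:
        return predicted_box  # all trackers disagree with the motion cue
    return tracker_boxes[best]
```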
A.2 Long-Term Tracking with Night-Enhancement and Motion Integrated (LTNMI)
Yuting Yang, Yanjie Gao, Ruiyan Ma and Xin Hou
{ytyang_1,yjgao}@stu.xidian.edu.cn, 3028408083@qq.com, xinhou@webank.com
LTNMI is a combination of ATOM [8], SiamRPN++ [31], Siam R-CNN [54], and DiMP [5]. We combine ATOM and SiamRPN++ to obtain a better fused result; the method then sets a reliability lower bound for each of the two trackers under different confidence levels, which makes the system more reliable, since different features play different roles in tracking depending on their reliability. In addition, we improve prediction in blurred scenes by matching features with the SIFT algorithm. By estimating motion, the regression boxes can keep tracking the target under occlusion. For dark or low-resolution scenes, we apply threshold judgment and image brightness enhancement, using the MBLLEN [40] algorithm for low-light enhancement, and then run DiMP on the enhanced sequences. Finally, we use Siam R-CNN to recover lost frames: when the overlap area between the fused result and the result generated by Siam R-CNN is nearly 95%, we conclude that the Siam R-CNN result is better because of its more accurate bounding box.
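A hedged sketch of that final acceptance rule, reusing the iou helper from the SMILEv2 sketch above: a Siam R-CNN re-detection replaces the fused ATOM/SiamRPN++ result only when the two boxes overlap almost completely. The 0.95 threshold comes from the text; the function name is illustrative.

```python
def choose_result(fused_box, siamrcnn_box, overlap_thresh=0.95):
    """Prefer the Siam R-CNN box when it nearly coincides with the fusion,
    since its bounding box is assumed more accurate; otherwise keep fusion."""
    if siamrcnn_box is not None and iou(fused_box, siamrcnn_box) >= overlap_thresh:
        return siamrcnn_box
    return fused_box
```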
A.3 Ensemble of Classification and Matching Models with Alpha-Refine for UAV Tracking (ECMMAR)
Shuhao Chen, Zezhou Wang, Simiao Lai, Dong Wang and Huchuan Lu
{shuhaochn,zzwang}@mail.dlut.edu.cn, laisimiao1@gmail.com,
{wdice,lhchuan}@dlut.edu.cn
The ECMMAR tracker is built upon DiMP [5] and SiamRPN++ [31] with an online update module [65]. DiMP performs well in distinguishing distractors, while SiamRPN++ with a re-detection module performs well in recovering the target after it disappears due to full occlusion or fast viewpoint changes. The main modifications are: 1) an interactive mechanism to handle long-term tracking and improve robustness; 2) multi-scale search regions to help re-detect the target after full occlusion or fast viewpoint changes; 3) a refinement module [60] to refine the localized bounding box; 4) a low-light image enhancement method [62] to deal with low-light scenes; 5) fine-tuning of the SuperDiMP and Alpha-Refine pre-trained models on the VisDrone2020 dataset; 6) motion compensation when the camera viewing angle changes greatly; 7) inertial motion when both tracker results are unreliable, as sketched below.
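A minimal sketch (not the authors' code) of modifications 6) and 7): when both tracker outputs are unreliable, the last reliable box is propagated by its recent velocity, optionally shifted by an estimated global camera motion first. All names here are illustrative.

```python
def fallback_box(history, camera_shift=(0.0, 0.0)):
    """history: list of [x, y, w, h] boxes from reliable frames, newest last."""
    x, y, w, h = history[-1]
    if len(history) >= 2:
        vx = x - history[-2][0]  # per-frame target velocity in x
        vy = y - history[-2][1]  # per-frame target velocity in y
    else:
        vx, vy = 0.0, 0.0
    # camera-motion compensation, then inertial extrapolation of the target
    return [x + vx + camera_shift[0], y + vy + camera_shift[1], w, h]
```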
A.4 UAV Tracking with Extra Proposals Based on Corrected Velocity Prediction (CVP-superdimp)
Zitong Yi and Yanyun Zhao
{zitong.yi,zyy}@mail.dlut.edu.cn
CVP-superdimp is a robust tracking strategy for UAV tracking, especially for the challenging problems of severe camera motion and long-term full occlusion. The base tracker follows [5, 9] and contains two modules: an object classification module based on DiMP and a bounding box regression module based on PrDiMP. Our tracking strategy adds a velocity prediction module covering both short-term and long-term motion, which provides additional high-quality proposals for the tracker to search in the next frame.
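A minimal sketch of the velocity-prediction idea under our own assumptions: extrapolate short-term and long-term average velocities of the target center to generate extra search proposals for the next frame. The window sizes and function name are illustrative, not the authors' settings.

```python
import numpy as np

def velocity_proposals(centers, short=3, long=15):
    """centers: (N, 2) array of past target centers, newest last.
    Returns candidate centers: stay-put plus one-step velocity predictions."""
    c = np.asarray(centers, dtype=float)
    proposals = [c[-1]]                      # stay-put proposal
    for win in (short, long):
        if len(c) > win:
            v = (c[-1] - c[-1 - win]) / win  # mean velocity over the window
            proposals.append(c[-1] + v)      # extrapolate one frame ahead
    return np.stack(proposals)
```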
A.5 LTCOMET: Context-Aware IoU-Guided Network for Small Object Tracking (LTCOMET)
Seyed Mojtaba Marvasti-Zadeh, Javad Khaghani, Li Cheng, Hossein Ghanei-Yakhdan and Shohreh Kasaei
{mojtaba.marvasti,khaghani,lcheng5}@ualberta.ca,hghaneiy@yazd.ac.ir,
kasaei@sharif.edu
To bridge the gap between aerial-view tracking methods and modern trackers, the modified context-aware IoU-guided tracker (LTCOMET) exploits the offline reference proposal generation strategy of the COMET tracker [42], its multitask two-stream network [42], Kindling the Darkness (KinD) [64], and the photo-realistic cascading residual network (PCARN) [1]. The network architecture is the same as [42], but without channel reduction after the multi-scale aggregation and fusion modules (MSAFs). KinD, which uses a network for light adjustment and degradation removal, is employed to preprocess target patches. LTCOMET also employs the generator network of PCARN to recover high-resolution patches of the target and its context from low-resolution ones. Furthermore, the proposed method uses a windowing search strategy when it loses the target (sketched below). LTCOMET has been trained on a broad range of tracking datasets with various photometric and geometric distortions (i.e., data augmentations) to improve the variability of target regions.
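The paper does not spell out the windowing search strategy; one plausible form is an expanding search window for each frame the target remains lost. The growth factor and cap below are assumptions, not values from the paper.

```python
def next_search_size(base_size, frames_lost, grow=1.3, max_scale=4.0):
    """Grow the search window for every frame the target stays lost,
    capped so the window never exceeds max_scale times the base size."""
    scale = min(grow ** frames_lost, max_scale)
    return base_size[0] * scale, base_size[1] * scale
```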
A.6 Discriminative and Robust Online Learning for Long Term Siamese Visual Tracking (DROL_LT)
Jinghao Zhou, Peng Wang, Haoyang Sun and Zikai Zhang
{jensen.zhoujh,zzkdemail}@gmail.com,{peng.wang,sunhaoyang}@gmail.com
DROL_LT is based on DROL [65]. DROL proposes an online module with an attention mechanism that lets offline Siamese networks extract target-specific features under an L2 error, a filter update strategy adaptive to treacherous background noise for discriminative learning, and a template update strategy that handles large target deformations for robust learning. DROL_LT adds two modules to improve DROL in long-term tracking: (1) a detector that helps DROL recover targets that disappear and reappear many times, with ROI Align used to extract features from the combined offline feature maps given the detector's bounding boxes; (2) a mechanism that decides when to update the online classifier and when to invoke the detector, based on a set of empirically chosen thresholds (see the sketch below).
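A hedged sketch of the threshold mechanism in (2): the online classifier's confidence decides whether to update the model, track without updating, or fall back to the detector. The threshold values are illustrative, "given from experience" per the text.

```python
def decide(conf, update_thresh=0.6, redetect_thresh=0.3):
    """Map online-classifier confidence to an action for the current frame."""
    if conf >= update_thresh:
        return "track_and_update"  # reliable: also update the online classifier
    if conf >= redetect_thresh:
        return "track_only"        # uncertain: track but freeze the model
    return "run_detector"          # likely lost: recover with the detector
```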
A.7 Discriminative and Robust Online Learning for Long Term Siamese Visual Tracking (DIMP-SiamRPN)
Zhipeng Luo, Penghao Zhang, Yubo Sun and Bin Dong
{luozp,zhangph,sunyb,Dongbin}@deepblueai.com
DIMP-SiamRPN builds on PrDiMP [9] and SiamRPN++ [31]. First, we use the frame count to divide the challenge-set videos into long-term and short-term videos. Short videos are handled by a PrDiMP model with tuned hyper-parameters. Daytime scenes in long videos are handled by the SiamRPN++ model, in which we enlarge the instance (search) size by 15 pixels every frame, with an upper limit of 1000. In addition, when the target appears lost, we reset the center of the search scope to the center of the image, and we define a make-up strategy to deal with occlusion. Night scenes in long videos are further divided into strong-light and dark scenes according to light intensity, with different inference parameters for each.
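A sketch of that routing logic. The 15-pixel per-frame growth is from the text; interpreting the 1000 upper limit as a size cap, and the frame-count and brightness thresholds, are our assumptions.

```python
def route(num_frames, mean_brightness, long_video_thresh=1500, night_thresh=60):
    """Choose which model and parameter set handles a given video/scene."""
    if num_frames < long_video_thresh:
        return "PrDiMP"                    # short video
    if mean_brightness < night_thresh:
        return "SiamRPN++(night params)"   # dark scene in a long video
    return "SiamRPN++(day params)"         # daytime scene in a long video

def grow_instance_size(size, step=15, cap=1000):
    """Enlarge the SiamRPN++ instance size by 15 px per frame, capped."""
    return min(size + step, cap)
```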
A.8 Discriminative Model Prediction and Accurate Re-detection for Drone Tracking (DiMP_AR)
Xuefeng Zhu, Xiaojun Wu and Tianyang Xu
{xuefeng_zhu95,xiaojun_wu_jnu,tianyang_xu}@163.com
DiMP_AR extends DiMP [5] with a re-detection module. DiMP serves as a local tracker that predicts the target state in the normal case, while RT-MDNet [28] acts as a verifier of DiMP's predictions. If the verification score is above a predefined threshold, normal local tracking continues in the next frame; otherwise, the re-detection module is activated. First, the Faster R-CNN detector [48] detects highly likely target candidates over the whole next frame; then the SiamRPN++ [31] tracker evaluates the search regions around these candidates. Once the target is regained, we switch back to local tracking with DiMP.
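A minimal control-loop sketch of that verify-then-re-detect logic. The objects (dimp, verifier, detector, siamrpn) and their methods are hypothetical stand-ins for the components named above, not real APIs.

```python
def track_frame(frame, dimp, verifier, detector, siamrpn, thresh=0.5):
    box = dimp.track(frame)
    if verifier.score(frame, box) >= thresh:
        return box                              # normal local tracking
    for cand in detector.detect(frame):         # global re-detection stage
        refined, score = siamrpn.match(frame, cand)
        if score >= thresh:
            dimp.reset(refined)                 # target regained: resume locally
            return refined
    return box                                  # nothing better: keep prediction
```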
A.9 Precise Visual Tracking by Re-detection (PrSiamR-CNN)
Zhongzhou Zhang, Lei Zhang, Keyang Wang and Zhenwei He
{zz.zhang,leizhang,wangkeyang,hzw}@cqu.edu.cn
PrSiamR-CNN is modified from the recently proposed state-of-the-art single object tracker Siam R-CNN [54] by using extra training data from VisDrone-SOT2020.
A.10 Discriminative Model Prediction with Deeper ResNet-101 (DiMP-101)
Liting Lin and Yong Xu
l.lt@mail.scut.edu.cn
DiMP-101 is based on the DiMP [5] model, adopting the deeper ResNet-101 as the backbone. With the higher learning capacity of this feature extraction network, tracking performance improves significantly.
A.11 ECO: Efficient Convolution Operators for Tracking (ECO)
Lei Pang
panglei2015@ia.ac.cn
ECO [7] is a discriminative correlation filter based tracker using deep features. It introduces a factorized convolution operator and a compact generative model of the training sample distribution to reduce the number of model parameters, and proposes a conservative model update strategy with improved robustness and reduced complexity. More details can be found in [7].
A.12 Target-Focusing Convolutional Regression Tracking (TFCR)
Di Yuan, Nana Fan and Zhenyu He
dyuanhit@gmail.com
TFCR [63] is a target-focusing convolutional regression (CR) model for visual object tracking. It uses a target-focusing loss function to alleviate the influence of background noise on the response map of the current frame, which effectively improves tracking accuracy. In particular, it balances the disequilibrium between positive and negative samples by reducing the effect of negative samples on the object appearance model. A possible form of such a loss is sketched below.
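A hedged sketch in the spirit of a target-focusing loss: the squared error between the predicted response map and a Gaussian label is re-weighted so that background positions contribute less. The exact formulation in [63] may differ; bg_weight is an assumption.

```python
import numpy as np

def target_focusing_loss(response, label, bg_weight=0.1):
    """response, label: (H, W) maps; label is a Gaussian peaked on the target.
    Weights are ~1 near the target and ~bg_weight on the background, which
    down-weights negative (background) samples in the regression."""
    focus = label + bg_weight * (1.0 - label)
    return np.mean(focus * (response - label) ** 2)
```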
A.13 DDL-Tracker (DDL)
Yong Wang, Lu Ding, Dongjie Zhou and Wentao He
wangyong5@mail.sysu.edu.cn,dinglu@sjtu.edu.cn,13520071811@163.com, weishiinsky@126.com
The DDL tracker employs deep layers to extract features, while a HOG detector is trained online in parallel. If the tracking confidence falls below a threshold, we use the detector's result instead, as in the sketch below.
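A minimal sketch of that fallback rule; the object interfaces and the threshold are illustrative assumptions, not the authors' code.

```python
def ddl_step(frame, tracker, hog_detector, conf_thresh=0.4):
    """Trust the deep-feature tracker while confident; otherwise take the
    online HOG detector's output for this frame."""
    box, conf = tracker.track(frame)
    if conf < conf_thresh:
        box = hog_detector.detect(frame)
    return box
```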
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Fan, H. et al. (2020). VisDrone-SOT2020: The Vision Meets Drone Single Object Tracking Challenge Results. In: Bartoli, A., Fusiello, A. (eds.) Computer Vision – ECCV 2020 Workshops. ECCV 2020. Lecture Notes in Computer Science, vol. 12538. Springer, Cham. https://doi.org/10.1007/978-3-030-66823-5_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66822-8
Online ISBN: 978-3-030-66823-5