Viewpoint and Scale Consistency Reinforcement for UAV Vehicle Re-Identification

International Journal of Computer Vision

Abstract

This paper studies vehicle re-identification (ReID) in aerial videos taken by Unmanned Aerial Vehicles (UAVs). Compared with existing vehicle ReID tasks performed with fixed surveillance cameras, UAV vehicle ReID is still under-explored and could be more challenging, e.g., aerial videos have dynamic and complex backgrounds, different vehicles show similar appearance, and the same vehicle commonly shows distinct viewpoints and scales. To facilitate research on UAV vehicle ReID, this paper contributes a novel dataset called UAV-VeID. UAV-VeID contains 41,917 images of 4,601 vehicles captured by UAVs, where each vehicle has multiple images taken from different viewpoints. UAV-VeID also includes a large-scale distractor set to encourage research on efficient ReID schemes. Compared with existing vehicle ReID datasets, UAV-VeID exhibits substantial variances in the viewpoints and scales of vehicles, and thus requires more robust features. To alleviate the negative effects of those variances, this paper also proposes a viewpoint adversarial training strategy and a multi-scale consensus loss to promote the robustness and discriminative power of the learned deep features. Extensive experiments on UAV-VeID show that our approach outperforms recent vehicle ReID algorithms. Moreover, our method achieves competitive performance compared with recent works on existing vehicle ReID datasets, including VehicleID, VeRi-776 and VERI-Wild.

Acknowledgements

This work is supported in part by the National Natural Science Foundation of China under Grant Nos. 61620106009, U20B2052, 61936011, 61931008 and 61836002, in part by the Italy-China collaboration project TALENT 2018YFE0118400, in part by the Beijing Natural Science Foundation under Grant No. JQ18012, and in part by the Key Research Program of Frontier Sciences, CAS: QYZDJ-SSW-SYS013.

Author information

Correspondence to Shiliang Zhang.

Additional information

Communicated by Mei Chen.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Teng, S., Zhang, S., Huang, Q. et al. Viewpoint and Scale Consistency Reinforcement for UAV Vehicle Re-Identification. Int J Comput Vis 129, 719–735 (2021). https://doi.org/10.1007/s11263-020-01402-2

  • DOI: https://doi.org/10.1007/s11263-020-01402-2
