DOI: 10.1145/3394171.3413573
research-article

Fine-grained Feature Alignment with Part Perspective Transformation for Vehicle ReID

Published: 12 October 2020

Abstract

Given a query image, vehicle re-identification (ReID) searches for the same vehicle across multiple cameras, a task that has attracted much attention in recent years. However, vehicle ReID suffers severely from perspective variation. When different vehicles of similar color and type are captured from different perspectives, their visual patterns are misaligned and warped, which makes it hard for a model to locate the discriminative regions. In this paper, we propose a part perspective transformation (PPT) module that maps each part of a vehicle into a unified perspective. PPT disentangles vehicle features from different perspectives and then aligns them at a fine-grained level. Further, we propose a dynamically batch hard triplet loss that selects the common visible regions of the compared vehicles. Our approach helps the model generate perspective-invariant features and find the distinguishing regions for vehicle ReID. Extensive experiments on three standard vehicle ReID datasets show the effectiveness of our method.
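
To illustrate the kind of objective the abstract refers to, the following is a minimal sketch of the standard batch hard triplet loss, on which the proposed loss presumably builds. It is not the authors' dynamic variant: the abstract does not specify how common visible regions are selected, so that step is omitted. The function name, the margin value of 0.3, and the toy embeddings are assumptions made for illustration only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """Standard batch hard triplet loss (illustrative, not the paper's variant).

    For each anchor, take the farthest sample with the same identity
    (hardest positive) and the closest sample with a different identity
    (hardest negative), then apply a hinge with the given margin.
    Anchors without a positive or without a negative are skipped.
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        positives = [
            euclidean(anchor, other)
            for j, (other, other_label) in enumerate(zip(embeddings, labels))
            if other_label == label and j != i
        ]
        negatives = [
            euclidean(anchor, other)
            for other, other_label in zip(embeddings, labels)
            if other_label != label
        ]
        if not positives or not negatives:
            continue
        losses.append(max(0.0, margin + max(positives) - min(negatives)))
    # Average over the anchors that formed a valid triplet.
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    # Hypothetical 2-D embeddings for two vehicle identities (0 and 1).
    toy_embeddings = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
    toy_labels = [0, 0, 1, 1]
    print(batch_hard_triplet_loss(toy_embeddings, toy_labels))
```

In the toy batch the two identities are interleaved along one axis, so every anchor has a negative closer than its hardest positive and the loss is positive. A variant restricted to common visible regions would presumably compute the distances over only the part features visible in both images, but that detail is not given in the abstract.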

Supplementary Material

MP4 File (3394171.3413573.mp4)

    Information & Contributors

    Information

    Published In

    MM '20: Proceedings of the 28th ACM International Conference on Multimedia
    October 2020
    4889 pages
    ISBN:9781450379885
    DOI:10.1145/3394171
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. computer vision
    2. feature alignment
    3. perspective transformation
    4. vehicle ReID

    Qualifiers

    • Research-article

    Conference

    MM '20

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 50
    • Downloads (last 6 weeks): 1
    Reflects downloads up to 28 Feb 2025

    Citations

    Cited By

    • (2025) Region-guided spatial feature aggregation network for vehicle re-identification. Engineering Applications of Artificial Intelligence 139 (109568). DOI: 10.1016/j.engappai.2024.109568. Online publication date: Jan-2025
    • (2024) UnbiasNet: Vehicle Re-Identification Oriented Unbiased Feature Enhancement by Using Causal Effect. IEEE Transactions on Intelligent Transportation Systems 25:2 (1925-1937). DOI: 10.1109/TITS.2023.3317294. Online publication date: Feb-2024
    • (2024) Semantic-Oriented Feature Coupling Transformer for Vehicle Re-Identification in Intelligent Transportation System. IEEE Transactions on Intelligent Transportation Systems 25:3 (2803-2813). DOI: 10.1109/TITS.2023.3257873. Online publication date: Mar-2024
    • (2024) LAPNet: Local Aware Permutation NetWork for Vehicle Re-Identification. 2024 IEEE 4th International Conference on Electronic Technology, Communication and Information (ICETCI) (581-586). DOI: 10.1109/ICETCI61221.2024.10594395. Online publication date: 24-May-2024
    • (2024) Linking unknown characters via oracle bone inscriptions retrieval. Multimedia Systems 30:3. DOI: 10.1007/s00530-024-01327-7. Online publication date: 15-Apr-2024
    • (2024) Vehicle Re-identification with a Pose-Aware Discriminative Part Learning Model. Pattern Recognition and Computer Vision (251-265). DOI: 10.1007/978-981-97-8493-6_18. Online publication date: 1-Nov-2024
    • (2023) Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification. Sensors 23:9 (4206). DOI: 10.3390/s23094206. Online publication date: 23-Apr-2023
    • (2023) A Pedestrian Re-identification Method Based on Joint learning and Feature alignment Neural Network. Proceedings of the 2023 3rd International Conference on Bioinformatics and Intelligent Computing (227-231). DOI: 10.1145/3592686.3592727. Online publication date: 10-Feb-2023
    • (2023) Viewpoint Alignment and Discriminative Parts Enhancement in 3D Space for Vehicle ReID. IEEE Transactions on Multimedia 25 (2954-2965). DOI: 10.1109/TMM.2022.3154102. Online publication date: 2023
    • (2023) TVG-ReID: Transformer-Based Vehicle-Graph Re-Identification. IEEE Transactions on Intelligent Vehicles 8:11 (4644-4652). DOI: 10.1109/TIV.2023.3292513. Online publication date: Nov-2023
