DOI: 10.1145/3503161.3548224

Cross-Compatible Embedding and Semantic Consistent Feature Construction for Sketch Re-identification

Published: 10 October 2022

ABSTRACT

Sketch re-identification (Re-ID) uses sketches of pedestrians to retrieve their corresponding photos from surveillance videos. It makes it possible to track a pedestrian from a sketch drawn according to eyewitness accounts, without a photo of the person being available as a query. Although the Sketch Re-ID task has been established, the modality gap between sketches and photos still greatly hinders identity matching. Based on the idea of transplantation without rejection, we propose a Cross-Compatible Embedding (CCE) approach to narrow this gap, together with a Semantic Consistent Feature Construction (SCFC) scheme that enhances feature discrimination. Guided by identity consistency, CCE performs cross-modal interchange at the local-token level within a Transformer framework, enabling the model to extract modality-compatible features. SCFC improves the representational power of the features by handling the information inconsistency between corresponding locations of a sketch and its paired pedestrian photo: it divides the local tokens of images from the two modalities into groups, assigns specific semantic information to each group, and constructs a semantically consistent global feature representation. Experiments on the public Sketch Re-ID dataset confirm the effectiveness of the proposed method and its superiority over existing methods. To assess generalization, we further evaluate on the sketch-based image retrieval datasets QMUL-Shoe-v2 and QMUL-Chair-v2, where the proposed method outperforms the compared state-of-the-art approaches. The source code of our method is available at: https://github.com/lhf12278/CCSC.
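
To make the cross-modal interchange concrete, the following is a minimal PyTorch-style sketch of the CCE idea, not the authors' released implementation: the function name, the swap_ratio hyperparameter, and the choice of randomly shared swap positions are illustrative assumptions.

    import torch

    def cross_compatible_interchange(sketch_tokens, photo_tokens, swap_ratio=0.5):
        # sketch_tokens, photo_tokens: (B, N, D) local token sequences of a
        # same-identity sketch/photo pair (class token excluded).
        # swap_ratio is an assumed hyperparameter, not the paper's value.
        N = sketch_tokens.size(1)
        num_swap = max(1, int(N * swap_ratio))
        # Exchange the same randomly chosen token positions in every pair,
        # so each mixed sequence contains tokens from both modalities.
        idx = torch.randperm(N)[:num_swap]
        mixed_sketch = sketch_tokens.clone()
        mixed_photo = photo_tokens.clone()
        mixed_sketch[:, idx] = photo_tokens[:, idx]
        mixed_photo[:, idx] = sketch_tokens[:, idx]
        # Both mixed sequences keep the original identity label; training
        # them with an identity-consistency loss pushes the backbone toward
        # modality-compatible token features.
        return mixed_sketch, mixed_photo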
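
The SCFC grouping can be sketched in the same hedged spirit. The following assumes four token groups, mean pooling within each group, and one linear head per group; the paper's exact grouping and semantic-assignment mechanism may differ.

    import torch
    import torch.nn as nn

    class SemanticGroupFeature(nn.Module):
        def __init__(self, dim, num_groups=4):
            super().__init__()
            self.num_groups = num_groups
            # One projection per group so each group can specialize on its
            # assigned semantics (e.g., head, torso, legs, feet).
            self.heads = nn.ModuleList(
                nn.Linear(dim, dim) for _ in range(num_groups)
            )

        def forward(self, tokens):
            # tokens: (B, N, D) local tokens from either modality,
            # split into equal-sized contiguous groups.
            groups = tokens.chunk(self.num_groups, dim=1)
            parts = [h(g.mean(dim=1)) for h, g in zip(self.heads, groups)]
            # The g-th slice of the global descriptor always carries the
            # g-th group's semantics, for sketches and photos alike.
            return torch.cat(parts, dim=-1)  # (B, num_groups * D)

For ViT-Base tokens (D = 768), SemanticGroupFeature(768)(tokens) would yield a 3072-dimensional global feature under these assumptions.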

Published in:
MM '22: Proceedings of the 30th ACM International Conference on Multimedia
October 2022, 7537 pages
ISBN: 978-1-4503-9203-7
DOI: 10.1145/3503161

Copyright © 2022 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
