Cross-Compatible Embedding and Semantic Consistent Feature Construction for Sketch Re-identification

Authors: Yafei Zhang, Yongzeng Wang, Huafeng Li, Shuang Li

Kunming University of Science and Technology, Kunming, China

MM '22: Proceedings of the 30th ACM International Conference on Multimedia, October 2022, Pages 3347–3355. https://doi.org/10.1145/3503161.3548224

Published: 10 October 2022

ABSTRACT

Sketch re-identification (Re-ID) refers to using sketches of pedestrians to retrieve their corresponding photos from surveillance videos. It makes it possible to track pedestrians from sketches drawn according to eyewitness accounts, without requiring a query photo. Although the Sketch Re-ID concept has been proposed, the gap between the sketch and photo modalities still greatly hinders pedestrian identity matching. Based on the idea of transplantation without rejection, we propose a Cross-Compatible Embedding (CCE) approach to narrow the gap. A Semantic Consistent Feature Construction (SCFC) scheme is simultaneously presented to enhance feature discrimination. Under the guidance of identity consistency, the CCE performs cross-modal interchange at the local token level in the Transformer framework, enabling the model to extract modality-compatible features. The SCFC improves the representation ability of features by handling the inconsistency of information at the same location in a sketch and the corresponding pedestrian photo. The SCFC scheme divides the local tokens of pedestrian images from different modalities into groups and assigns specific semantic information to each group to construct a semantically consistent global feature representation. Experiments on the public Sketch Re-ID dataset confirm the effectiveness of the proposed method and its superiority over existing methods. Experiments on the sketch-based image retrieval datasets QMUL-Shoe-v2 and QMUL-Chair-v2 are conducted to assess the method's generalization. The results show that the proposed method outperforms the compared state-of-the-art methods. The source code of our method is available at: https://github.com/lhf12278/CCSC.
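
The two ideas named in the abstract, token-level interchange between modalities and group-wise construction of a global feature, can be illustrated with a small sketch. This is not the authors' implementation (which is in the linked repository). It is a toy version in plain Python, assuming that a sketch and a photo of the same identity are each represented by an equal-length list of local token vectors. The function names `swap_local_tokens` and `group_and_pool`, the `swap_ratio` parameter, the random choice of swap positions, and the use of contiguous groups with mean pooling are all assumptions made for illustration.

```python
import random
from typing import List, Optional, Tuple

Token = List[float]  # one local token, represented as a plain feature vector


def swap_local_tokens(
    sketch_tokens: List[Token],
    photo_tokens: List[Token],
    swap_ratio: float = 0.5,
    rng: Optional[random.Random] = None,
) -> Tuple[List[Token], List[Token]]:
    """Exchange same-position local tokens between two modalities.

    Hypothetical illustration: both inputs are assumed to come from the
    same identity, and swap positions are drawn uniformly at random.
    """
    if len(sketch_tokens) != len(photo_tokens):
        raise ValueError("both modalities need the same number of local tokens")
    if rng is None:
        rng = random.Random()
    n = len(sketch_tokens)
    k = int(n * swap_ratio)
    positions = set(rng.sample(range(n), k))
    # At a chosen position each sequence receives the other modality's token.
    mixed_sketch = [photo_tokens[i] if i in positions else sketch_tokens[i] for i in range(n)]
    mixed_photo = [sketch_tokens[i] if i in positions else photo_tokens[i] for i in range(n)]
    return mixed_sketch, mixed_photo


def group_and_pool(tokens: List[Token], num_groups: int) -> List[float]:
    """Split tokens into contiguous groups, mean-pool each group, and
    concatenate the pooled vectors into one global feature.

    Contiguous grouping and mean pooling are assumptions, not the
    grouping rule used in the paper.
    """
    if num_groups <= 0 or len(tokens) % num_groups != 0:
        raise ValueError("token count must be divisible by num_groups")
    size = len(tokens) // num_groups
    feature: List[float] = []
    for g in range(num_groups):
        group = tokens[g * size:(g + 1) * size]
        dim = len(group[0])
        # Mean over the tokens of this group, one value per feature dimension.
        feature.extend(sum(tok[d] for tok in group) / len(group) for d in range(dim))
    return feature


if __name__ == "__main__":
    sketch = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
    photo = [[-1.0, -1.0], [-2.0, -2.0], [-3.0, -3.0], [-4.0, -4.0]]
    mixed_sketch, mixed_photo = swap_local_tokens(sketch, photo, rng=random.Random(0))
    print(mixed_sketch)
    print(group_and_pool(mixed_sketch, num_groups=2))
```

In a real Transformer pipeline the interchange would presumably act on intermediate token embeddings inside the network and be trained with identity supervision. The toy version only shows the data movement: after the swap, each sequence contains tokens from both modalities at matching positions.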

Supplemental Material

MM22-fp1955.mp4 (32.9 MB)


Index Terms

Cross-Compatible Embedding and Semantic Consistent Feature Construction for Sketch Re-identification
1. Computer systems organization
  1. Dependable and fault-tolerant systems and networks
    1. Redundancy
  2. Embedded and cyber-physical systems
    1. Embedded systems

Published in
MM '22: Proceedings of the 30th ACM International Conference on Multimedia
October 2022
7537 pages
ISBN: 9781450392037
DOI: 10.1145/3503161
General Chairs: João Magalhães (NOVA University of Lisbon, Portugal), Alberto del Bimbo (University of Florence, Italy), Shin'ichi Satoh (National Institute of Informatics, Japan), Nicu Sebe (University of Trento, Italy)
Program Chairs: Xavier Alameda-Pineda (Inria, Grenoble, France), Qin Jin (Renmin University of China, China), Vincent Oria (New Jersey Institute of Technology, USA), Laura Toni (University College London, UK)
Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
cross transplantation
cross-compatible embedding
semantic consistent features
sketch re-identification
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%