Cascaded attention-guided multi-granularity feature learning for person re-identification

Dong, Husheng; Yang, Yuanfeng; Sun, Xun; Zhang, Liang; Fang, Ligang

doi:10.1007/s00138-022-01353-3

Original Paper
Published: 18 November 2022

Volume 34, article number 4 (2023)

Machine Vision and Applications

Husheng Dong (ORCID: orcid.org/0000-0001-9690-3949)^1,2, Yuanfeng Yang^1, Xun Sun^1, Liang Zhang^1 & Ligang Fang^1

Abstract

Attention mechanisms have been extensively employed in person re-identification, as they help to extract more discriminative feature representations. However, most existing works either incorporate a single-scale attention module or embed several attention modules that work independently. Though promising results are achieved, such designs may fail to mine different subtle visual clues. To mitigate this issue, a novel framework called cascaded attention network (CANet) is proposed, which mines diverse clues and integrates them into the final multi-granularity features in a cascaded manner. Specifically, we design a novel hybrid pooling attention module (HPAM) and plug it into the backbone network at different stages. To make the modules work collaboratively, an inter-attention regularization is applied so that they localize complementary salient features. CANet then extracts global and local features from a part-based pyramidal architecture. For better feature robustness, supervision is applied not only to the pyramidal branches but also to the intermediate attention modules. Furthermore, within each supervision branch, hybrid pooling with two different strides is executed to enhance the feature representation capability. Extensive experiments with ablation analysis demonstrate the effectiveness of the proposed method, and state-of-the-art results are achieved on three public benchmark datasets: Market-1501, CUHK03, and DukeMTMC-ReID.
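The idea of pooling global and part-level features at several granularities can be illustrated with a small sketch. This is not the CANet implementation: the abstract does not say how hybrid pooling combines its operators or how the pyramid is partitioned, so the sum of average and max pooling, the horizontal-stripe split, and all names below (hybrid_pool, stripe_features, multi_granularity) are assumptions made purely for illustration.

```python
from typing import Dict, List, Sequence

# A feature map stored as channels x height x width nested lists.
FeatureMap = List[List[List[float]]]


def hybrid_pool(region: List[List[float]]) -> float:
    """Pool one 2-D region as average + max (an assumed combination)."""
    values = [v for row in region for v in row]
    return sum(values) / len(values) + max(values)


def stripe_features(fmap: FeatureMap, parts: int) -> List[List[float]]:
    """Split every channel into `parts` horizontal stripes and pool each one.

    Returns one feature vector (length = number of channels) per stripe.
    """
    height = len(fmap[0])
    if height % parts != 0:
        raise ValueError("height must be divisible by the number of parts")
    step = height // parts
    features = []
    for p in range(parts):
        rows = slice(p * step, (p + 1) * step)
        features.append([hybrid_pool(channel[rows]) for channel in fmap])
    return features


def multi_granularity(
    fmap: FeatureMap, granularities: Sequence[int] = (1, 2, 3)
) -> Dict[int, List[List[float]]]:
    """Collect global (1 part) and local (2, 3 parts) pooled features."""
    return {g: stripe_features(fmap, g) for g in granularities}


# Toy input: 2 channels, height 6, width 2.
# Channel 0 holds the row index in every cell; channel 1 is constant.
toy_map = [
    [[float(r), float(r)] for r in range(6)],
    [[1.0, 1.0] for _ in range(6)],
]
pyramid = multi_granularity(toy_map)
print(pyramid[1])  # one global feature vector
print(pyramid[3])  # three part-level feature vectors
```

With one part the sketch yields a single global vector per image, while two and three parts yield progressively finer local vectors; in a trained network each of these vectors would typically feed its own supervised branch.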

Acknowledgements

This work was supported by the Research Funds of Suzhou Vocational University (SVU2021YY03), Science and Technology Program of Suzhou (SS202151, SNG2021037), the Innovation Project of Engineering Research Center of Integration and Application of Digital Learning Technology, Ministry of Education (1221046) and in part by the Program to Cultivate Middle-aged and Young Cadre Teacher of Suzhou Vocational University.

Author information

Yuanfeng Yang, Xun Sun, Liang Zhang and Ligang Fang contributed equally to this study.

Authors and Affiliations

School of Computer Engineering, Suzhou Vocational University, Suzhou, 215104, China
Husheng Dong, Yuanfeng Yang, Xun Sun, Liang Zhang & Ligang Fang
Jiangsu Province Support Software Engineering R&D Center for Modern Information Technology Application in Enterprise, Suzhou, 215104, China
Husheng Dong

Corresponding author

Correspondence to Husheng Dong.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dong, H., Yang, Y., Sun, X. et al. Cascaded attention-guided multi-granularity feature learning for person re-identification. Machine Vision and Applications 34, 4 (2023). https://doi.org/10.1007/s00138-022-01353-3

Received: 01 May 2022
Revised: 07 August 2022
Accepted: 21 October 2022
Published: 18 November 2022
DOI: https://doi.org/10.1007/s00138-022-01353-3
