Abstract
Group re-identification (GReID) aims to correctly associate images containing the same group members captured by non-overlapping camera networks, and it has important applications in video surveillance. Unlike person re-identification, the unique challenge of GReID lies in variations of group structure, including the number and layout of members. Current methods use certainty modeling, in which the specific group structure presented in each image is considered. However, certainty modeling can only describe finite group structures and generalizes poorly to unseen group structures, i.e., group variations that do not exist in the training set. In this paper, we propose a methodology called uncertainty modeling, which mines near-infinite group structures from finite samples by simulating variations in both number and layout. Specifically, member uncertainty treats the number of intra-group members as a truncated Gaussian distribution instead of a fixed value and then simulates member variations by dynamic sampling. Layout uncertainty applies random affine transformations to the positions of members to extend the fixed layouts in the training set. To implement the proposed methodology, we propose an Uncertainty-Modeling Second-Order Transformer (UMSOT) that extracts a first-order token for each member and further uses these tokens to learn a second-order token as a group feature. The UMSOT exploits the structural advantages of the transformer to explicitly extract layout features and efficiently integrate appearance and layout features, which is difficult to achieve with current CNN- and GNN-based methods. Comprehensive experiments on four datasets (CSG, SYSUGroup, RoadGroup, and iLIDS-MCTS) demonstrate the superiority of the proposed method, which outperforms the state-of-the-art method by 30.4% in Rank-1 on the CSG dataset. Project repository: https://github.com/LinlyAC/UMSOT.
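The two kinds of uncertainty described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the default parameters (`sigma`, `low`, `max_rot`, `scale_range`, `max_shift`), the choice to only drop members rather than add them, and the restriction of the affine map to rotation, uniform scaling, and translation are all assumptions made for illustration.

```python
import math
import random


def sample_member_count(n_members, sigma, rng, low=2):
    """Draw a member count from a Gaussian centred on n_members,
    truncated to [low, n_members] by rejection sampling (assumed bounds)."""
    low = min(low, n_members)
    while True:
        k = round(rng.gauss(n_members, sigma))
        if low <= k <= n_members:
            return k


def random_affine(positions, rng, max_rot=math.pi / 12,
                  scale_range=(0.9, 1.1), max_shift=0.05):
    """Apply one random rotation, uniform scale, and shift to all (x, y)
    positions. Rotation and scaling are taken about the group centroid."""
    theta = rng.uniform(-max_rot, max_rot)
    s = rng.uniform(*scale_range)
    tx = rng.uniform(-max_shift, max_shift)
    ty = rng.uniform(-max_shift, max_shift)
    cx = sum(x for x, _ in positions) / len(positions)
    cy = sum(y for _, y in positions) / len(positions)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = []
    for x, y in positions:
        dx, dy = x - cx, y - cy
        out.append((cx + s * (cos_t * dx - sin_t * dy) + tx,
                    cy + s * (sin_t * dx + cos_t * dy) + ty))
    return out


def perturb_group(member_ids, positions, rng, sigma=1.0):
    """Simulate one unseen group structure from a training sample:
    sample how many members to keep, then perturb their layout."""
    n = len(member_ids)
    k = sample_member_count(n, sigma, rng)
    keep = sorted(rng.sample(range(n), k))
    kept_ids = [member_ids[i] for i in keep]
    kept_pos = [positions[i] for i in keep]
    return kept_ids, random_affine(kept_pos, rng)
```

Calling `perturb_group` repeatedly on the same training group yields a different member subset and layout each time, which is the sense in which finite samples can stand in for many group structures.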
Notes
We follow the MATLAB convention for the \(\min\) and \(\max\) functions: applied to a matrix, they return the extreme value of each column; applied to a vector, they return a single extreme value.
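As a small illustration of this convention, the sketch below mimics the column-wise behaviour in plain Python. The helper names `matlab_style_min` and `matlab_style_max` are hypothetical, and representing a matrix as a list of equal-length rows is an assumption of this example.

```python
def _extreme(x, fn):
    """Apply fn (min or max) in the MATLAB manner.

    x is either a flat list of numbers (a vector) or a list of
    equal-length rows (a matrix). A matrix with a single row is
    treated as a vector, as MATLAB does for 1-by-N inputs.
    """
    if x and isinstance(x[0], (list, tuple)):
        if len(x) == 1:
            return fn(x[0])
        # zip(*x) iterates over columns of the row-major matrix.
        return [fn(col) for col in zip(*x)]
    return fn(x)


def matlab_style_min(x):
    return _extreme(x, min)


def matlab_style_max(x):
    return _extreme(x, max)
```

For the matrix `[[3, 1], [2, 5]]`, `matlab_style_min` returns one value per column, while for the vector `[4, 2, 9]` it returns a single number.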
Equation (18) is merely a demonstration of the second-order idea; it undergoes further uncertainty modeling before it becomes the serialized input of the group feature transformer.
Acknowledgements
This project was supported in part by the NSFC (62076258, U22A2095), in part by the Key-Area Research and Development Program of Guangzhou (202206030003), in part by the Guangdong Project (No. 2020B1515120085), and in part by the International Program Fund for Young Talent Scientific Research People, Sun Yat-Sen University. Part of this work was done while Quan Zhang was a visiting scholar at Johns Hopkins University.
Ethics declarations
Ethical Approval
Group re-identification (GReID) aims to make a positive contribution to society. For example, child trafficking and kidnapping usually involve more than two people. GReID can be applied to detect and prevent these events. In addition, the datasets used in this paper also strictly comply with ethical requirements and privacy policies. All the images of people shown in this paper have been processed with privacy protection technology.
Additional information
Communicated by Shin’ichi Satoh.
About this article
Cite this article
Zhang, Q., Lai, J., Feng, Z. et al. Uncertainty Modeling for Group Re-Identification. Int J Comput Vis 132, 3046–3066 (2024). https://doi.org/10.1007/s11263-024-02013-x
DOI: https://doi.org/10.1007/s11263-024-02013-x