Siamese Cosine Network Embedding for Person Re-identification

Wang, Jiabao; Li, Yang; Miao, Zhuang

doi:10.1007/978-981-10-7305-2_31


Part of the book series: Communications in Computer and Information Science (CCIS, volume 773)

Included in the following conference series:

CCF Chinese Conference on Computer Vision


Abstract

In person re-identification, feature embedding is the key to handling newly arriving identities. Most state-of-the-art models use features learned by convolutional neural networks (CNNs) for similarity comparison. However, the learned features are not good enough for new identities because CNNs are designed for classifying objects of known classes, not for comparing the similarity of two arbitrary identities. To improve feature embedding, we propose a pairwise cosine loss based on the cosine similarity measure. We then design a Siamese cosine network embedding (SCNE) to learn deep features for person re-identification. It is based on the Siamese architecture, takes intra-class input pairs, and is jointly supervised by the softmax loss and the pairwise cosine loss. Experimental results show that our SCNE achieves state-of-the-art performance on the public Market1501 and CUHK03 person re-ID benchmarks.

1 Introduction

Given a person of interest as a query, person re-identification (re-ID) aims to determine whether the person has been observed by another camera [9, 16, 17, 20, 25, 26]. It is a completely different problem from classification, which can be considered a closed-set problem [6, 8, 11]. Person re-ID, by contrast, must search for a new person who has to be treated as a new class, because that person never appeared in the training dataset. Person re-ID therefore requires good features to represent new identities, which makes it a challenging open-set problem.

Recently, features learned by convolutional neural networks (CNNs) have been widely used for person re-ID [25]. However, these features are not good enough for person re-ID because CNNs are designed for classifying objects of known classes, not for comparing the similarity of two arbitrary identities. Fig. 1(a) shows 2D features learned on the MNIST dataset by LeNets++ [19] with only the softmax loss: the features fill the whole feature space, and the samples of each class have a uniform, flat distribution. This is inappropriate for person re-ID, because two intra-class features may have a relatively low similarity even if they are classified correctly. More importantly, there is no extra space for new identities. If we want CNNs to learn more discriminative features for new identities, we need to compact the intra-class distribution of the learned features for the existing classes.

To compact the intra-class distribution, we propose a new pairwise cosine loss that measures the similarity between two intra-class features. Because the features learned by existing CNNs have an angular distribution, as illustrated in Fig. 1(a), it is desirable to learn features with a cosine loss and to use cosine similarity for feature comparison in the evaluation stage. As shown in Fig. 1(b), with our proposed pairwise cosine loss the angular distribution of the features from the known classes is indeed compact, which leaves a lot of room for describing new incoming identities. Another contribution of this paper is a novel network based on the Siamese network, which takes only positive pairs of images as input and pulls their features as close together as possible. This differs from existing methods, which take both positive and negative pairs as input [9, 20].

In this paper, we design a Siamese cosine network embedding (SCNE) to learn discriminative features for person re-ID. Compared to previous networks, we make the learned features not only separable but also compact. Our contributions are:

A pairwise cosine loss is proposed to compact the distribution of the intra-class features. It is appropriate for cosine similarity comparison in person re-ID application.
We design the SCNE to learn discriminative features under the joint supervision of the softmax loss and the pairwise cosine loss. The input to our proposed network consists only of positive pairs, without negative pairs, because inter-class separation is already achieved by the softmax loss in CNNs.
Experimental results show that our approach achieves the state-of-the-art performance on the public Market1501 and CUHK03 person re-ID benchmarks.

2 Related Works

Our SCNE is inspired by the work of [26], where an identification loss and a verification loss are used for training. The former is the same as the softmax loss, and the latter is a variant of the center loss, where the added Square layer computes a Euclidean distance for each dimension of the features. At evaluation time, however, similarity is computed by the cosine distance, which is not well matched to a network supervised by such a Euclidean loss. The pairwise cosine loss proposed in this paper, by contrast, is consistent with the similarity comparison, so it could achieve better performance than the work of [26].

Several works have addressed the person re-ID problem with Siamese networks, such as [5, 16, 17, 22]. The work of [17] adopted Long Short-Term Memory (LSTM) to memorize the spatial dependencies of the divided regions in a person image. Its Siamese architecture compares the input image pair with a contrastive loss function, which repels dissimilar inputs and attracts similar inputs. The work of [16] also used a Siamese network to compare features across pairs of images, adopting a gating function to selectively emphasize the fine common local patterns in a person image. The work of [5] is also very similar to ours, but it used GoogLeNet [14] as the base network. Moreover, it proposed a loss-specific dropout unit that provides pairwise-consistent dropout for the verification subnet. This specially designed network achieved great performance. All of the above works used both negative and positive input pairs to train the network, whereas ours uses only positive input pairs.

In addition, the work of [22] also used a cosine distance in a Siamese network, but adopted it as a connection function for the cost function. It treated the output of the network as a binary classification problem used only for similarity measurement. This is by nature a verification network, and it has been shown that a verification network without an identification network is not good enough for person re-ID [26]. In this paper, we propose to combine two identification networks through the pairwise cosine loss, which can separate inter-class features and effectively compact intra-class features.

3 Siamese Cosine Network Embedding (SCNE)

3.1 The Proposed Pairwise Cosine Loss

Suppose the input image of the CNN is ${{\mathbf {x}}_{i}}$ and its label is ${{y}_{i}}$. The input ${{\mathbf {f}}_{i}}$ of the last fully connected (FC) layer is commonly used as the feature representing ${{\mathbf {x}}_{i}}$ for similarity comparison. Suppose the parameters of the last FC layer are ${{\mathbf {W}}^{j}}, j=1,\ldots ,C$, where $C$ is the number of outputs; the $j$th output is then $o_{i}^{j}={{({{\mathbf {W}}^{j}})}^{T}}{{\mathbf {f}}_{i}}$. For the $j$th output to be the maximum, we need to maximize the value $o_{i}^{j}$. For the widely used softmax log-loss, we have

$$\begin{aligned} {{L}_{s}}=-\sum \limits _{i=1}^{N}{\log \frac{\exp (o_{i}^{{{y}_{i}}})}{\sum \nolimits _{k=1}^{C}{\exp (o_{i}^{k})}}} \end{aligned}$$

(1)

where $o_{i}^{{{y}_{i}}}$ is the output value at the label ${{y}_{i}}$ position, and N is the number of the samples.
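As a rough illustration of Eq. (1), the following Python sketch evaluates the softmax log-loss for a small batch of FC-layer outputs. The function name `softmax_log_loss` and the toy numbers are assumptions made for this example; the paper's own implementation uses MatConvNet and is not reproduced here.

```python
import math

def softmax_log_loss(outputs, labels):
    """Sketch of Eq. (1): sum over samples of -log softmax(o_i)[y_i].

    outputs: list of per-sample score lists o_i (length C each)
    labels:  list of integer labels y_i in [0, C)
    """
    total = 0.0
    for o, y in zip(outputs, labels):
        m = max(o)  # subtract the max before exponentiating for stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in o))
        total += log_sum_exp - o[y]
    return total

# Hypothetical FC outputs for N = 2 samples and C = 3 classes.
outputs = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
labels = [0, 2]
loss = softmax_log_loss(outputs, labels)
print(loss)
```

The loss depends only on how the score at the label position compares with the other scores, which is why it separates classes without constraining how tightly the features of one class cluster.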

The softmax log-loss only separates the features into different classes, without effectively compacting the intra-class features. The problem therefore boils down to developing an efficient loss function that compacts the feature distribution of each class. Intuitively, because the features learned by ImageNet pre-trained CNNs follow an angular distribution, we minimize the cosine loss between the two intra-class features produced by the input pair of a Siamese network, pulling the intra-class features close to each other. Cosine similarity can then be adopted for comparison to achieve better person re-ID performance.

To this end, we propose the pairwise cosine loss function, as formulated in (2):

$$\begin{aligned} {{L}_{c}}=\sum \limits _{i=1}^{N}{\left( 1-\cos ({\mathbf {f}}_{i}^{a},{\mathbf {f}}_{i}^{b})\right) } \end{aligned}$$

(2)

where $\cos (\mathbf {f}_{i}^{a},\mathbf {f}_{i}^{b}) =\frac{{{(\mathbf {f}_{i}^{a})}^{T}}{\mathbf {f}}_{i}^{b}}{||{\mathbf {f}}_{i}^{a}|| ||{\mathbf {f}}_{i}^{b}||} ={{\left( \frac{{\mathbf {f}}_{i}^{a}}{||{\mathbf {f}}_{i}^{a}||} \right) }^{T}} \left( \frac{{\mathbf {f}}_{i}^{b}}{||{\mathbf {f}}_{i}^{b}||} \right) $, ${\mathbf {f}}_{i}^{a}$ and ${\mathbf {f}}_{i}^{b}$ are the deep learned features of the input pair, and $\frac{{\mathbf {f}}_{i}^{a}}{||{\mathbf {f}}_{i}^{a}||}$ and $\frac{{\mathbf {f}}_{i}^{b}}{||{\mathbf {f}}_{i}^{b}||}$ are the $l_2$-normalized features. The cosine term of this loss is the cosine of the angle between ${\mathbf {f}}_{i}^{a}$ and ${\mathbf {f}}_{i}^{b}$. It characterizes the intra-class cosine variation only if the two images of a pair have the same label, so the input to our Siamese network must consist solely of positive pairs.
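To make Eq. (2) concrete, here is a minimal Python sketch that computes the pairwise cosine loss over a list of feature pairs. The helper names `cosine` and `pairwise_cosine_loss` are hypothetical, and the sketch assumes that no feature vector has zero norm.

```python
import math

def cosine(fa, fb):
    """Cosine of the angle between two feature vectors (assumed nonzero)."""
    dot = sum(a * b for a, b in zip(fa, fb))
    norm_a = math.sqrt(sum(a * a for a in fa))
    norm_b = math.sqrt(sum(b * b for b in fb))
    return dot / (norm_a * norm_b)

def pairwise_cosine_loss(pairs):
    """Sketch of Eq. (2): sum over pairs of 1 - cos(f_a, f_b)."""
    return sum(1.0 - cosine(fa, fb) for fa, fb in pairs)

# Toy 2D features: the first pair is parallel, the second is orthogonal.
pairs = [([1.0, 0.0], [2.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
loss = pairwise_cosine_loss(pairs)
print(loss)
```

A parallel pair contributes nothing to the loss regardless of the vector lengths, while an orthogonal pair contributes 1, so only the angle between the two features matters.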

To learn and update the parameters of our network by back-propagation, we need the gradients of $L_c$ with respect to ${\mathbf {f}}_{i}^{a}$ and ${\mathbf {f}}_{i}^{b}$. They are given as follows:

$$\begin{aligned} \frac{\partial {{L}_{c}}}{\partial {\mathbf {f}}_{i}^{a}}=\frac{1}{||{\mathbf {f}}_{i}^{a}||}\left( \cos ({\mathbf {f}}_{i}^{a},{\mathbf {f}}_{i}^{b})\frac{{\mathbf {f}}_{i}^{a}}{||{\mathbf {f}}_{i}^{a}||}-\frac{{\mathbf {f}}_{i}^{b}}{||{\mathbf {f}}_{i}^{b}||} \right) \end{aligned}$$

(3)

$$\begin{aligned} \frac{\partial {{L}_{c}}}{\partial {\mathbf {f}}_{i}^{b}}=\frac{1}{||{\mathbf {f}}_{i}^{b}||}\left( \cos ({\mathbf {f}}_{i}^{a},{\mathbf {f}}_{i}^{b})\frac{{\mathbf {f}}_{i}^{b}}{||{\mathbf {f}}_{i}^{b}||}-\frac{{\mathbf {f}}_{i}^{a}}{||{\mathbf {f}}_{i}^{a}||} \right) \end{aligned}$$

(4)

In (3) and (4), $\cos ({\mathbf {f}}_{i}^{a},{\mathbf {f}}_{i}^{b})$, $\frac{{\mathbf {f}}_{i}^{a}}{||{\mathbf {f}}_{i}^{a}||}$ and $\frac{{\mathbf {f}}_{i}^{b}}{||{\mathbf {f}}_{i}^{b}||}$ can be computed once in the forward pass and reused in back-propagation for efficiency. Note that our added pairwise cosine loss layer has no parameters.
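One way to sanity-check Eq. (3) is to compare it with a finite-difference estimate. The sketch below does this in plain Python for a single pair; the function names, the test vectors and the step size `eps` are all assumptions chosen for illustration.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def forward(fa, fb):
    """Forward pass: returns cos(fa, fb), the unit vectors and ||fa||."""
    na, nb = norm(fa), norm(fb)
    ua = [x / na for x in fa]
    ub = [x / nb for x in fb]
    c = sum(p * q for p, q in zip(ua, ub))
    return c, ua, ub, na

def grad_wrt_a(fa, fb):
    """Eq. (3) for one pair, reusing the quantities from the forward pass."""
    c, ua, ub, na = forward(fa, fb)
    return [(c * p - q) / na for p, q in zip(ua, ub)]

def numeric_grad_wrt_a(fa, fb, eps=1e-6):
    """Central finite differences on the per-pair loss 1 - cos(fa, fb)."""
    grads = []
    for k in range(len(fa)):
        plus = list(fa)
        minus = list(fa)
        plus[k] += eps
        minus[k] -= eps
        loss_plus = 1.0 - forward(plus, fb)[0]
        loss_minus = 1.0 - forward(minus, fb)[0]
        grads.append((loss_plus - loss_minus) / (2.0 * eps))
    return grads

fa, fb = [1.0, 2.0, -0.5], [0.3, 1.5, 0.7]
analytic = grad_wrt_a(fa, fb)
numeric = numeric_grad_wrt_a(fa, fb)
max_diff = max(abs(x - y) for x, y in zip(analytic, numeric))
radial = sum(g * x for g, x in zip(analytic, fa))
print(max_diff, radial)
```

The quantity `radial` is the inner product of the gradient with the feature itself. It vanishes up to rounding, which reflects that the gradient in Eq. (3) is orthogonal to the feature: the loss acts on the direction of the feature, not on its length. Eq. (4) follows by swapping the roles of the two features.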

3.2 Joint Optimization

If one wants to compact the intra-class features while keeping them separated, the softmax log-loss in (1) and the pairwise cosine loss in (2) should be combined. The joint objective function of the two losses is given as follows:

$$\begin{aligned} L={{L}_{s}}+\lambda {{L}_{c}} \end{aligned}$$

(5)

where the parameter $\lambda $ is used for balancing the two losses. The softmax log-loss can be considered as a special case when $\lambda =0$.

If we use only the softmax log-loss as supervision, the learned features contain large intra-class variations. On the other hand, if we supervise CNNs only by the pairwise cosine loss, the learned features degenerate to zeros or collapse onto lines, at which point the cosine loss is very small. Neither loss alone achieves discriminative feature learning, so it is necessary to combine them.
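The degenerate case described above can be illustrated with a small Python sketch. It assumes two hypothetical identities whose features have all collapsed onto one line: the pairwise cosine loss is then essentially zero, yet features of different identities are indistinguishable by cosine similarity. The helper `joint_loss` simply evaluates Eq. (5) from two precomputed loss values; all names and numbers are illustrative.

```python
import math

def cosine(fa, fb):
    dot = sum(a * b for a, b in zip(fa, fb))
    return dot / (math.sqrt(sum(a * a for a in fa)) *
                  math.sqrt(sum(b * b for b in fb)))

def pairwise_cosine_loss(pairs):
    return sum(1.0 - cosine(fa, fb) for fa, fb in pairs)

def joint_loss(softmax_loss, cosine_loss, lam):
    """Eq. (5): L = L_s + lambda * L_c, given the two loss values."""
    return softmax_loss + lam * cosine_loss

# Hypothetical collapsed features: identity A and identity B on one line.
pair_identity_a = ([1.0, 1.0], [3.0, 3.0])
pair_identity_b = ([2.0, 2.0], [0.5, 0.5])
lc_collapsed = pairwise_cosine_loss([pair_identity_a, pair_identity_b])

# Cross-identity similarity is also maximal, so the identities are confused.
cross_similarity = cosine(pair_identity_a[0], pair_identity_b[0])
print(lc_collapsed, cross_similarity)
```

With `lam = 0` the helper returns the softmax log-loss alone, matching the special case noted for Eq. (5).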

3.3 Architecture of the Designed SCNE

Our network is based on the Siamese network. Figure 2 briefly illustrates the architecture of the proposed network, in which the parameter-shared layers can be replaced by ImageNet pre-trained CNN layers. The network consists of two parameter-shared CNN streams, two modified FC layers and three losses. The features extracted by the network are used as descriptors and are directly supervised by two softmax losses and one pairwise cosine loss. The softmax losses are used for class prediction, and the pairwise cosine loss is used for compacting the intra-class variation. The high-level features ${\mathbf {f}}_{i}^{a}$ and ${\mathbf {f}}_{i}^{b}$ are merged in our added pairwise cosine loss layer, which has no parameters. The ImageNet pre-trained CNN model can be AlexNet [8], VGGNet [11] or ResNet [6]. In this paper, we take Res50Net as the baseline for comparison with the state of the art.

To fine-tune the network on different person re-ID datasets, we replace the final FC layer of the pre-trained Res50Net model with a $1\times 1\times 2048\times n$ dimensional FC layer, where $n$ is the number of identities in the training dataset. Given an input pair of intra-class images resized to $224\times 224$, the network predicts the identities of the two images and computes the pairwise cosine loss for them. The pairwise cosine loss layer is coupled with the last FC layer and affects the distribution of the learned features.

4 Experiments

4.1 Datasets and Preparation

The proposed model is tested on two large-scale person re-ID benchmarks, Market1501 [24] and CUHK03 [9].

The Market1501 dataset has 32668 images of 1501 identities. According to the dataset setting, 12936 images of 751 identities are used for training, and 19732 images of 750 identities plus distractors are used for testing. The images are cropped automatically by the deformable part model (DPM) [4] detector and are therefore close to a realistic setting. The evaluation follows the dataset's baseline setting.

The CUHK03 dataset consists of 13164 cropped images of 1467 identities collected on the CUHK campus. The bounding boxes detected by the DPM detector are closer to a realistic setting and are used in our experiments. Following the given setting, the dataset is partitioned into a training set of 1367 identities and a testing set of 100 identities. The experiments are repeated with 20 random splits. In evaluation, we randomly select 100 images from 100 identities under another camera as the gallery.

The training images are uniformly resized to $256\times 256$, and the mean image computed from all the training images is subtracted. To match the input of the Res50Net network, we crop the images to $224\times 224$. The training images are randomly mirrored horizontally. Each batch of training images is drawn at random, and for each image we sample online another image with the same label to compose an intra-class input pair.
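The online pairing step can be sketched as follows in Python. This is one plausible reading of the description, not the authors' code: the function names are hypothetical, images are represented only by their indices, and the fallback for identities with a single image (pairing the image with itself) is an assumption.

```python
import random
from collections import defaultdict

def build_label_index(labels):
    """Map each identity label to the indices of its training images."""
    index = defaultdict(list)
    for i, y in enumerate(labels):
        index[y].append(i)
    return index

def sample_positive_pairs(labels, batch_size, rng):
    """Draw a random batch of anchors, then a same-label partner for each."""
    index = build_label_index(labels)
    anchors = rng.sample(range(len(labels)), batch_size)
    pairs = []
    for a in anchors:
        candidates = [j for j in index[labels[a]] if j != a]
        # Assumption: an identity with only one image is paired with itself.
        b = rng.choice(candidates) if candidates else a
        pairs.append((a, b))
    return pairs

# Toy label list: three identities with 3, 2 and 3 images.
labels = [0, 0, 0, 1, 1, 2, 2, 2]
pairs = sample_positive_pairs(labels, batch_size=4, rng=random.Random(0))
print(pairs)
```

Every returned pair shares a label, so the batch contains only positive pairs, as the pairwise cosine loss requires.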

4.2 Implementation Setting

The MatConvNet package [18] is used for training and testing. The network is trained for 30 epochs with mini-batch stochastic gradient descent, using a batch size of 64 pairs. The learning rate is initialized to 0.01, set to 0.001 after 15 epochs, and set to 0.0001 for the final 5 epochs. The network has three objectives. The gradients produced by the individual objectives are computed separately and then summed with different weights: 0.5 for each of the gradients produced by the two softmax log-losses and 1 for the gradient produced by the pairwise cosine loss.
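The schedule and gradient weighting described above might be sketched like this in Python. The epoch boundaries are an interpretation of the text (epochs 1 to 15 at 0.01, epochs 16 to 25 at 0.001, epochs 26 to 30 at 0.0001), and the names `learning_rate` and `combine_gradients` are hypothetical.

```python
def learning_rate(epoch, total_epochs=30):
    """Step schedule for 1-based epochs, under the interpretation above."""
    if epoch <= 15:
        return 0.01
    if epoch <= total_epochs - 5:
        return 0.001
    return 0.0001

def combine_gradients(g_softmax_a, g_softmax_b, g_cosine,
                      w_softmax=0.5, w_cosine=1.0):
    """Weighted sum of the gradients from the three objectives."""
    return [w_softmax * (ga + gb) + w_cosine * gc
            for ga, gb, gc in zip(g_softmax_a, g_softmax_b, g_cosine)]

schedule = [learning_rate(e) for e in range(1, 31)]
combined = combine_gradients([1.0, 2.0], [3.0, 4.0], [0.5, -1.0])
print(schedule[0], schedule[15], schedule[25], combined)
```

In a real network the three gradients arrive at different layers and are accumulated by back-propagation; the flat lists here only illustrate the weighting.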

For testing, we activate only one stream of our fine-tuned model and extract features at the output just before the FC layer. Given an input image of size $224\times 224$, we feed the image forward through the network and take the output of the ‘pool5’ layer of Res50Net as its descriptor. Once the descriptors for the query and gallery sets are obtained, we sort the cosine distances between the two sets to get the final result. The mean average precision (mAP) and rank-1 accuracy are used for evaluation.
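The ranking and scoring step can be sketched in Python as below. This is a simplified illustration with made-up 2D descriptors and hypothetical function names. It omits benchmark-specific rules such as excluding gallery images from the same camera as the query, so it should not be read as the official Market1501 or CUHK03 protocol.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def rank_gallery(query, gallery):
    """Gallery indices sorted from most to least similar to the query."""
    sims = [cosine(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: -sims[i])

def average_precision(ranking, gallery_ids, query_id):
    hits, precision_sum = 0, 0.0
    for position, idx in enumerate(ranking, start=1):
        if gallery_ids[idx] == query_id:
            hits += 1
            precision_sum += hits / position
    return precision_sum / hits if hits else 0.0

def evaluate(queries, query_ids, gallery, gallery_ids):
    """Return (rank-1 accuracy, mAP) over all queries."""
    rank1_hits, aps = 0, []
    for q, qid in zip(queries, query_ids):
        ranking = rank_gallery(q, gallery)
        if gallery_ids[ranking[0]] == qid:
            rank1_hits += 1
        aps.append(average_precision(ranking, gallery_ids, qid))
    return rank1_hits / len(queries), sum(aps) / len(aps)

# Made-up descriptors: identities 7 and 8, two gallery images each.
gallery = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
gallery_ids = [7, 7, 8, 8]
queries = [[1.0, 0.05], [0.2, 1.0], [1.0, 0.2]]
query_ids = [7, 8, 8]  # the third query is deliberately a hard case
rank1, mean_ap = evaluate(queries, query_ids, gallery, gallery_ids)
print(rank1, mean_ap)
```

Sorting by descending cosine similarity is equivalent to sorting by ascending cosine distance, so either convention yields the same ranking.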

4.3 Results on Market1501

On the Market1501 dataset, we compare our results with state-of-the-art algorithms, among which PersonNet [20], Verification-Classification [26], DeepTransfer [5], Gated Reid [16] and S-LSTM [17] are all based on the Siamese network and have achieved state-of-the-art performance. SOMAnet [2] uses synthetic data to train an Inception network, while GAN ResNet [27] uses generative adversarial networks (GANs) to generate unlabeled samples for learning better models. Both can be regarded as variants of data augmentation.

Table 1. Comparison with the state-of-the-art methods on Market1501.

The single query (SQ) and multiple query (MQ) results are reported in Table 1. Our SCNE achieves 83.25% rank-1 accuracy and 63.50% mAP under the single query mode and 88.42% rank-1 accuracy and 71.27% mAP under the multiple query mode, which is the second best among all the above results. It greatly outperforms the Gated Reid [16] and S-LSTM [17] methods, which used the Siamese network without combining a classification loss and a verification loss. Our method also outperforms Verification-Classification [26], which used a Euclidean loss for verification; such a loss is not well suited to comparison by cosine similarity. The best method is DeepTransfer [5], which adopted a specially designed dropout strategy to combine the classification loss and the verification loss on top of the GoogLeNet base network.

4.4 Results on CUHK03

On the CUHK03 dataset, there are two types of evaluations, single shot (SS) and multiple shots (MS).

In the single shot setting, we compare with ImprovedDeep [1], PersonNet [20], Verification-Classification [26], Pose Invariant [23], DNN-IM [13], SOMAnet [2], GAN ResNet [27], CNN-FRW-IC [7], DeepTransfer [5] and the ResNet baseline [25]. We randomly select 100 images from 100 identities under another camera as the gallery and report the mAP and rank-1 accuracy in Table 2. We achieve rank-1 accuracy = 85.1% and mAP = 83.3%, an excellent result compared with the above methods.

Table 2. Comparison with the state-of-the-art methods on CUHK03.

In the multiple shot setting, all the images from another camera are used as the gallery, and the number of candidate images is about 500. This evaluation is much closer to image retrieval and alleviates the instability caused by random gallery selection. We compare with S-LSTM [17], Gated Reid [16], Verification-Classification [26], SOMAnet [2] and GAN ResNet [27] on mAP and rank-1 accuracy. Our SCNE achieves rank-1 accuracy = 82.0% and mAP = 88.1%, which is also very competitive.

4.5 Parameter Sensitivity Analysis

As the parameter $\lambda $ controls the balance between the pairwise cosine loss and the softmax loss, it is essential to our SCNE. We therefore conduct experiments on the Market1501 dataset to investigate its influence. The results, reported in Fig. 3, show that a proper $\lambda $ achieves the best mAP and rank-1 accuracy; good performance is achieved when $\lambda =1$.

We also report in Fig. 4 how the performance of our SCNE changes as training progresses. The performance rises only slowly after 20 epochs.

5 Conclusion

In this paper, we propose a pairwise cosine loss to compact the distribution of the intra-class features and design the SCNE to learn discriminative features for person re-ID. Our SCNE is trained under the joint supervision of the softmax loss and the pairwise cosine loss. Compared to previous networks, we make the learned features not only separable but also compact. Experimental results show that our approach achieves state-of-the-art performance on the public Market1501 and CUHK03 person re-ID benchmarks. Since our SCNE is well suited to similarity comparison, we will apply it to identity retrieval in the future.

References

[1] Ahmed, E., Jones, M., Marks, T.K.: An improved deep learning architecture for person re-identification. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 3908–3916 (2015)
[2] Barbosa, I.B., Cristani, M., Caputo, B., Rognhaugen, A., Theoharis, T.: Looking beyond appearances: synthetic training data for deep CNNs in re-identification. CoRR abs:1701.03151 (2017)
[3] Chen, D., Yuan, Z., Chen, B., Zheng, N.: Similarity learning with spatial constraints for person re-identification. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 1268–1277 (2016)
[4] Felzenszwalb, P.F., Girshick, R.B., Mcallester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627 (2010)
[5] Geng, M., Wang, Y., Xiang, T., Tian, Y.: Deep transfer learning for person re-identification. CoRR abs:1611.05244 (2016)
[6] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2015)
[7] Jin, H., Wang, X., Liao, S., Li, S.Z.: Deep person re-identification with improved embedding and efficient training. CoRR abs:1705.03332 (2017)
[8] Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
[9] Li, W., Zhao, R., Xiao, T., Wang, X.: Deepreid: deep filter pairing neural network for person re-identification. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 152–159 (2014)
[10] Liu, H., Feng, J., Qi, M., Jiang, J., Yan, S.: End-to-end comparative attention networks for person re-identification. IEEE Trans. Image Process. 26(7), 3492–3506 (2016)
[11] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR abs:1409.1556 (2014)
[12] Su, C., Zhang, S., Xing, J., Gao, W., Tian, Q.: Deep attributes driven multi-camera person re-identification. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 475–491. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46475-6_30
[13] Subramaniam, A., Chatterjee, M., Mittal, A.: Deep neural networks with inexact matching for person re-identification. In: Advances in Neural Information Processing Systems, pp. 2667–2675 (2016)
[14] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)
[15] Ustinova, E., Ganin, Y., Lempitsky, V.: Multiregion bilinear convolutional neural networks for person re-identification. CoRR abs:1512.05300 (2015)
[16] Varior, R.R., Haloi, M., Wang, G.: Gated siamese convolutional neural network architecture for human re-identification. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 791–808. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8_48
[17] Varior, R.R., Shuai, B., Lu, J., Xu, D., Wang, G.: A siamese long short-term memory architecture for human re-identification. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9911, pp. 135–153. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46478-7_9
[18] Vedaldi, A., Lenc, K.: Matconvnet: convolutional neural networks for matlab. In: ACM International Conference on Multimedia, pp. 689–692 (2015)
[19] Wen, Y., Zhang, K., Li, Z., Qiao, Y.: A discriminative feature learning approach for deep face recognition. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9911, pp. 499–515. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46478-7_31
[20] Wu, L., Shen, C., Hengel, A.V.D.: Personnet: person re-identification with deep convolutional neural networks. CoRR abs:1601.07255 (2016)
[21] Wu, L., Shen, C., Hengel, A.V.D.: Deep linear discriminant analysis on fisher networks: a hybrid architecture for person re-identification. Pattern Recogn. 65, 238–250 (2017)
[22] Yi, D., Lei, Z., Li, S.Z.: Deep metric learning for practical person re-identification. In: International Conference on Pattern Recognition, pp. 34–39 (2014)
[23] Zheng, L., Huang, Y., Lu, H., Yang, Y.: Pose invariant embedding for deep person re-identification. CoRR abs:1701.07732 (2017)
[24] Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., Tian, Q.: Scalable person re-identification: a benchmark. In: IEEE International Conference on Computer Vision, pp. 1116–1124 (2015)
[25] Zheng, L., Yang, Y., Hauptmann, A.G.: Person re-identification: past, present and future. CoRR abs:1610.02984 (2016)
[26] Zheng, Z., Zheng, L., Yang, Y.: A discriminatively learned cnn embedding for person re-identification. CoRR abs:1611.05666 (2016)
[27] Zheng, Z., Zheng, L., Yang, Y.: Unlabeled samples generated by gan improve the person re-identification baseline in vitro. CoRR abs:1701.07717 (2017)

Acknowledgment

This work has been supported by the National Natural Science Foundation of China (61402519), and partially supported by the Natural Science Foundation of Jiangsu Province (BK20150721).

Author information

Authors and Affiliations

College of Command Information Systems, PLA Army Engineering University, Nanjing, 210007, China
Jiabao Wang, Yang Li & Zhuang Miao

Corresponding author

Correspondence to Jiabao Wang.

Editor information

Editors and Affiliations

Civil Aviation University of China, Tianjin, China
Jinfeng Yang
Tianjin University, Tianjin, China
Qinghua Hu
Nankai University, Tianjin, China
Ming-Ming Cheng
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Liang Wang
Nanjing University of Information Science and Technology, Nanjing, China
Qingshan Liu
Huazhong University of Science and Technology, Wuhan, China
Xiang Bai
Xi’an Jiaotong University, Xi’an, China
Deyu Meng

About this paper

Cite this paper

Wang, J., Li, Y., Miao, Z. (2017). Siamese Cosine Network Embedding for Person Re-identification. In: Yang, J., et al. Computer Vision. CCCV 2017. Communications in Computer and Information Science, vol 773. Springer, Singapore. https://doi.org/10.1007/978-981-10-7305-2_31

DOI: https://doi.org/10.1007/978-981-10-7305-2_31
Published: 08 December 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7304-5
Online ISBN: 978-981-10-7305-2
eBook Packages: Computer Science, Computer Science (R0)
