Video-Based Person Re-identification by 3D Convolutional Neural Networks and Improved Parameter Learning

Kato, Naoki; Hakozaki, Kohei; Tanabiki, Masamoto; Furuyama, Junko; Sato, Yuji; Aoki, Yoshimitsu

doi:10.1007/978-3-319-93000-8_18

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10882)

Included in the following conference series:

International Conference Image Analysis and Recognition

Abstract

In this paper we propose a novel approach to video-based person re-identification that uses convolutional neural networks to learn the similarity of persons observed by video cameras. We use 3-dimensional convolutional neural networks (3D CNNs) to extract fine-grained spatiotemporal features from the video sequence of a person. Unlike recurrent neural networks, a 3D CNN preserves the spatial patterns of the input, which works well for the re-identification problem. The network maps each video sequence of a person to a Euclidean space in which distances between feature embeddings correspond directly to measures of person similarity. With our improved parameter learning method, called entire triplet loss, all possible triplets in the mini-batch are taken into account when updating the network parameters. This parameter updating method significantly improves training and makes the embeddings more discriminative. Experimental results show that our model achieves a new state-of-the-art identification rate on the iLIDS-VID and PRID-2011 datasets, with rank-1 rates of 82.0% and 83.3%, respectively.
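
The idea of using every triplet in a mini-batch can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name, the squared Euclidean distance, the hinge form, the margin value, and the averaging over all valid triplets are assumptions made for illustration only, and the exact formulation in the paper may differ.

```python
import numpy as np


def batch_all_triplet_loss(embeddings, labels, margin=0.2):
    """Average hinge loss over every valid (anchor, positive, negative) triplet.

    embeddings: (N, D) array, one row per video sequence embedding.
    labels:     (N,) array of person identities.
    margin:     hypothetical margin, chosen here only for illustration.
    Returns (mean loss, number of valid triplets).
    """
    embeddings = np.asarray(embeddings, dtype=np.float64)
    labels = np.asarray(labels)

    # Pairwise squared Euclidean distances, shape (N, N).
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)

    same = labels[:, None] == labels[None, :]
    not_self = ~np.eye(len(labels), dtype=bool)
    pos_mask = same & not_self   # (anchor, positive): same identity, distinct samples
    neg_mask = ~same             # (anchor, negative): different identities

    # loss[a, p, n] = max(0, d(a, p) - d(a, n) + margin)
    loss = np.maximum(dist[:, :, None] - dist[:, None, :] + margin, 0.0)
    valid = pos_mask[:, :, None] & neg_mask[:, None, :]

    num_triplets = int(valid.sum())
    if num_triplets == 0:
        return 0.0, 0
    return float(loss[valid].sum() / num_triplets), num_triplets


if __name__ == "__main__":
    # Toy mini-batch: two identities with two embeddings each.
    emb = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [3.0, 1.0]]
    ids = [0, 0, 1, 1]
    print(batch_all_triplet_loss(emb, ids, margin=4.0))
```

Because the loss is computed over the full anchor-positive-negative tensor, no triplet sampling step is needed. The memory cost grows with the cube of the mini-batch size, so this sketch only suits small batches.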

Author information

Authors and Affiliations

Keio University, Tokyo, Japan
Naoki Kato, Kohei Hakozaki & Yoshimitsu Aoki
Panasonic Corporation, Osaka, Japan
Masamoto Tanabiki, Junko Furuyama & Yuji Sato

Corresponding author

Correspondence to Naoki Kato.

Editor information

Editors and Affiliations

University of Porto, Porto, Portugal
Aurélio Campilho
University of Waterloo, Waterloo, Ontario, Canada
Fakhri Karray
Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
Bart ter Haar Romeny

About this paper

Cite this paper

Kato, N., Hakozaki, K., Tanabiki, M., Furuyama, J., Sato, Y., Aoki, Y. (2018). Video-Based Person Re-identification by 3D Convolutional Neural Networks and Improved Parameter Learning. In: Campilho, A., Karray, F., ter Haar Romeny, B. (eds) Image Analysis and Recognition. ICIAR 2018. Lecture Notes in Computer Science, vol 10882. Springer, Cham. https://doi.org/10.1007/978-3-319-93000-8_18

DOI: https://doi.org/10.1007/978-3-319-93000-8_18
Published: 06 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-92999-6
Online ISBN: 978-3-319-93000-8
eBook Packages: Computer Science, Computer Science (R0)