Fusing Appearance Features and Correlation Features for Face Video Retrieval

Jing, Chenchen; Dong, Zhen; Pei, Mingtao; Jia, Yunde

doi:10.1007/978-3-319-77383-4_15

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10736)

Included in the following conference series:

Pacific Rim Conference on Multimedia

Abstract

Face video retrieval has drawn considerable research attention recently. Most prior work has relied on either appearance features or correlation features alone, which can limit retrieval performance. In this paper, we fuse appearance features and correlation features via a deep convolutional neural network to exploit the rich information in face videos for retrieval. The network extracts an appearance feature from a frame and a correlation feature from the covariance matrix of a face video, then fuses them into a comprehensive video representation. The fused feature is projected to a low-dimensional Hamming space via hash functions for the retrieval task. The network integrates feature extraction, feature fusion, and hash learning into a unified optimization framework to ensure that the appearance features and correlation features are optimally compatible. Experiments on two challenging TV-Series datasets demonstrate the effectiveness of the proposed method.
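The pipeline summarized in the abstract (covariance of a face video, fusion with an appearance feature, projection to a Hamming space, retrieval by Hamming distance) can be illustrated with a minimal sketch. This is not the authors' network: the abstract does not specify the architecture, so the sketch assumes raw per-frame vectors in place of CNN features, fusion by plain concatenation, and random hyperplanes in place of the learned hash functions. All function names (`covariance_matrix`, `fuse`, `make_hash`, `hash_code`, `hamming`, `encode_video`) are hypothetical.

```python
import random


def covariance_matrix(frames):
    """Sample covariance of per-frame feature vectors (one row per frame)."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    centered = [[f[j] - mean[j] for j in range(d)] for f in frames]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def fuse(appearance, covariance):
    """Assumed fusion: concatenate the appearance vector with the upper
    triangle of the (symmetric) covariance matrix."""
    d = len(covariance)
    correlation = [covariance[i][j] for i in range(d) for j in range(i, d)]
    return list(appearance) + correlation


def make_hash(input_dim, n_bits, seed=0):
    """Random hyperplanes, a stand-in for the hash functions the paper learns."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(input_dim)]
            for _ in range(n_bits)]


def hash_code(feature, planes):
    """One bit per hyperplane: 1 if the feature lies on its non-negative side."""
    return [1 if sum(w * x for w, x in zip(row, feature)) >= 0 else 0
            for row in planes]


def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return sum(x != y for x, y in zip(a, b))


def encode_video(frames, planes):
    """Fuse the first frame (assumed appearance input) with the video
    covariance, then binarize the fused vector."""
    fused = fuse(frames[0], covariance_matrix(frames))
    return hash_code(fused, planes)


# Toy "videos": three frames each, three-dimensional frame features.
video_a = [[1.0, 0.0, 2.0], [1.2, 0.1, 1.9], [0.9, -0.1, 2.1]]
video_b = [[-1.0, 3.0, 0.0], [-1.1, 2.8, 0.2], [-0.9, 3.1, -0.1]]

fused_dim = 3 + 3 * (3 + 1) // 2  # appearance dims + upper-triangle entries
planes = make_hash(fused_dim, n_bits=16)
code_a = encode_video(video_a, planes)
code_b = encode_video(video_b, planes)
print("distance a-a:", hamming(code_a, code_a))
print("distance a-b:", hamming(code_a, code_b))
```

Retrieval then amounts to ranking database videos by Hamming distance to the query code. In the paper the feature extractors, the fusion, and the hash functions are trained jointly, which a fixed random projection like the one above does not capture.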

Acknowledgments

This work was supported in part by the Natural Science Foundation of China (NSFC) under Grant No. 61472038 and No. 61375044.

Author information

Authors and Affiliations

Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing, 100081, People’s Republic of China
Chenchen Jing, Zhen Dong, Mingtao Pei & Yunde Jia

Corresponding author

Correspondence to Mingtao Pei.

Editor information

Editors and Affiliations

University of Electronic Science and Technology of China, Chengdu, China
Bing Zeng
University of Chinese Academy of Sciences, Beijing, China
Qingming Huang
University of Ottawa, Ottawa, Ontario, Canada
Abdulmotaleb El Saddik
University of Electronic Science and Technology of China, Chengdu, China
Hongliang Li
Chinese Academy of Sciences, Beijing, China
Shuqiang Jiang
Harbin Institute of Technology, Harbin, China
Xiaopeng Fan

About this paper

Cite this paper

Jing, C., Dong, Z., Pei, M., Jia, Y. (2018). Fusing Appearance Features and Correlation Features for Face Video Retrieval. In: Zeng, B., Huang, Q., El Saddik, A., Li, H., Jiang, S., Fan, X. (eds) Advances in Multimedia Information Processing – PCM 2017. PCM 2017. Lecture Notes in Computer Science, vol 10736. Springer, Cham. https://doi.org/10.1007/978-3-319-77383-4_15

DOI: https://doi.org/10.1007/978-3-319-77383-4_15
Published: 10 May 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77382-7
Online ISBN: 978-3-319-77383-4
eBook Packages: Computer Science, Computer Science (R0)
