On the Large-Scale Transferability of Convolutional Neural Networks

Zheng, Liang; Zhao, Yali; Wang, Shengjin; Wang, Jingdong; Yang, Yi; Tian, Qi

doi:10.1007/978-3-030-04503-6_3

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11154)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Given the strong performance of Convolutional Neural Networks (CNNs) in computer vision and machine learning, this paper investigates how to transfer CNN descriptors effectively to generic and fine-grained classification at a large scale. Our contribution is a set of simple yet effective methods for constructing a competitive baseline recognition system. Specifically, we study two factors in CNN transfer. (1) We demonstrate the advantage of feeding the CNN images of a properly large size instead of images resized in the conventional way. (2) We benchmark the performance of different CNN layers when average/max pooling is applied to their feature maps. Our evaluation confirms that the Conv5 descriptor yields very competitive accuracy under such a pooling strategy. Following these good practices, we obtain improved performance on seven image classification benchmarks.
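The pooling step mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `global_pool`, the nested-list layout (channels x height x width), and the toy activation values are all assumptions made for illustration. The sketch only shows why average or max pooling over each feature map yields a descriptor whose length equals the number of channels, whatever the spatial size of the input.

```python
from typing import List

# Assumed layout: feature_maps[c][y][x], i.e. channels x height x width.
FeatureMaps = List[List[List[float]]]


def global_pool(feature_maps: FeatureMaps, mode: str = "avg") -> List[float]:
    """Collapse each channel's H x W activation map into a single value."""
    if mode not in ("avg", "max"):
        raise ValueError("mode must be 'avg' or 'max'")
    descriptor = []
    for channel in feature_maps:
        # Flatten the spatial grid of this channel.
        values = [v for row in channel for v in row]
        if mode == "max":
            descriptor.append(max(values))
        else:
            descriptor.append(sum(values) / len(values))
    return descriptor


# Hypothetical activations: two channels at two different spatial sizes,
# standing in for conv-layer outputs of a smaller and a larger input image.
small = [[[1.0, 3.0], [5.0, 7.0]],
         [[0.0, 2.0], [2.0, 0.0]]]
large = [[[1.0, 3.0, 2.0], [5.0, 7.0, 0.0]],
         [[0.0, 2.0, 4.0], [2.0, 0.0, 4.0]]]

avg_small = global_pool(small, "avg")
max_small = global_pool(small, "max")
avg_large = global_pool(large, "avg")

# Both descriptors have one entry per channel, regardless of H and W.
print(len(avg_small), len(avg_large))
```

Because the descriptor length depends only on the channel count, a convolutional layer can in principle be applied to inputs larger than the network's usual fixed size and still produce a fixed-length vector for a downstream classifier.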

Acknowledgement

This work was supported in part by ARO grant W911NF-15-1-0290 to Dr. Qi Tian and by Faculty Research Gift Awards from NEC Laboratories of America and Blippar. It was also supported in part by the National Science Foundation of China (NSFC) under grant 61429201.

Author information

Authors and Affiliations

Singapore University of Technology and Design, Singapore, Singapore
Liang Zheng
Tsinghua University, Beijing, China
Yali Zhao & Shengjin Wang
Microsoft Research, Beijing, China
Jingdong Wang
University of Technology Sydney, Sydney, Australia
Yi Yang
UTSA, San Antonio, USA
Qi Tian

Corresponding author

Correspondence to Liang Zheng.

Editor information

Editors and Affiliations

University of Melbourne, Melbourne, VIC, Australia
Mohadeseh Ganji
University of Melbourne, Melbourne, VIC, Australia
Lida Rashidi
McGill University, Montreal, QC, Canada
Benjamin C. M. Fung
Griffith University, Gold Coast, QLD, Australia
Can Wang

About this paper

Cite this paper

Zheng, L., Zhao, Y., Wang, S., Wang, J., Yang, Y., Tian, Q. (2018). On the Large-Scale Transferability of Convolutional Neural Networks. In: Ganji, M., Rashidi, L., Fung, B., Wang, C. (eds) Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science, vol 11154. Springer, Cham. https://doi.org/10.1007/978-3-030-04503-6_3

DOI: https://doi.org/10.1007/978-3-030-04503-6_3
Published: 21 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04502-9
Online ISBN: 978-3-030-04503-6
eBook Packages: Computer Science, Computer Science (R0)
