Skip to main content

On the Large-Scale Transferability of Convolutional Neural Networks

  • Conference paper
  • First Online:
Trends and Applications in Knowledge Discovery and Data Mining (PAKDD 2018)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 11154))

Included in the following conference series:

  • 1289 Accesses

Abstract

Given the overwhelming performance of the Convolutional Neural Network (CNN) in the computer vision and machine learning community, this paper aims at investigating the effective transfer of the CNN descriptors in generic and fine-grained classification at a large scale. Our contribution consists in providing some simple yet effective methods in constructing a competitive baseline recognition system. Comprehensively, we study two facts in CNN transfer. (1) We demonstrate the advantage of using images with a properly large size as input to CNN instead of the conventionally resized one. (2) We benchmark the performance of different CNN layers improved by average/max pooling on the feature maps. Our evaluation and observation confirm that the Conv5 descriptor yields very competitive accuracy under such a pooling strategy. Following these good practices, we are capable of producing improved performance on seven image classification benchmarks.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: NIPS (2012)

    Google Scholar 

  2. Sharif Razavian, A., Sullivan, J., Maki, A., Carlsson, S.: A baseline for visual instance retrieval with deep convolutional networks. In: ICLR (2015)

    Google Scholar 

  3. Zheng, L., Wang, S., Liu, Z., Tian, Q.: Packing and padding: coupled multi-index for accurate image retrieval. In: CVPR (2014) 1947–1954

    Google Scholar 

  4. Zheng, L., Wang, S., Tian, Q.: Coupled binary embedding for large-scale image retrieval. IEEE Trans. Image Process. 23(8), 3368–3380 (2014)

    Article  MathSciNet  Google Scholar 

  5. Wu, L., Shen, C., van den Hengel, A.: Deep linear discriminant analysis on fisher networks: a hybrid architecture for person re-identification. Pattern Recognit. 65, 238–250 (2017)

    Article  Google Scholar 

  6. Wu, L., Wang, Y., Li, X., Gao, J.: What-and-where to match: deep spatially multiplicative integration networks for person re-identification. Pattern Recognit. 76, 727–738 (2018)

    Article  Google Scholar 

  7. Wu, L., Shen, C., van den Hengel, A.: Deep recurrent convolutional networks for video-based person re-identification: an end-to-end approach. arXiv preprint arXiv:1606.01609 (2016)

  8. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

  9. Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. IJCV 1–42 (2015)

    Google Scholar 

  10. He, K., Zhang, X., Ren, S., Sun, J.: Spatial pyramid pooling in deep convolutional networks for visual recognition. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8691, pp. 346–361. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10578-9_23

    Chapter  Google Scholar 

  11. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: CVPR (2014)

    Google Scholar 

  12. Razavian, A.S., Azizpour, H., Sullivan, J., Carlsson, S.: CNN features off-the-shelf: an astounding baseline for recognition. In: CVPR Workshops (2014)

    Google Scholar 

  13. Zheng, L., Wang, S., Wang, J., Tian, Q.: Accurate image search with multi-scale contextual evidences. Int. J. Comput. Vis. 120(1), 1–13 (2016)

    Article  MathSciNet  Google Scholar 

  14. Babenko, A., Slesarev, A., Chigorin, A., Lempitsky, V.: Neural codes for image retrieval. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 584–599. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_38

    Chapter  Google Scholar 

  15. Zheng, L., Yang, Y., Tian, Q.: SIFT meets CNN: a decade survey of instance retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 40(5), 1224–1244 (2017)

    Article  Google Scholar 

  16. Mousavian, A., Kosecka, J.: Deep convolutional features for image based retrieval and scene categorization. arXiv preprint arXiv:1509.06033 (2015)

  17. Tolias, G., Sicre, R., Jégou, H.: Particular object retrieval with integral max-pooling of CNN activations. arXiv preprint arXiv:1511.05879 (2015)

  18. Fei-Fei, L., Fergus, R., Perona, P.: One-shot learning of object categories. IEEE Trans. Pattern Anal. Mach. Intell. 28(4), 594–611 (2006)

    Article  Google Scholar 

  19. Xu, Z., Yang, Y., Hauptmann, A.G.: A discriminative CNN video representation for event detection. In: CVPR (2015)

    Google Scholar 

  20. Jégou, H., Douze, M., Schmid, C.: Product quantization for nearest neighbor search. TPAMI 33(1), 117–128 (2011)

    Article  Google Scholar 

  21. Ng, J., Yang, F., Davis, L.: Exploiting local features from deep networks for image retrieval. In: CVPR Workshops (2015)

    Google Scholar 

  22. Szegedy, C., et al.: Going deeper with convolutions. arXiv preprint arXiv:1409.4842 (2014)

  23. Xie, L., Zheng, L., Wang, J., Yuille, A., Tian, Q.: Interactive: inter-layer activeness propagation. arXiv preprint arXiv:1605.00052 (2016)

  24. Zhang, N., Donahue, J., Girshick, R., Darrell, T.: Part-based R-CNNs for fine-grained category detection. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 834–849. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_54

    Chapter  Google Scholar 

  25. Xiao, J., Hays, J., Ehinger, K., Oliva, A., Torralba, A., et al.: Sun database: large-scale scene recognition from abbey to zoo. In: CVPR

    Google Scholar 

  26. Griffin, G., Holub, A., Perona, P.: Caltech-256 object category dataset (2007)

    Google Scholar 

  27. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The pascal visual object classes (VOC) challenge. Int. J. Comput. Vis. 88(2), 303–338 (2010)

    Article  Google Scholar 

  28. Quattoni, A., Torralba, A.: Recognizing indoor scenes. In: CVPR (2009)

    Google Scholar 

  29. Nilsback, M.E., Zisserman, A.: Automated flower classification over a large number of classes. In: Sixth Indian Conference on Computer Vision, Graphics & Image Processing (2008)

    Google Scholar 

  30. Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The caltech-UCSD birds-200-2011 dataset (2011)

    Google Scholar 

  31. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR (2009)

    Google Scholar 

  32. Jia, Y., et al.: Caffe: convolutional architecture for fast feature embedding. In: ACM Multimedia (2014)

    Google Scholar 

  33. Relja, A., Zisserman, A.: Three things everyone should know to improve object retrieval. In: CVPR (2012)

    Google Scholar 

  34. Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 818–833. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_53

    Chapter  Google Scholar 

  35. Lowe, D.G.: Distinctive image features from scale invariant keypoints. IJCV 60(2), 91–110 (2004)

    Article  MathSciNet  Google Scholar 

  36. Zhang, Y., Jia, Z., Chen, T.: Image retrieval with geometry-preserving visual phrases. In: CVPR (2011)

    Google Scholar 

  37. Zheng, L., Wang, S., Tian, L., He, F., Liu, Z., Tian, Q.: Query-adaptive late fusion for image search and person re-identification. In: CVPR (2015)

    Google Scholar 

Download references

Acknowledgement

This work was supported in part to Dr. Qi Tian by ARO grant W911NF-15-1-0290 and Faculty Research Gift Awards by NEC Laboratories of America and Blippar. This work was supported in part by National Science Foundation of China (NSFC) 61429201.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Liang Zheng .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Zheng, L., Zhao, Y., Wang, S., Wang, J., Yang, Y., Tian, Q. (2018). On the Large-Scale Transferability of Convolutional Neural Networks. In: Ganji, M., Rashidi, L., Fung, B., Wang, C. (eds) Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science(), vol 11154. Springer, Cham. https://doi.org/10.1007/978-3-030-04503-6_3

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-04503-6_3

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04502-9

  • Online ISBN: 978-3-030-04503-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics