Bi-linearly weighted fractional max pooling

Hang, Siang Thye; Aono, Masaki

doi:10.1007/s11042-017-4840-5

Bi-linearly weighted fractional max pooling

An extension to conventional max pooling for deep convolutional neural network

Published: 31 May 2017

Volume 76, pages 22095–22117, (2017)
Cite this article

Multimedia Tools and Applications Aims and scope Submit manuscript

454 Accesses
18 Citations
Explore all metrics

Abstract

In this paper, we propose to extend the flexibility of the commonly used 2 × 2 non-overlapping max pooling for Convolutional Neural Network. We name it as Bi-linearly Weighted Fractional Max-Pooling. This proposed method enables max pooling operation below stride size 2, and is computed based on four bi-linearly weighted neighboring input activations. Currently, in a 2 × 2 non-overlapping max pooling operation, as spatial size is halved in both x and y directions, three-quarter of activations in the feature maps are discarded. As such reduction is too abrupt, amount of said pooling operation within a Convolutional Neural Network is very limited: further increasing the number of pooling operation results in too little activation left for subsequent operations. Using our proposed pooling method, spatial size reduction can be more gradual and can be adjusted flexibly. We applied a few combinations of our proposed pooling method into 50-layered ResNet and 19-layered VGGNet with reduced number of filters, and experimented on FGVC-Aircraft, Oxford-IIIT Pet, STL-10 and CIFAR-100 datasets. Even with reduced memory usage, our proposed methods showed reasonable improvement in classification accuracy with 50-layered ResNet. Additionally, with flexibility of our proposed pooling method, we change the reduction rate dynamically every training iteration, and our evaluation results indicated potential regularization effect.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Enhancing the accuracies by performing pooling decisions adjacent to the output layer

Article Open access 31 August 2023

A improved pooling method for convolutional neural networks

Article Open access 18 January 2024

Introducing Region Pooling Learning

References

Coates A, Lee H, Ng AY (2011) An analysis of single-layer networks in unsupervised feature learning. In: AISTATS 2011, vol 1001
Glorot X, Bengio Y (2010) Understanding the difficulty of training deep feedforward neural networks. In: AISTATS, vol 9, pp 249–256
Graham B (2014) Fractional max-pooling. arXiv:1412.6071
Han D, Kim J, Kim J (2016) Deep pyramidal residual networks. arXiv:1610.02915
He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In: IEEE international conference on computer vision (ICCV) 2015
Google Scholar
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR)
Google Scholar
Ioffe S, Szegedy C (2015) Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: Proceedings of the 32nd international conference on machine learning, ICML 2015, lille, France, 6-11 July 2015, pp 448–456
Google Scholar
Jia Y, Shelhamer E, Donahue J, Karayev S, Long J, Girshick R, Guadarrama S, Darrell T (2014) Caffe: Convolutional architecture for fast feature embedding. In: Proceedings of the 22nd ACM international conference on multimedia. ACM, pp 675–678
Krizhevsky A, Hinton G (2009) Learning multiple layers of features from tiny images
Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp 1097–1105
Google Scholar
Maji S, Rahtu E, Kannala J, Blaschko M, Vedaldi A (2013) Fine-grained visual classification of aircraft. arXiv:1306.5151
Parkhi OM, Vedaldi A, Zisserman A, Jawahar C (2012) Cats and dogs. In: 2012 IEEE conference on computer vision and pattern recognition (CVPR). IEEE, pp 3498–3505
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis (IJCV) 115(3):211–252. doi:10.1007/s11263-015-0816-y
Article MathSciNet Google Scholar
Sermanet P, Eigen D, Zhang X, Mathieu M, Fergus R, LeCun Y (2013) Overfeat: Integrated recognition, localization and detection using convolutional networks. In: International conference on learning representations (ICLR 2014). CBLS, p 16
Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: International conference on learning representations
Google Scholar
Srivastava N, Hinton GE, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958
MathSciNet MATH Google Scholar
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1–9
Google Scholar
Xu B, Wang N, Chen T, Li M (2015) Empirical evaluation of rectified activations in convolutional network. In: Deep learning workshop, ICML 2015
Google Scholar
Zeiler M, Fergus R (2013) Stochastic pooling for regularization of deep convolutional neural networks. In: Proceedings of the international conference on learning representation (ICLR)
Google Scholar
Zeiler MD, Fergus R (2014) Visualizing and understanding convolutional networks. In: European conference on computer vision. Springer International Publishing, pp 818–833

Download references

Acknowledgements

We would like to thank MEXT KAKENHI, Grant-in-Aid for Challenging Exploratory Research, Grant Number 15K12027 for partial support of our work.

Author information

Authors and Affiliations

Toyohashi University of Technology, Toyohashi, Japan
Siang Thye Hang & Masaki Aono

Authors

Siang Thye Hang
View author publications
You can also search for this author in PubMed Google Scholar
Masaki Aono
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Siang Thye Hang.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Hang, S.T., Aono, M. Bi-linearly weighted fractional max pooling. Multimed Tools Appl 76, 22095–22117 (2017). https://doi.org/10.1007/s11042-017-4840-5

Download citation

Received: 21 September 2016
Revised: 28 April 2017
Accepted: 17 May 2017
Published: 31 May 2017
Issue Date: November 2017
DOI: https://doi.org/10.1007/s11042-017-4840-5

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Bi-linearly weighted fractional max pooling

Abstract

Access this article

Similar content being viewed by others

Enhancing the accuracies by performing pooling decisions adjacent to the output layer

A improved pooling method for convolutional neural networks

Introducing Region Pooling Learning

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Bi-linearly weighted fractional max pooling

Abstract

Access this article

Similar content being viewed by others

Enhancing the accuracies by performing pooling decisions adjacent to the output layer

A improved pooling method for convolutional neural networks

Introducing Region Pooling Learning

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation