Abstract
In this paper, we aim to improve the performance, time complexity, and energy efficiency of deep convolutional neural networks (CNNs) by combining hardware and specialization techniques. Since the pooling step contributes significantly to CNN performance, we propose the Mode-Fisher pooling method. This form of pooling can offer very promising results in terms of feature extraction performance. The proposed method significantly reduces data movement in the CNN and saves up to 10% of total energy, without any performance penalty.
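To make the data-movement point concrete, the short sketch below counts the activations that a standard 2×2, stride-2 pooling layer removes from a convolutional feature map. It is only an illustration of why pooling reduces memory traffic, not the paper's Mode-Fisher implementation; the helper name pooled_shape and the 32×32×64 feature-map size are our own assumptions.

```python
def pooled_shape(h, w, c, k=2, s=2):
    """Output shape of a k x k pooling window with stride s (no padding)."""
    return ((h - k) // s + 1, (w - k) // s + 1, c)

# Hypothetical conv feature map (e.g., a CIFAR-10-sized input with 64 channels).
h, w, c = 32, 32, 64
ph, pw, pc = pooled_shape(h, w, c)
before = h * w * c    # activations written by the conv layer
after = ph * pw * pc  # activations the next layer must read back
print(f"{before} -> {after} activations ({before // after}x less data to move)")
```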


















Notes
A computer file that contains an uncompressed image; it is not directly viewable by most computer systems.
In the literature, the most commonly used filters do not exceed \(5 \times 5\) in size.
Max pooling: \(y = \max(x_{ij})\).
Average pooling: \(y = \operatorname{mean}(x_{ij})\), where \(y\) is the output and \(i\) and \(j\) are the row and column indices of the pooling region (see the sketch after these notes).
All data sets are summarized in Table 1.
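For concreteness, here is a minimal NumPy sketch of the two pooling operators defined in the notes above, written as one routine parameterized by the reduction op. This is a generic illustration assuming a 2-D feature map with non-overlapping windows; it is not the paper's Mode-Fisher operator, and the name pool2d is our own.

```python
import numpy as np

def pool2d(x, k=2, s=2, op=np.max):
    """Apply `op` (np.max or np.mean) over k x k regions of x with stride s."""
    h, w = x.shape
    out_h, out_w = (h - k) // s + 1, (w - k) // s + 1
    y = np.empty((out_h, out_w), dtype=float)
    for i in range(out_h):          # i and j index the pooling regions,
        for j in range(out_w):      # matching the notation in the notes
            y[i, j] = op(x[i * s:i * s + k, j * s:j * s + k])
    return y

x = np.arange(16, dtype=float).reshape(4, 4)
print(pool2d(x, op=np.max))   # max pooling:     y = max(x_ij)
print(pool2d(x, op=np.mean))  # average pooling: y = mean(x_ij)
```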
Acknowledgements
We thank the anonymous reviewers for their very useful comments and suggestions.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Mansouri, D.E.K., Kaddar, B., Benkabou, SE. et al. The Mode-Fisher pooling for time complexity optimization in deep convolutional neural networks. Neural Comput & Applic 33, 6443–6465 (2021). https://doi.org/10.1007/s00521-020-05406-4