Abstract
In recent years, deep learning has achieved increasingly impressive results. However, the large number of parameters in deep networks often causes overfitting during training. Hinton et al. [17] proposed dropout to address this problem in 2012. In our research, we find that there is a trade-off between generalization and accuracy: an appropriate dropout rate increases generalization, reduces overfitting, and thus improves accuracy, but too much generalization can lead to relatively low accuracy. We therefore propose a continuous dropout-rate strategy in which the dropout rate is gradually decreased during training instead of being held constant. In this way, we obtain high generalization at the beginning of training and high accuracy at the end. Experimental results show that the proposed strategy achieves higher accuracy than traditional dropout.
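As a concrete illustration of the strategy sketched in the abstract, the following minimal NumPy example anneals the dropout rate linearly from 0.5 down to 0 over the course of training. The linear schedule, the starting rate of 0.5, and all function names are illustrative assumptions; the abstract specifies only that the rate is gradually decreased rather than held constant.

```python
import numpy as np

def dropout_rate(epoch, num_epochs, p_start=0.5, p_end=0.0):
    """Linearly decay the dropout rate from p_start to p_end over training.

    A linear schedule and p_start = 0.5 are assumptions for illustration;
    the paper only states that the rate decreases gradually.
    """
    t = epoch / max(num_epochs - 1, 1)
    return p_start + t * (p_end - p_start)

def dropout_forward(x, p, rng):
    """Inverted dropout: zero units with probability p, rescale survivors."""
    if p <= 0.0:
        return x  # no dropout once the rate has decayed to zero
    mask = (rng.random(x.shape) >= p) / (1.0 - p)
    return x * mask

# Early epochs use a high rate (strong regularization, better generalization);
# late epochs approach p = 0 (full network capacity, higher final accuracy).
rng = np.random.default_rng(0)
for epoch in range(10):
    p = dropout_rate(epoch, num_epochs=10)
    hidden = dropout_forward(np.ones((2, 4)), p, rng)
```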
References
Al Rahhal, M.M., Bazi, Y., AlHichri, H., et al.: Deep learning approach for active classification of electrocardiogram signals. Inf. Sci. 345, 340–354 (2016)
Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 818–833. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_53
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587 (2014)
Shin, H.C., Roth, H.R., Gao, M., et al.: Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans. Med. Imaging 35(5), 1285–1298 (2016)
Hong, R., Hu, Z., Wang, R., Wang, M., Tao, D.: Multi-view object retrieval via multi-scale topic models. IEEE Trans. Image Process. 25(12), 5814–5827 (2016)
Farabet, C., Couprie, C., Najman, L., LeCun, Y.: Learning hierarchical features for scene labeling. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1915–1929 (2013)
Chen, L.C., Papandreou, G., Kokkinos, I., et al.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. arXiv preprint arXiv:1606.00915 (2016)
Hong, R., Yang, Y., Wang, M., Hua, X.-S.: Learning visual semantic relationships for efficient visual retrieval. IEEE Trans. Big Data 1(4), 152–161 (2015)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
He, K., Zhang, X., Ren, S., et al.: Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385 (2015)
Hochreiter, S., Bengio, Y., Frasconi, P., Schmidhuber, J.: Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. In: Kremer, S.C., Kolen, J.F. (eds.) A Field Guide to Dynamical Recurrent Neural Networks. IEEE Press (2001)
Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML-2010), pp. 807–814 (2010)
Hawkins, D.M.: The problem of overfitting. J. Chem. Inf. Comput. Sci. 44(1), 1–12 (2004)
Caruana, R., Lawrence, S., Giles, C.L.: Overfitting in neural nets: backpropagation, conjugate gradient, and early stopping. In: Advances in Neural Information Processing Systems, pp. 402–408 (2001)
Krogh, A., Hertz, J.A.: A simple weight decay can improve generalization. In: Advances in Neural Information Processing Systems, pp. 950–957 (1992)
Hinton, G.E., Srivastava, N., Krizhevsky, A., et al.: Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580 (2012)
Hinton, G.E., Van Camp, D.: Keeping the neural networks simple by minimizing the description length of the weights. In: Proceedings of the Sixth Annual Conference on Computational Learning Theory, pp. 5–13. ACM (1993)
Ye, C., Zhao, C., Yang, Y., et al.: LightNet: a versatile, standalone Matlab-based environment for deep learning. arXiv preprint arXiv:1605.02766 (2016)
Srivastava, N., Hinton, G., Krizhevsky, A., et al.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
Acknowledgment
This work was supported in part by the National Natural Science Foundation of China (Grant No. 61473444). This grant mainly focuses on multimedia and machine learning.
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Fei, J., Rui, T., Song, X., Zhou, Y., Zhang, S. (2018). Continuous Dropout Strategy for Deep Learning Network. In: Huet, B., Nie, L., Hong, R. (eds) Internet Multimedia Computing and Service. ICIMCS 2017. Communications in Computer and Information Science, vol 819. Springer, Singapore. https://doi.org/10.1007/978-981-10-8530-7_26
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8529-1
Online ISBN: 978-981-10-8530-7
eBook Packages: Computer Science (R0)