
Information Theory-Based Curriculum Learning Factory to Optimize Training

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12046)

Abstract

We present a new system that optimizes feature extraction from 2D-topological data, such as images, in the context of deep learning by exploiting correlation among training samples and curriculum learning optimization (CLO). The system treats every sample as a 2D random variable, where each pixel in the sample is modelled as an independent and identically distributed (i.i.d.) realization. With this modelling, we use information-theoretic and statistical measures of random variables to rank individual training samples and the relationships between samples, and thus to construct a syllabus. The rank of each sample then determines when the sample is fed to the network during training. Comparative evaluation of multiple state-of-the-art networks, including ResNet, GoogleNet, and VGG, on benchmark datasets demonstrates that a syllabus ranking samples by measures such as the joint entropy between adjacent samples can improve learning and significantly reduce the number of training steps required to reach a desired training accuracy. We present results indicating that our approach produces robust feature maps that, in turn, reduce loss by factors of up to 9 compared with conventional, no-curriculum training.
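The ranking idea in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `entropy`, `joint_entropy`, and `build_syllabus` are hypothetical helper names, pixels are treated as i.i.d. draws from each image's intensity histogram, and ordering samples by ascending entropy is just one plausible single-measure syllabus among those the paper evaluates.

```python
import numpy as np

def entropy(img, bins=256):
    # Shannon entropy (bits) of the image's pixel-intensity histogram,
    # treating each pixel as an i.i.d. realization of one random variable.
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins so log2 is defined
    return -np.sum(p * np.log2(p))

def joint_entropy(img_a, img_b, bins=256):
    # Joint entropy (bits) of two same-sized images, estimated from
    # their 2D joint intensity histogram.
    hist2d, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(),
                                  bins=bins, range=[[0, 256], [0, 256]])
    p = hist2d / hist2d.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def build_syllabus(samples):
    # Rank training samples by a single measure (here: ascending entropy,
    # i.e. "simpler" images first) to define the feeding order.
    order = np.argsort([entropy(s) for s in samples])
    return [samples[i] for i in order]
```

A pairwise measure such as `joint_entropy` could instead rank samples by their relationship to the previously fed sample; the code above only shows the single-sample case end to end.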




Corresponding author

Correspondence to Henok Ghebrechristos.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Ghebrechristos, H., Alaghband, G. (2020). Information Theory-Based Curriculum Learning Factory to Optimize Training. In: Palaiahnakote, S., Sanniti di Baja, G., Wang, L., Yan, W. (eds.) Pattern Recognition. ACPR 2019. Lecture Notes in Computer Science, vol. 12046. Springer, Cham. https://doi.org/10.1007/978-3-030-41404-7_29


  • DOI: https://doi.org/10.1007/978-3-030-41404-7_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-41403-0

  • Online ISBN: 978-3-030-41404-7

  • eBook Packages: Computer Science (R0)
