
Neural Networks as Model Selection with Incremental MDL Normalization

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1072)

Abstract

If we consider the neural network optimization process as a model selection problem, the implicit space can be constrained by the normalizing factor, the minimum description length of the optimal universal code. Inspired by the adaptation phenomenon of biological neuronal firing, we propose a class of reparameterizations of the activations in neural networks that takes into account the statistical regularity of the implicit space under the Minimum Description Length (MDL) principle. We introduce an incremental method for computing this universal code as the normalized maximum likelihood and demonstrate its flexibility to incorporate data priors, such as top-down attention and other oracle information, as well as its compatibility with batch normalization and layer normalization. Empirical results show that the proposed method outperforms existing normalization methods on limited and imbalanced data from non-stationary distributions, benchmarked on computer vision and reinforcement learning tasks. As an unsupervised attention mechanism over the input data, this biologically plausible normalization has the potential to handle other complicated real-world scenarios as well as reinforcement learning settings where rewards are sparse and non-uniform. Further research is proposed to investigate these scenarios and explore the behavior of different variants.
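The page does not include an implementation, so the following is only a minimal sketch of the idea described above: a normalization layer that rescales activations by a factor accumulated incrementally over the data seen so far, standing in for the normalized-maximum-likelihood normalizer. The framework choice (PyTorch), the class name IncrementalMDLNorm, the per-feature Gaussian fit, and the exponential-moving-average update are all assumptions for illustration, not the paper's actual algorithm.

```python
import torch
import torch.nn as nn


class IncrementalMDLNorm(nn.Module):
    """Illustrative MDL-inspired normalization layer (not the paper's algorithm).

    Activations are divided by a per-feature factor that is accumulated
    incrementally over the batches seen so far, standing in for the
    normalized-maximum-likelihood normalizer described in the abstract.
    """

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        super().__init__()
        self.momentum = momentum
        self.eps = eps
        # Running normalizing factor accumulated over past data (assumed form).
        self.register_buffer("running_norm", torch.ones(num_features))

    def forward(self, x):
        # Per-feature maximum-likelihood fit of the current batch; with a
        # Gaussian model the fitted likelihood depends only on the variance.
        batch_scale = x.var(dim=0, unbiased=False) + self.eps
        if self.training:
            # Incremental update: fold the new batch into the accumulated
            # factor instead of recomputing it over all data from scratch.
            with torch.no_grad():
                self.running_norm.mul_(1.0 - self.momentum).add_(
                    self.momentum * batch_scale
                )
        # Rescale activations by the accumulated normalizing factor.
        return x / torch.sqrt(self.running_norm + self.eps)


# Usage: normalize a batch of 64-dimensional activations.
layer = IncrementalMDLNorm(64)
y = layer(torch.randn(32, 64))
```

The abstract states that the proposed normalization is compatible with batch normalization and layer normalization; a layer of this shape could slot in wherever those are used, but the exact way the accumulated factor is combined with the usual scale and shift terms follows the paper, not this sketch.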


Notes

  1. In continuous data streams or time-series analysis, the incremental step can be replaced by integrating over the region of the probability distribution X of the data that has been seen so far (a standard formulation is sketched after these notes).

  2. The raw data and code to reproduce the results can be downloaded at https://app.box.com/s/ruycgz8p7rh30taj38d8dkc0h1ptltg1.
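For reference, the incremental universal code in note 1 builds on the standard normalized maximum likelihood (NML) construction from the MDL literature. The sketch below writes out that standard definition and the continuous variant the note alludes to; the symbol for the observed region is our own label, not notation taken from the paper.

```latex
% Standard NML distribution: the maximum-likelihood score of the observed
% data, normalized by the total score over everything the model class could
% have produced (its parametric complexity).
P_{\mathrm{NML}}(x) = \frac{P\bigl(x \mid \hat{\theta}(x)\bigr)}
                           {\sum_{x' \in \mathcal{X}} P\bigl(x' \mid \hat{\theta}(x')\bigr)}

% Continuous data streams (note 1): the sum becomes an integral, restricted
% here to the region of the data distribution observed so far, written
% \mathcal{X}_{\mathrm{seen}} purely as an illustrative label.
\mathcal{C}_{\mathrm{seen}} = \int_{\mathcal{X}_{\mathrm{seen}}}
    P\bigl(x' \mid \hat{\theta}(x')\bigr)\, dx'
```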


Author information

Correspondence to Baihan Lin.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Lin, B. (2019). Neural Networks as Model Selection with Incremental MDL Normalization. In: Zeng, A., Pan, D., Hao, T., Zhang, D., Shi, Y., Song, X. (eds) Human Brain and Artificial Intelligence. HBAI 2019. Communications in Computer and Information Science, vol 1072. Springer, Singapore. https://doi.org/10.1007/978-981-15-1398-5_14


  • DOI: https://doi.org/10.1007/978-981-15-1398-5_14

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1397-8

  • Online ISBN: 978-981-15-1398-5

  • eBook Packages: Computer Science, Computer Science (R0)
