The Regression of MNIST Dataset Based on Convolutional Neural Network

  • Conference paper
  • The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 921)

Abstract

The MNIST dataset of handwritten digits has been widely used to validate the effectiveness and efficiency of machine learning methods. Although the dataset has mainly been used for classification, where very high accuracy (99.3%+) has been reported, it cannot be used directly for regression, which substantially limits its usefulness and hinders the development of regression methods for this type of data. In this paper, to make MNIST usable for regression, we first apply a normal distribution to each class label, thereby converting the original discrete class numbers into floating-point values. A modified convolutional neural network (CNN) is then applied to generate a regression model. Multiple experiments were conducted to select optimal parameters and layer settings for this application. The experimental results suggest that the best mean absolute error (MAE) is obtained when the ReLU function is adopted for the first layer and the remaining layers are activated by the softplus function. In the proposed approach, two indicators, MAE and Log-Cosh loss, are used to optimize the parameters and score the predictions. Experiments with 10-fold cross-validation demonstrate that low MAE and Log-Cosh errors of 0.202 and 0.079, respectively, can be achieved. Furthermore, multiple values of the standard deviation of the normal distribution were applied to verify the applicability of the approach when the labels follow different distributions. The experimental results suggest a positive correlation between the adopted standard deviation and the loss value; that is, the more concentrated the label data, the lower the MAE.
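The label conversion and the two scoring indicators named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not spell out how the normal distribution is applied, so the sampling scheme here (one draw from a normal distribution centred on the integer class, with standard deviation `sigma`) is an assumption, and the function names `to_float_labels`, `mae`, `log_cosh`, and `softplus` are hypothetical helpers, not the authors' code.

```python
import numpy as np


def to_float_labels(class_labels, sigma=0.1, seed=0):
    """Turn integer class labels into float regression targets.

    Assumed scheme (not confirmed by the abstract): each target is one
    draw from a normal distribution N(class, sigma^2).
    """
    rng = np.random.default_rng(seed)
    centres = np.asarray(class_labels, dtype=np.float64)
    return rng.normal(loc=centres, scale=sigma)


def mae(y_true, y_pred):
    """Mean absolute error between targets and predictions."""
    diff = np.asarray(y_pred, dtype=np.float64) - np.asarray(y_true, dtype=np.float64)
    return float(np.mean(np.abs(diff)))


def log_cosh(y_true, y_pred):
    """Mean of log(cosh(error)), written in an overflow-safe form.

    Uses the identity log(cosh(x)) = |x| + log(1 + exp(-2|x|)) - log(2).
    """
    diff = np.asarray(y_pred, dtype=np.float64) - np.asarray(y_true, dtype=np.float64)
    a = np.abs(diff)
    return float(np.mean(a + np.log1p(np.exp(-2.0 * a)) - np.log(2.0)))


def softplus(x):
    """Softplus activation log(1 + exp(x)), computed stably."""
    return np.logaddexp(0.0, x)
```

Under this assumed scheme, the added noise is independent of the image, so even a predictor that recovers the integer class exactly is left with an expected absolute error of `sigma * sqrt(2 / pi)`. That is one plausible reading of why a larger standard deviation goes together with a larger MAE; whether it matches the authors' explanation cannot be confirmed from the abstract alone.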

Author information

Corresponding author

Correspondence to Kai Xiao.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, Z., Wu, S., Liu, C., Wu, S., Xiao, K. (2020). The Regression of MNIST Dataset Based on Convolutional Neural Network. In: Hassanien, A., Azar, A., Gaber, T., Bhatnagar, R., F. Tolba, M. (eds) The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2019). AMLTA 2019. Advances in Intelligent Systems and Computing, vol 921. Springer, Cham. https://doi.org/10.1007/978-3-030-14118-9_7
