The Regression of MNIST Dataset Based on Convolutional Neural Network

Wang, Ziheng; Wu, Su; Liu, Chang; Wu, Shaozhi; Xiao, Kai

doi:10.1007/978-3-030-14118-9_7

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 921)

Included in the following conference series:

International Conference on Advanced Machine Learning Technologies and Applications

Abstract

The MNIST dataset of handwritten digits has been widely used to validate the effectiveness and efficiency of machine learning methods. Although the dataset has been used primarily for classification, where very high accuracy (99.3%+) has been obtained, it cannot be used directly for regression, which substantially limits its usefulness and hinders the development of regression methods for this type of data. In this paper, to make MNIST usable for regression, we first apply a normal distribution to its class labels, thereby converting the original discrete class numbers into floating-point values. A modified convolutional neural network (CNN) is then applied to build a regression model. Multiple experiments have been conducted to select the optimal parameters and layer settings for this application. The experimental results suggest that the best mean absolute error (MAE) is obtained when the ReLU function is used for the first layer and the other layers are activated by the softplus function. In the proposed approach, two indicators, MAE and Log-Cosh loss, are used to optimize the parameters and score the predictions. Experiments with 10-fold cross-validation demonstrate that low MAE and Log-Cosh errors of 0.202 and 0.079, respectively, can be achieved. Furthermore, multiple values of the standard deviation of the normal distribution have been applied to verify the applicability of the approach when the label values follow different distributions. The experimental results suggest that a positive correlation exists between the adopted standard deviation and the loss value; that is, the more concentrated the label data, the lower the MAE.
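The label conversion and the two error indicators named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not state how the normal distribution is parameterized, so the sketch assumes each float label is drawn from a normal distribution centered on the original class number; the function names, the default standard deviation of 0.3, and the random seed are hypothetical and are not taken from the paper.

```python
import numpy as np


def labels_to_floats(labels, std=0.3, seed=0):
    """Turn integer class labels into float regression targets.

    Assumption (not confirmed by the abstract): each target is drawn
    from a normal distribution whose mean is the original class number.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels, dtype=np.float64)
    return rng.normal(loc=labels, scale=std)


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between targets and predictions."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    return float(np.mean(np.abs(y_pred - y_true)))


def log_cosh_loss(y_true, y_pred):
    """Mean of log(cosh(prediction - target)).

    Uses the identity log(cosh(x)) = |x| + log(1 + exp(-2|x|)) - log(2),
    which avoids overflow in cosh for large errors.
    """
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    a = np.abs(y_pred - y_true)
    return float(np.mean(a + np.log1p(np.exp(-2.0 * a)) - np.log(2.0)))


if __name__ == "__main__":
    digits = np.array([0, 1, 2, 9])
    targets = labels_to_floats(digits, std=0.3)
    # Scoring the original digits against the perturbed targets shows
    # how the chosen standard deviation sets a floor on the error.
    print("MAE:", mean_absolute_error(targets, digits))
    print("Log-Cosh:", log_cosh_loss(targets, digits))
```

Under this assumption, a larger standard deviation spreads the targets further from the class numbers, which is consistent with the positive correlation between standard deviation and loss reported in the abstract.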

Author information

Authors and Affiliations

School of Aerospace Engineering and Applied Mechanics, Tongji University, No. 1239 Siping Road, Yangpu District, Shanghai, China
Ziheng Wang
School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, China
Su Wu & Chang Liu
School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
Shaozhi Wu
School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, China
Kai Xiao

Corresponding author

Correspondence to Kai Xiao.

Editor information

Editors and Affiliations

Faculty of Computers and Information, Cairo University, Giza, Egypt
Aboul Ella Hassanien
Faculty of Computers and Information, Benha University, Benha, Egypt
Ahmad Taher Azar
School of Computing, Science and Engineering, University of Salford, Salford, Greater Manchester, UK
Tarek Gaber
Department of Computer Science and Engineering, School of Computing and IT, Faculty of Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India
Roheet Bhatnagar
Faculty of Computer and Information Science, Ain Shams University, Cairo, Egypt
Mohamed F. Tolba

About this paper

Cite this paper

Wang, Z., Wu, S., Liu, C., Wu, S., Xiao, K. (2020). The Regression of MNIST Dataset Based on Convolutional Neural Network. In: Hassanien, A., Azar, A., Gaber, T., Bhatnagar, R., Tolba, M.F. (eds) The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2019). AMLTA 2019. Advances in Intelligent Systems and Computing, vol 921. Springer, Cham. https://doi.org/10.1007/978-3-030-14118-9_7

DOI: https://doi.org/10.1007/978-3-030-14118-9_7
Published: 17 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14117-2
Online ISBN: 978-3-030-14118-9
eBook Packages: Intelligent Technologies and Robotics (R0)
