Analyses on Influence of Training Data Set to Neural Network Supervised Learning Performance

  • Conference paper
Advances in Computer Science, Intelligent System and Environment

Part of the book series: Advances in Intelligent and Soft Computing (AINSC, volume 106)

Abstract

The influence of the training data set on the supervised learning performance of artificial neural networks (ANNs) is studied in detail in this paper. First, illustrative experiments are conducted which verify that different training data sets can lead to different supervised learning performance of an ANN; second, the necessity of preprocessing the training data set is analyzed, and the ways in which the training data set affects supervised learning are summarized; finally, existing methods for improving ANN performance through high-quality training data are discussed.
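
As a rough illustration of the abstract's first two points (this sketch is not taken from the paper), the following Python snippet trains the same small multilayer perceptron twice on a synthetic task: once on raw features with very different scales and once on the same features after standardization. The data set, network settings, and the helper name fit_and_score are all illustrative assumptions; the only claim is that how the training data are prepared can change the resulting accuracy.

```python
# Minimal sketch (not from the paper): the same network architecture can
# reach noticeably different accuracy depending on how the training data
# are preprocessed. Assumes NumPy and scikit-learn are available.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic data with features on wildly different scales, standing in
# for an unpreprocessed training set.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X = X * np.logspace(0, 4, X.shape[1])  # exaggerate the scale differences

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

def fit_and_score(X_tr, X_te):
    """Train one small MLP and report its accuracy on the test split."""
    net = MLPClassifier(hidden_layer_sizes=(20,), max_iter=300, random_state=0)
    net.fit(X_tr, y_train)
    return net.score(X_te, y_test)

# 1) Raw training data.
acc_raw = fit_and_score(X_train, X_test)

# 2) Standardized training data (zero mean, unit variance), with the
#    scaler fitted on the training split only.
scaler = StandardScaler().fit(X_train)
acc_scaled = fit_and_score(scaler.transform(X_train), scaler.transform(X_test))

print(f"accuracy on raw data:          {acc_raw:.3f}")
print(f"accuracy on standardized data: {acc_scaled:.3f}")
```

On most runs the standardized variant converges more easily and scores higher; the exact numbers depend on the random seed and network settings, but the comparison makes concrete the kind of training-data effect the paper analyzes.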

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhou, Y., Wu, Y. (2011). Analyses on Influence of Training Data Set to Neural Network Supervised Learning Performance. In: Jin, D., Lin, S. (eds) Advances in Computer Science, Intelligent System and Environment. Advances in Intelligent and Soft Computing, vol 106. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23753-9_4

  • DOI: https://doi.org/10.1007/978-3-642-23753-9_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23752-2

  • Online ISBN: 978-3-642-23753-9

  • eBook Packages: Engineering, Engineering (R0)
