Abstract
In this chapter, a comprehensive methodology is presented to address important data-driven challenges within the context of classification. First, it is demonstrated that challenges such as heterogeneity and noise, commonly observed in big/large data sets, degrade the performance of deep neural network (DNN)-based classifiers. To obviate these issues, a two-step classification framework is introduced in which unwanted attributes (variables) are systematically removed in a preprocessing step and heterogeneity is addressed directly in the learning process of a DNN-based classifier. Specifically, a multi-stage nonlinear dimensionality reduction (NDR) approach is described to remove unwanted variables, and a novel optimization framework is presented to address heterogeneity. In NDR, the dimensions are first divided into groups (grouping stage) and redundancies within each group are then systematically removed (transformation stage). This two-stage procedure is repeated until a user-defined criterion controlling information loss is satisfied. The reduced-dimensional data is then used for classification in a DNN-based framework where a direct error-driven learning regime is introduced. Within this framework, an approximation of the generalization error is obtained by generating additional samples from the data. An overall error, comprising the learning error and the approximated generalization error, is used to derive a performance measure for each layer of the DNN. A novel layer-wise weight-tuning law is obtained from the gradient of this layer-wise performance measure, so that the overall error drives learning directly. The efficiency of this two-step classification approach is demonstrated using various data sets.
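To make the grouping/transformation loop concrete, here is a minimal sketch in Python. It is not the chapter's algorithm: the chapter's transformation stage is nonlinear, whereas this sketch substitutes per-group PCA as the transformation and uses a retained-variance threshold as a stand-in for the user-defined information-loss criterion. The function name `ndr_reduce` and its parameters are illustrative assumptions, not the chapter's API.

```python
import numpy as np

def ndr_reduce(X, n_groups=4, var_keep=0.95, max_iters=10):
    """Two-stage reduction loop: group the dimensions, compress each
    group, and repeat until no further reduction is achieved."""
    Z = np.asarray(X, dtype=float)
    for _ in range(max_iters):
        d = Z.shape[1]
        # Grouping stage: partition the d dimensions into groups.
        groups = np.array_split(np.arange(d), min(n_groups, d))
        blocks = []
        for g in groups:
            B = Z[:, g] - Z[:, g].mean(axis=0)
            # Transformation stage (PCA stand-in for the chapter's
            # nonlinear transformation): keep just enough principal
            # directions to explain `var_keep` of the group's variance.
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            ratio = np.cumsum(s**2) / (np.sum(s**2) + 1e-12)
            k = int(np.searchsorted(ratio, var_keep)) + 1
            blocks.append(B @ Vt[:k].T)
        Z_new = np.hstack(blocks)
        if Z_new.shape[1] >= Z.shape[1]:  # no further reduction possible
            break
        Z = Z_new
    return Z
```

The reduced matrix returned by `ndr_reduce` would then be fed to the classifier in place of the raw high-dimensional data.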
This research was supported in part by NSF I/UCRC award IIP 1134721 and by the Intelligent Systems Center.
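The abstract does not spell out the layer-wise tuning law, so the following sketch should be read as a hedged illustration only. It combines the two ingredients described above: an overall error formed from the batch error plus the error on noise-perturbed copies of the batch (a crude proxy for the generalization-error approximation), and layer updates driven directly by that overall error through a fixed random projection, in the spirit of direct feedback alignment rather than standard backpropagation. The names `train_step`, `B1`, and `sigma` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_step(X, Y, params, feedback, lr=1e-2, sigma=0.05):
    """One update of a two-layer network in which every layer is driven
    directly by the overall error, not by backpropagated gradients."""
    W1, W2 = params
    B1, = feedback  # fixed random matrix mapping output error to layer 1

    # Overall error: augment the batch with perturbed copies so the
    # error also reflects an approximation of the generalization error.
    X_aug = np.vstack([X, X + sigma * rng.standard_normal(X.shape)])
    Y_aug = np.vstack([Y, Y])

    H = np.tanh(X_aug @ W1)        # hidden layer
    E = H @ W2 - Y_aug             # overall output error

    # Layer-wise updates: the output layer uses E directly, while the
    # hidden layer's measure is driven by the projected error E @ B1.
    W2 -= lr * H.T @ E / len(X_aug)
    W1 -= lr * X_aug.T @ ((E @ B1) * (1 - H**2)) / len(X_aug)
    return 0.5 * np.mean(E**2)

# Illustrative usage: 100 samples, 20 features, 5 classes, 32 hidden units.
X = rng.standard_normal((100, 20))
Y = np.eye(5)[rng.integers(0, 5, size=100)]
params = [0.1 * rng.standard_normal((20, 32)), 0.1 * rng.standard_normal((32, 5))]
feedback = [0.1 * rng.standard_normal((5, 32))]
for _ in range(200):
    loss = train_step(X, Y, params, feedback)
```

Because each layer's update depends only on the overall error and a fixed projection, no layer has to wait for gradients propagated from the layers above it.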
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Krishnan, R., Jagannathan, S., Samaranayake, V.A. (2020). Direct Error Driven Learning for Classification in Applications Generating Big-Data. In: Pedrycz, W., Chen, SM. (eds) Development and Analysis of Deep Learning Architectures. Studies in Computational Intelligence, vol 867. Springer, Cham. https://doi.org/10.1007/978-3-030-31764-5_1
DOI: https://doi.org/10.1007/978-3-030-31764-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31763-8
Online ISBN: 978-3-030-31764-5
eBook Packages: Intelligent Technologies and Robotics (R0)