Abstract
The Multi-Layer Perceptron is one of the most widely used Artificial Neural Network models; it can approximate any function provided it has enough neurons and layers. Obtaining a properly trained network, however, usually demands considerable effort in determining the parameters that govern learning. Most current training algorithms simply minimize the sum of squared errors, but even reaching the global minimum of the error does not guarantee that the model's response is optimal. A network with a large number of small-amplitude weights behaves like an underfitted model that gradually overfits the data during training: overfitted solutions have unnecessarily high complexity, whereas solutions with a low weight norm have low complexity and tend to underfit. The Multi-Objective (MOBJ) Algorithm controls the weight amplitudes by optimizing two objective functions: the error function and the norm function. The LASSO approach combines the high generalization capability of the MOBJ Algorithm with automatic weight selection, generating networks with fewer weights than the MOBJ solutions. Four data sets were chosen to compare and evaluate the MOBJ, LASSO, and Early-Stopping solutions: one generated from a known function and three taken from a Machine Learning Repository. Additionally, the MOBJ and LASSO algorithms are applied to a microarray data set whose samples are gene-expression profiles, obtained with DNA microarray technology, from breast-cancer patients undergoing neoadjuvant chemotherapy (treatment given prior to surgery). The original data set consists of 133 samples with 22,283 attributes; by applying a probe selection method described in the literature, 30 attributes were selected and used to train the Artificial Neural Networks. On average, the MOBJ and LASSO solutions were equivalent, the main difference being the simpler topology achieved by the LASSO training method.
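For context, the bi-objective trade-off described above can be sketched in the standard form used in the multi-objective learning literature; the notation (J_1, J_2, w, t) is illustrative and not necessarily the chapter's own:

\[
\min_{\mathbf{w}} \; \bigl( J_1(\mathbf{w}),\ J_2(\mathbf{w}) \bigr),
\qquad
J_1(\mathbf{w}) = \sum_{i=1}^{N} \bigl( y_i - f(\mathbf{x}_i;\mathbf{w}) \bigr)^2,
\qquad
J_2(\mathbf{w}) = \lVert \mathbf{w} \rVert_2 ,
\]

where \(f(\mathbf{x};\mathbf{w})\) is the network output and \(\mathbf{w}\) collects all weights. The LASSO variant instead constrains (or penalizes) the L1 norm, \(\lVert \mathbf{w} \rVert_1 = \sum_j \lvert w_j \rvert \le t\), whose geometry drives individual weights exactly to zero, pruning connections and simplifying the topology.

The following is a minimal Python sketch of the same idea, using a fixed scalarization (squared error plus an L1 penalty) rather than the authors' Pareto-based MOBJ procedure; the network size, learning rate, and the value of lam are illustrative assumptions, not values from the chapter:

```python
# Minimal sketch (not the authors' implementation): a one-hidden-layer MLP
# trained on squared error plus an L1 weight penalty, mimicking the
# error-versus-norm trade-off that MOBJ/LASSO training explores.
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(X, W1, b1, W2, b2):
    H = np.tanh(X @ W1 + b1)      # hidden-layer activations
    return H @ W2 + b2, H         # linear output for regression

def train(X, y, hidden=10, lam=1e-3, lr=1e-2, epochs=2000):
    # X: (N, n_in) inputs, y: (N, 1) targets.
    n_in = X.shape[1]
    W1 = rng.normal(scale=0.1, size=(n_in, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.1, size=(hidden, 1));    b2 = np.zeros(1)
    for _ in range(epochs):
        out, H = mlp_forward(X, W1, b1, W2, b2)
        err = out - y                          # residuals, (N, 1)
        # Backpropagated gradients of the squared-error term.
        gW2 = H.T @ err; gb2 = err.sum(0)
        dH = (err @ W2.T) * (1 - H**2)         # tanh derivative
        gW1 = X.T @ dH; gb1 = dH.sum(0)
        # L1 subgradient pushes small weights toward exactly zero
        # (the LASSO effect that prunes connections).
        W1 -= lr * (gW1 + lam * np.sign(W1)); b1 -= lr * gb1
        W2 -= lr * (gW2 + lam * np.sign(W2)); b2 -= lr * gb2
    return W1, b1, W2, b2
```

Sweeping lam over a range traces an approximation of the error-versus-norm Pareto front; the MOBJ algorithm itself generates and selects among Pareto-optimal solutions directly rather than through a single scalarized penalty.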