Abstract
Missing value imputation for microarray data is important for gene expression analysis algorithms such as clustering, classification, and network design. A number of algorithms have been proposed to solve this problem, but most of them are limited to linear analysis methods, such as estimating a missing value as a linear combination of other genes that have no missing values. This may be because microarray data often comprise a very large number of genes with only a small number of observations, so nonlinear regression techniques are prone to overfitting. In this paper, a quasi-linear support vector regression (SVR) model is proposed to improve on the linear approaches, and it can be interpreted as a piecewise linear interpolation. Two real datasets are tested, and the experimental results show that the quasi-linear approach to missing value imputation outperforms both the linear and the nonlinear approaches.
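As an illustration of the linear baseline the abstract refers to (estimating a missing entry as a linear combination of genes without missing values), the following is a minimal sketch and not the paper's quasi-linear SVR model. The function name `impute_linear`, the parameter `k`, and the choice of Euclidean nearest neighbours with ordinary least squares are assumptions made here for illustration only; rows are assumed to be genes and columns observations.

```python
import numpy as np

def impute_linear(data, k=3):
    """Fill NaNs in each gene (row) with a least-squares combination of
    up to k complete genes. Hypothetical helper, for illustration only."""
    data = np.asarray(data, dtype=float)
    filled = data.copy()
    # Genes (rows) that have no missing values at all.
    complete = data[~np.isnan(data).any(axis=1)]
    if len(complete) == 0:
        raise ValueError("at least one gene without missing values is required")
    for i, row in enumerate(data):
        miss = np.isnan(row)
        if not miss.any() or miss.all():
            continue  # nothing to impute, or nothing observed to fit on
        # Pick the complete genes closest to the target on its observed columns.
        dist = np.linalg.norm(complete[:, ~miss] - row[~miss], axis=1)
        nearest = complete[np.argsort(dist)[:k]]
        # Fit weights w so that nearest[:, observed].T @ w approximates row[observed].
        w, *_ = np.linalg.lstsq(nearest[:, ~miss].T, row[~miss], rcond=None)
        # Apply the same weights to the columns where the target is missing.
        filled[i, miss] = nearest[:, miss].T @ w
    return filled
```

With few observations per gene, a small `k` keeps the least-squares fit from becoming underdetermined, which is the same overfitting concern the abstract raises for nonlinear models.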
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cheng, Y., Wang, L., Hu, J. (2011). A Quasi-linear Approach for Microarray Missing Value Imputation. In: Lu, BL., Zhang, L., Kwok, J. (eds) Neural Information Processing. ICONIP 2011. Lecture Notes in Computer Science, vol 7062. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24955-6_28
DOI: https://doi.org/10.1007/978-3-642-24955-6_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24954-9
Online ISBN: 978-3-642-24955-6
eBook Packages: Computer Science (R0)