A Novel Adaptive Multiple Imputation Algorithm

  • Conference paper
Bioinformatics Research and Development (BIRD 2008)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 13)

Abstract

Accurate estimation of missing values is important for the efficient use of DNA microarray data, since most analysis and clustering algorithms require a complete data matrix. Several imputation algorithms have already been proposed in the biological literature. Most of these approaches identify, in one way or another, a fixed number of neighbouring genes for the estimation of each missing value. This increases the chance that the estimation involves gene expression profiles that are rather distant from the profile of the target gene, which may significantly affect the performance of the imputation algorithm. In this article we propose a novel adaptive multiple imputation algorithm that uses a varying number of neighbouring genes for the estimation of each missing value. For each missing value, the algorithm generates a list of multiple candidate estimates and then selects the most suitable one, according to well-defined criteria, to replace the missing entry. The similarity between expression profiles can be estimated with either the Euclidean metric or the Dynamic Time Warping (DTW) distance measure. In this way, the proposed algorithm can be applied to the imputation of missing values in both non-time series and time series data.
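The general idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not state how the number of neighbours varies or which criteria select the final estimate, so the `ratio` cutoff, the "mean of the k nearest neighbours" candidates and the median selection rule below are hypothetical placeholders, as are all function names. Only the Euclidean metric and the standard dynamic-programming form of DTW are taken as given.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b):
    """Standard dynamic-programming DTW distance (absolute-difference cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def impute_adaptive(matrix, row, col, distance=euclidean, ratio=1.5):
    """Estimate the missing entry matrix[row][col] (stored as None).

    Returns (chosen_estimate, candidates), or (None, []) if no gene
    can serve as a neighbour.
    """
    target = matrix[row]
    observed = [j for j, v in enumerate(target) if v is not None]
    scored = []
    for i, gene in enumerate(matrix):
        # A neighbour must have a value at `col` and at every column
        # where the target gene is observed.
        if i == row or gene[col] is None:
            continue
        if any(gene[j] is None for j in observed):
            continue
        dist = distance([target[j] for j in observed],
                        [gene[j] for j in observed])
        scored.append((dist, gene[col]))
    if not scored:
        return None, []
    scored.sort()
    # ASSUMPTION: the neighbourhood size varies because only genes within
    # `ratio` times the smallest distance are kept (not from the paper).
    cutoff = scored[0][0] * ratio
    near = [value for dist, value in scored if dist <= cutoff]
    # ASSUMPTION: one candidate per neighbourhood size k = 1..len(near),
    # each the mean of the k nearest values.
    candidates = [sum(near[:k]) / k for k in range(1, len(near) + 1)]
    # ASSUMPTION: pick the median candidate as a stand-in for the
    # paper's selection criteria.
    chosen = sorted(candidates)[len(candidates) // 2]
    return chosen, candidates

genes = [
    [1.0, 2.0, None],     # target gene with a missing value
    [1.0, 3.0, 4.0],
    [1.0, 3.25, 6.0],
    [10.0, 10.0, 20.0],   # distant profile, excluded by the cutoff
]
estimate, candidates = impute_adaptive(genes, row=0, col=2)
```

Passing `distance=dtw` instead of the default switches the similarity measure to DTW, which corresponds to the time series case mentioned in the abstract. The distant fourth profile is excluded in this example, which is the effect a fixed neighbour count would not guarantee.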

Editor information

Mourad Elloumi, Josef Küng, Michal Linial, Robert F. Murphy, Kristan Schneider, Cristian Toma

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Boeva, V., Tsiporkova, E. (2008). A Novel Adaptive Multiple Imputation Algorithm. In: Elloumi, M., Küng, J., Linial, M., Murphy, R.F., Schneider, K., Toma, C. (eds) Bioinformatics Research and Development. BIRD 2008. Communications in Computer and Information Science, vol 13. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70600-7_15

  • DOI: https://doi.org/10.1007/978-3-540-70600-7_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70598-7

  • Online ISBN: 978-3-540-70600-7

  • eBook Packages: Computer Science, Computer Science (R0)
