Automated analysis of DNA hybridization images for high-throughput genomics

Machine Vision and Applications

Abstract.

The design and implementation of a computer vision system called DNAScan for the automated analysis of DNA hybridization images is presented. The hybridization of a DNA clone with a radioactively tagged probe manifests itself as a spot on the hybridization membrane. The imaging of the hybridization membranes and the automated analysis of the resulting images are imperative for high-throughput genomics experiments. A recursive segmentation procedure is designed and implemented to extract spotlike features in the hybridization images in the presence of a highly inhomogeneous background. Positive hybridization signals (hits) are extracted from the spotlike features using grouping and decomposition algorithms based on computational geometry. A mathematical model for the positive hybridization patterns and a Bayesian pattern classifier based on shape-based moments are proposed and implemented to distinguish between the clone-probe hybridization signals. Experimental results on real hybridization membrane images are presented.
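As a rough illustration of what a Bayesian classifier over shape-based moments can look like, the sketch below computes the first two Hu moment invariants of a binary blob and picks the class with the highest posterior under a Gaussian model with independent features. It is not the DNAScan implementation: the choice of Hu invariants, the independence assumption, the variance floor, the equal priors, the class labels `spot` and `streak`, and the rectangular toy blobs are all assumptions made for the example.

```python
import math

def rect(h, w):
    """Hypothetical toy blob: all (row, col) pixels of an h-by-w rectangle."""
    return [(r, c) for r in range(h) for c in range(w)]

def shape_features(pixels):
    """First two Hu moment invariants of a set of foreground pixels."""
    n = len(pixels)  # zeroth-order moment (blob area)
    rbar = sum(r for r, _ in pixels) / n
    cbar = sum(c for _, c in pixels) / n
    mu20 = sum((r - rbar) ** 2 for r, _ in pixels)
    mu02 = sum((c - cbar) ** 2 for _, c in pixels)
    mu11 = sum((r - rbar) * (c - cbar) for r, c in pixels)
    # Second-order central moments are scale-normalized by area squared.
    eta20, eta02, eta11 = mu20 / n ** 2, mu02 / n ** 2, mu11 / n ** 2
    phi1 = eta20 + eta02
    phi2 = (eta20 - eta02) ** 2 + 4 * eta11 ** 2
    return (phi1, phi2)

def fit_gaussian(samples, var_floor=1e-4):
    """Per-feature mean and variance for one class (diagonal covariance).

    var_floor is an assumed safeguard against zero variance on tiny
    training sets.
    """
    k = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / k for d in range(dims)]
    variances = [
        max(sum((s[d] - means[d]) ** 2 for s in samples) / k, var_floor)
        for d in range(dims)
    ]
    return means, variances

def log_posterior(x, prior, model):
    """Unnormalized log posterior: log prior plus Gaussian log likelihood."""
    means, variances = model
    log_lik = sum(
        -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        for xi, m, v in zip(x, means, variances)
    )
    return math.log(prior) + log_lik

def classify(x, models, priors):
    """Bayes decision rule: return the label with the largest posterior."""
    return max(models, key=lambda label: log_posterior(x, priors[label], models[label]))

# Toy training data (assumed): compact blobs versus elongated blobs.
training = {
    "spot": [rect(3, 3), rect(4, 4), rect(4, 5), rect(5, 5)],
    "streak": [rect(1, 7), rect(1, 9), rect(2, 10), rect(2, 12)],
}
models = {
    label: fit_gaussian([shape_features(blob) for blob in blobs])
    for label, blobs in training.items()
}
priors = {"spot": 0.5, "streak": 0.5}

if __name__ == "__main__":
    print(classify(shape_features(rect(6, 6)), models, priors))
    print(classify(shape_features(rect(1, 8)), models, priors))
```

Working in log space avoids underflow when the likelihoods are small, and the moment invariants make the features independent of blob position and scale, which is one reason moment-based descriptors suit spot-like shapes.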

Author information

Correspondence to Suchendra M. Bhandarkar.

Additional information

Received: 25 June 2002, Accepted: 11 November 2003, Published online: 17 February 2004

About this article

Cite this article

Bhandarkar, S.M., Jiang, T., Verma, K. et al. Automated analysis of DNA hybridization images for high-throughput genomics. Machine Vision and Applications 15, 121–138 (2004). https://doi.org/10.1007/s00138-003-0134-1
