Abstract.
The design and implementation of a computer vision system called DNAScan for the automated analysis of DNA hybridization images is presented. The hybridization of a DNA clone with a radioactively tagged probe manifests itself as a spot on the hybridization membrane. The imaging of the hybridization membranes and the automated analysis of the resulting images are imperative for high-throughput genomics experiments. A recursive segmentation procedure is designed and implemented to extract spotlike features in the hybridization images in the presence of a highly inhomogeneous background. Positive hybridization signals (hits) are extracted from the spotlike features using grouping and decomposition algorithms based on computational geometry. A mathematical model for the positive hybridization patterns and a Bayesian pattern classifier based on shape-based moments are proposed and implemented to distinguish between the clone-probe hybridization signals. Experimental results on real hybridization membrane images are presented.
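As a rough illustration of what recursive segmentation against an inhomogeneous background can look like, the following minimal sketch splits the image into quadrants and thresholds each small block against its own local statistics. It is not the DNAScan procedure, whose details the abstract does not give. The function names (`segment_spots`, `_segment`) and the parameters (`min_size`, `k`, `flat_std`) are hypothetical choices made for this example only.

```python
from statistics import mean, pstdev


def _segment(img, r0, c0, r1, c1, mask, min_size, k, flat_std):
    """Recursively mark spot-like pixels in the block img[r0:r1][c0:c1]."""
    block = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    mu, sigma = mean(block), pstdev(block)
    if sigma <= flat_std:
        return  # nearly uniform block: treat it as background only
    if (r1 - r0) <= min_size or (c1 - c0) <= min_size:
        # Small block: threshold against the block's own mean and spread,
        # so a slowly varying background does not shift the decision.
        for r in range(r0, r1):
            for c in range(c0, c1):
                if img[r][c] > mu + k * sigma:
                    mask[r][c] = True
        return
    # Large, non-uniform block: split into four quadrants and recurse.
    rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
    for a, b, c_, d in ((r0, c0, rm, cm), (r0, cm, rm, c1),
                        (rm, c0, r1, cm), (rm, cm, r1, c1)):
        _segment(img, a, b, c_, d, mask, min_size, k, flat_std)


def segment_spots(img, min_size=4, k=1.0, flat_std=5.0):
    """Return a boolean mask of candidate spot pixels (assumed parameters)."""
    rows, cols = len(img), len(img[0])
    mask = [[False] * cols for _ in range(rows)]
    _segment(img, 0, 0, rows, cols, mask, min_size, k, flat_std)
    return mask


if __name__ == "__main__":
    # Synthetic 8x8 image: gentle left-to-right background ramp plus two spots.
    img = [[10 + 2 * c for c in range(8)] for _ in range(8)]
    for r, c in [(1, 1), (1, 2), (2, 1), (2, 2)]:
        img[r][c] = 200
    img[5][6] = 120
    mask = segment_spots(img)
    print([(r, c) for r in range(8) for c in range(8) if mask[r][c]])
```

Because each block is compared with its own mean and standard deviation, the dimmer spot on the brighter side of the ramp is still picked up. A real system would also need the grouping, decomposition, and classification stages described in the abstract, which this sketch omits.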
Received: 25 June 2002, Accepted: 11 November 2003, Published online: 17 February 2004
Correspondence to: Suchendra M. Bhandarkar
Bhandarkar, S.M., Jiang, T., Verma, K. et al. Automated analysis of DNA hybridization images for high-throughput genomics. Machine Vision and Applications 15, 121–138 (2004). https://doi.org/10.1007/s00138-003-0134-1
DOI: https://doi.org/10.1007/s00138-003-0134-1