A hierarchical refinement algorithm for fully automatic gridding in spotted DNA microarray image processing
Introduction
DNA microarray technology has enabled massive gene expression profiling and genotyping [9], [10], [27]. Using DNA microarrays, scientists can determine how active thousands of genes are at any given time. These microarrays are created by robots, which print small amounts of fragmented gene sequences on a microscope slide (e.g., a glass substrate with dimensions of 25 × 76 × 1 mm³) to form regularly arrayed spots. To determine which genes are expressed in a given cell, normal or diseased, a researcher collects the messenger RNA (mRNA) molecules in that cell, fragments and amplifies them, and labels the fragments with a fluorescent dye. The researcher then applies the labeled mRNA sequences to the spots on the slide. The mRNA hybridizes to its complementary DNA (cDNA) and stays on the microarray after washing, leaving its fluorescent tag. The researcher then uses a special scanner to capture the fluorescent areas on the microarray and obtain single-, double-, or multi-channel digital images. An example of a 16-bit double-channel DNA microarray image is shown in Fig. 1. This two-channel image was generated with a scanning microscope operating at wavelengths of 532 nm (green) and 635 nm (red). These images must be processed to yield the intensity of each spot [2], [3], [15], [21], which correlates with the underlying expression level of each gene [4], [13].
Feature extraction from a microarray image consists of gridding, segmentation and quantification [24], [25]. Gridding, the process of determining the coordinates of the center of each spot on the slide, is the first step. Segmentation classifies each pixel into either foreground or background after noise removal [10]. Quantification computes the “true signal” intensity values for all spots, typically by computing background and foreground values, and subtracting the former from the latter.
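The quantification step described above can be illustrated with a minimal sketch. It assumes that gridding and segmentation have already produced the foreground and background pixel coordinates for one spot, and it uses the median as the summary statistic, which is an assumption for illustration only (the text says only that background and foreground values are computed). The function name `quantify_spot` and the toy data are hypothetical.

```python
from statistics import median

def quantify_spot(image, foreground, background):
    """Return a background-corrected intensity for one spot.

    image: 2-D list of pixel intensities.
    foreground, background: lists of (row, col) pixel coordinates,
    assumed to come from an earlier segmentation step.
    """
    fg = median(image[r][c] for r, c in foreground)
    bg = median(image[r][c] for r, c in background)
    # Clamp at zero so a spot dimmer than its surroundings
    # does not yield a negative signal (an illustrative choice).
    return max(fg - bg, 0)

# Toy 4x4 patch: a bright 2x2 spot on a dim background.
patch = [
    [100, 110, 105, 100],
    [100, 900, 950, 110],
    [105, 920, 930, 100],
    [100, 100, 110, 105],
]
spot_pixels = [(1, 1), (1, 2), (2, 1), (2, 2)]
ring_pixels = [(r, c) for r in range(4) for c in range(4)
               if (r, c) not in spot_pixels]

signal = quantify_spot(patch, spot_pixels, ring_pixels)
print(signal)  # foreground median 925.0 minus background median 102.5
```

The median is less sensitive than the mean to a few contaminated pixels, which is why it is often chosen for this kind of summary; other estimators fit the same interface.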
Ideally, a microarray would consist of high-density spots arranged in a perfectly regular grid [20], which would make analysis of the image straightforward. However, in practice many factors complicate the analysis, e.g., inaccuracies in spot printing, hybridization inconsistency, very dense arrays designed to achieve maximum throughput, and environmental effects such as contamination [1], [25]. Thus, one needs to deal with the following issues to make gridding robust and reproducible [6], [8], [23], [26]:
- (a) The exact location of grids varies from slide to slide.
- (b) The relative position of subarrays may be different.
- (c) The size and the shape of spots vary, and may be highly irregular (such as donut shaped).
- (d) Some subarrays may be rotated.
- (e) The substrate may be contaminated by dirt or chemicals.
- (f) The signal intensities may be weak.
Researchers have been developing suitable gridding methods for over a decade. For example, Jain et al. [7] developed an algorithm based on horizontal and vertical projections of spot intensities, in which they made strong assumptions about the distribution of spot intensities, but allowed projective distortion of the subarrays. This approach does not perform well when the grids are rotated. Bajcsy [2] developed an automatic gridding method, Gridline, based on detecting both the intensity discontinuities at each spot location and the homogeneous intensities of background regions. This method accurately aligns spots with different radii or blurred edges, missing spots and rotated subarrays, but does not adjust for misaligned spots in a single line or high background noise.
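The general idea behind projection-based gridding can be sketched as follows. This is a generic illustration of horizontal and vertical intensity projections, not the algorithm of Jain et al. [7]; the helper names, the peak rule (a local maximum above the profile mean) and the synthetic image are all assumptions made for the example.

```python
def projection_profiles(image):
    """Sum intensities along each row and each column of a 2-D list."""
    row_profile = [sum(row) for row in image]
    col_profile = [sum(col) for col in zip(*image)]
    return row_profile, col_profile

def profile_peaks(profile):
    """Indices of local maxima that rise above the profile mean."""
    mean = sum(profile) / len(profile)
    peaks = []
    for i in range(1, len(profile) - 1):
        if profile[i] > mean and profile[i - 1] < profile[i] >= profile[i + 1]:
            peaks.append(i)
    return peaks

def make_grid_image(size, centers):
    """Synthetic image with a small cross-shaped spot at every centre pair."""
    image = [[0] * size for _ in range(size)]
    for r in centers:
        for c in centers:
            image[r][c] = 100
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                image[r + dr][c + dc] = 50
    return image

image = make_grid_image(12, [2, 6, 10])
row_profile, col_profile = projection_profiles(image)
row_lines = profile_peaks(row_profile)
col_lines = profile_peaks(col_profile)
print(row_lines, col_lines)
```

On an axis-aligned grid the profile peaks coincide with the spot rows and columns. When the grid is rotated, spots from neighbouring rows smear into the same profile bins and the peaks flatten, which is consistent with the weakness noted above for rotated grids.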
Many free and commercial software packages, such as ScanAlyze (LBNL, UCB, CA 1999), Spot (CSIRO, Australia 2000), UCSF Spot (Jain Lab, UCSF) and GenePix (Molecular Devices Corporation, CA 2004), handle some of these issues well. However, all of these packages require a certain level of human intervention to achieve the desired accuracy, imposing a considerable burden on users.
We have developed a fully automatic DNA microarray gridding method that yields accurate and robust results. This algorithm employs morphological reconstruction along with global and local rotation detection, non-parametric optimal thresholding and local fine-tuning. We demonstrate the performance of this algorithm on synthetic data and on real microarray images of different size and with different degrees of rotation. The results suggest that this hierarchical refinement algorithm can reliably and automatically detect and compensate for alignment and rotation problems without human intervention.
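To make "non-parametric optimal thresholding" concrete, the sketch below implements Otsu's method, a well-known threshold selection rule that maximises the between-class variance of the intensity histogram without assuming any distribution for the pixel values. Whether the authors use this exact criterion is not stated in the text above, so it should be read as one representative technique rather than as their method. The function name and test data are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximises between-class variance.

    Pixels <= t form the background class, pixels > t the foreground.
    pixels: iterable of integer intensities in range(levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # pixel count of the background class
    sum_bg = 0  # intensity sum of the background class
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # every pixel is already background
        mean_bg = sum_bg / w_bg
        mean_fg = (total_sum - sum_bg) / w_fg
        # Between-class variance, up to a constant factor 1 / total**2.
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

# Toy data: a dim background population and a bright spot population.
pixels = [10] * 50 + [12] * 50 + [200] * 20 + [210] * 20
t = otsu_threshold(pixels)
foreground = [p for p in pixels if p > t]
print(t, len(foreground))
```

For 16-bit images such as those described here, `levels=65536` would be passed instead of the 8-bit default.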
The new algorithm for automatic gridding
This algorithm contains the following steps: image preprocessing, detection and correction of global rotation, global reference layout generation, detection and correction of local (individual subarray) rotation, and local refinement. A flowchart for this procedure is shown in Fig. 2.
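The snippet above does not say how the rotation is detected, so the following is only one simple possibility, shown to make the idea of "detection and correction of global rotation" tangible. It assumes rough spot centres are already available (for example from a thresholding step), tries a set of candidate angles, undoes each one, and keeps the angle for which the x coordinates collapse into the fewest, fullest columns. All names, the 1-pixel histogram and the candidate range are assumptions.

```python
import math

def rotate(points, degrees):
    """Rotate (x, y) points about the origin by the given angle."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [(x * c - y * s, x * s + y * c) for x, y in points]

def column_sharpness(points):
    """Sum of squared counts in a 1-pixel histogram of x coordinates.

    The score is largest when the points line up in a few vertical columns.
    """
    counts = {}
    for x, _ in points:
        b = round(x)
        counts[b] = counts.get(b, 0) + 1
    return sum(n * n for n in counts.values())

def estimate_rotation(points, candidates):
    """Return the candidate angle (degrees) by which the grid appears rotated."""
    # Undo each candidate rotation and keep the one that best aligns columns.
    return max(candidates,
               key=lambda deg: column_sharpness(rotate(points, -deg)))

# Synthetic 8x8 grid with a 10-pixel pitch, tilted by 2 degrees.
grid = [(10 * i, 10 * j) for i in range(8) for j in range(8)]
tilted = rotate(grid, 2.0)
candidates = [i * 0.5 for i in range(-10, 11)]  # -5.0 to 5.0 in 0.5 steps

estimated = estimate_rotation(tilted, candidates)
corrected = rotate(tilted, -estimated)
print(estimated)
```

The same search can be applied to the whole slide for the global angle and then to each subarray separately for the local angle, which mirrors the global-then-local ordering of the steps listed above.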
Experimental results and comparisons
We have implemented our algorithm using the Matlab R14 application development package (The MathWorks, MA 2005). We have tested our program on real and synthetic microarray images exhibiting different global or local rotation. The real microarray images are in two-channel, 16-bit TIFF format, generated by a GenePix 4000 scanner, scanning spotted oligonucleotide arrays at the Center for Applied Genomics located in Newark, New Jersey, as shown in Figs. 11(a), (c), (e) and 12(a). The images in Fig. 12
Conclusions
In this paper, we have presented a novel hierarchical refinement algorithm for gridding by applying morphological reconstruction based image processing methods along with global rotation detection, non-parametric optimal threshold selection, subarray rotation detection and local fine-tuning. This method is suitable for performing accurate and fully automatic gridding in microarray image analysis even for images with alignment and rotation problems.
Because this algorithm uses non-parametric
Acknowledgements
The authors would like to thank Dr. Patricia Soteropoulos and Dr. Hui-Yun Wang for providing real microarray images for the current work, and Mr. Anthony Galante and Dr. Barry Cohen for valuable discussions and comments.
References (27)
- et al., A dynamic model with adaptive pixel moving for microarray images segmentation, Real-Time Imaging: Special Issue on Imaging in Bioinformatics (2004)
- et al., Noise filtering and nonparametric analysis of microarray data underscores discriminating markers of oral, prostate, lung, ovarian and breast cancer, BMC Bioinform. (2004)
- Gridline: automatic grid alignment in DNA microarray scans, IEEE Trans. Image Process. (2004)
- et al., Bioinformatics: The Machine Learning Approach (2001)
- et al., Exploring the new world of the genome with DNA microarrays, Nat. Genetics (1999)
- A computational approach to edge detection, IEEE Trans. Pattern Anal. Mach. Intelligence (1986)
- et al., Fully automatic quantification of microarray image data, Genome Res. (2002)
- et al., Methods for automatic microarray image segmentation, IEEE Trans. Nanobiosci. (2003)
- Laboratory automation in microarray image processing, Am. Lab. (2000)
- et al., A multichannel order-statistic technique for cDNA microarray image processing, IEEE Trans. Nanobiosci. (2004)
- Vector filtering for color imaging, IEEE Signal Processing Magazine: Special Issue on Color Image Processing
- Theory of edge detection, Proc. Roy. Soc. London
- Intensity-based segmentation of microarrays images, IEEE Trans. Med. Imaging