Elsevier

Information Sciences

Volume 177, Issue 4, 15 February 2007, Pages 1123-1135
A hierarchical refinement algorithm for fully automatic gridding in spotted DNA microarray image processing

https://doi.org/10.1016/j.ins.2006.07.004

Abstract

Gridding, the first step in spotted DNA microarray image processing, usually requires human intervention to achieve acceptable accuracy. We present a new algorithm for automatic gridding based on hierarchical refinement to improve the efficiency, robustness and reproducibility of microarray data analysis. This algorithm employs morphological reconstruction along with global and local rotation detection, non-parametric optimal thresholding and local fine-tuning without any human intervention. Using synthetic data and real microarray images of different sizes and with different degrees of rotation of subarrays, we demonstrate that this algorithm can detect and compensate for alignment and rotation problems to obtain reliable and robust results.

Introduction

DNA microarray technology has enabled massive gene expression profiling and genotyping [9], [10], [27]. Using DNA microarrays, scientists can determine how active thousands of genes are at any given time. These microarrays are created by robots, which print small amounts of fragmented gene sequences on a microscope slide (e.g., a glass substrate with dimensions of 25 × 76 × 1 mm³) to form regularly arrayed spots. To determine which genes are expressed in a given cell, normal or diseased, a researcher collects the messenger RNA (mRNA) molecules in that cell, fragments and amplifies them, and labels the fragments with a fluorescent dye. The researcher then injects the labeled mRNA sequences into the spots on the slide. The mRNA will then hybridize to its complementary DNA (cDNA) and stay on the microarray after being washed, leaving its fluorescent tag. The researcher then uses a special scanner to capture the fluorescent areas on the microarray to obtain single-, double-, or multi-channel digital images. An example of a 16-bit double-channel DNA microarray image is shown in Fig. 1. This two-channel image was generated with a scanning microscope operating at wavelengths of 532 nm (green) and 635 nm (red). These images must be processed to yield the intensity of each spot [2], [3], [15], [21], which correlates with the underlying expression level of each gene [4], [13].

Feature extraction from a microarray image consists of gridding, segmentation and quantification [24], [25]. Gridding, the process of determining the coordinates of the center of each spot on the slide, is the first step. Segmentation classifies each pixel into either foreground or background after noise removal [10]. Quantification computes the “true signal” intensity values for all spots, typically by computing background and foreground values, and subtracting the former from the latter.
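The quantification step described above can be illustrated with a minimal sketch. It assumes that segmentation has already produced a boolean foreground mask for the window around one spot, and that the median is used to summarize each pixel class; the function name `quantify_spot` and the choice of the median are illustrative assumptions, not details taken from the paper.

```python
from statistics import median


def quantify_spot(pixels, foreground_mask):
    """Return the background-corrected intensity for one spot window.

    pixels: 2D list of raw intensities for the window around a spot.
    foreground_mask: 2D list of booleans (True = spot pixel), same shape.
    """
    foreground, background = [], []
    for pixel_row, mask_row in zip(pixels, foreground_mask):
        for value, is_spot in zip(pixel_row, mask_row):
            (foreground if is_spot else background).append(value)
    if not foreground or not background:
        raise ValueError("window needs both foreground and background pixels")
    # Assumption: the median summarizes each class; other statistics are possible.
    return median(foreground) - median(background)


# Toy 3 x 3 window with a single bright spot pixel in the center.
window = [[10, 12, 10], [11, 110, 10], [10, 10, 12]]
mask = [[False, False, False], [False, True, False], [False, False, False]]
signal = quantify_spot(window, mask)
```

In this toy window the foreground value is 110 and the background median is 10, so the corrected signal is 100.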

Ideally, a microarray would consist of high-density spots arranged in a perfectly regular grid [20], which would make analysis of the image straightforward. However, in practice many factors complicate the analysis, e.g., inaccuracies in spot printing, hybridization inconsistency, very dense arrays designed to achieve maximum throughput, and environmental effects such as contamination [1], [25]. Thus, one needs to deal with the following issues to make gridding robust and reproducible [6], [8], [23], [26]:

  • (a) The exact location of grids varies from slide to slide.

  • (b) The relative position of subarrays may be different.

  • (c) The size and the shape of spots vary, and may be highly irregular (such as donut shaped).

  • (d) Some subarrays may be rotated.

  • (e) The substrate may be contaminated by dirt or chemicals.

  • (f) The signal intensities may be weak.

Researchers have been developing suitable gridding methods for over a decade. For example, Jain et al. [7] developed an algorithm based on horizontal and vertical projections of spot intensities, in which they made strong assumptions about the distribution of spot intensities, but allowed projective distortion of the subarrays. This approach does not perform well when the grids are rotated. Bajcsy [2] developed an automatic gridding method, Gridline, based on detecting both the intensity discontinuities at each spot location and the homogeneous intensities of background regions. This method accurately aligns spots with different radii or blurred edges, missing spots and rotated subarrays, but does not adjust for misaligned spots in a single line or high background noise.
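To make the idea of projection-based gridding concrete, the following is a minimal sketch of horizontal and vertical intensity projections on a toy image. It is not the algorithm of Jain et al. [7]; the helper names `projection_profiles` and `profile_peaks`, and the simple "local maximum above the mean" peak rule, are assumptions chosen for illustration.

```python
def projection_profiles(image):
    """Sum intensities along each row and each column of a 2D list."""
    row_profile = [sum(row) for row in image]
    col_profile = [sum(col) for col in zip(*image)]
    return row_profile, col_profile


def profile_peaks(profile):
    """Indices of local maxima that rise above the profile mean."""
    mean = sum(profile) / len(profile)
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i] > mean
        and profile[i] > profile[i - 1]
        and profile[i] >= profile[i + 1]
    ]


# Toy 6 x 6 image with four spots on an axis-aligned 2 x 2 grid.
image = [[0] * 6 for _ in range(6)]
for r in (1, 4):
    for c in (1, 4):
        image[r][c] = 100

rows, cols = projection_profiles(image)
spot_rows = profile_peaks(rows)
spot_cols = profile_peaks(cols)
```

When the grid is axis-aligned, spots in the same row or column reinforce each other and produce sharp peaks. Once the grid is rotated, the same spots spread over several neighboring rows or columns and the peaks flatten, which is consistent with the weakness under rotation noted above.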

Many free and commercial software packages, such as ScanAlyze (LBNL, UCB, CA 1999), Spot (CSIRO, Australia 2000), UCSF Spot (Jain Lab, UCSF) and GenePix (Molecular Devices Corporation, CA 2004), handle some of these issues well. However, all of these packages require a certain level of human intervention to achieve the desired accuracy, imposing a considerable burden on users.

We have developed a fully automatic DNA microarray gridding method that yields accurate and robust results. This algorithm employs morphological reconstruction along with global and local rotation detection, non-parametric optimal thresholding and local fine-tuning. We demonstrate the performance of this algorithm on synthetic data and on real microarray images of different size and with different degrees of rotation. The results suggest that this hierarchical refinement algorithm can reliably and automatically detect and compensate for alignment and rotation problems without human intervention.
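As an illustration of what a non-parametric threshold selector can look like, the sketch below implements Otsu's method, which picks the threshold that maximizes the between-class variance of the intensity histogram without assuming any intensity distribution. This preview does not state which selector the algorithm uses, so treating Otsu's method as the example is an assumption, as are the function name `otsu_threshold` and the `levels` parameter.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t that maximizes between-class variance.

    values: iterable of integer intensities in the range [0, levels).
    Pixels with intensity <= t form one class, the rest the other.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    weight_low = 0      # number of pixels at or below the candidate threshold
    sum_low = 0         # intensity sum of those pixels
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue
        weight_high = total - weight_low
        if weight_high == 0:
            break
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = weight_low * weight_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Toy bimodal data: dark background at 10, bright spots at 200.
values = [10] * 50 + [200] * 50
threshold = otsu_threshold(values)
foreground_count = sum(v > threshold for v in values)
```

For 16-bit microarray images, `levels` would need to be 65536, or the intensities would need to be rescaled first.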

The new algorithm for automatic gridding

This algorithm contains the following steps: image preprocessing, detection and correction of global rotation, global reference layout generation, detection and correction of local (individual subarray) rotation, and local refinement. A flowchart for this procedure is shown in Fig. 2.
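The rotation-detection steps are not detailed in this preview. One common way to estimate a rotation angle, shown here only as a hedged sketch, is to try candidate angles and keep the one for which the projection of spot positions onto an axis clusters most sharply. The functions `alignment_score`, `estimate_rotation_deg` and `rotate`, the one-unit bin width, and the integer candidate angles are all illustrative assumptions and should not be read as the paper's procedure.

```python
import math


def alignment_score(points, angle_deg):
    """Undo a rotation of angle_deg and score how tightly x-values cluster."""
    a = math.radians(angle_deg)
    counts = {}
    for x, y in points:
        # x-coordinate after rotating the point by -angle_deg.
        x_back = x * math.cos(a) + y * math.sin(a)
        key = round(x_back)  # bin width of one unit (an assumption)
        counts[key] = counts.get(key, 0) + 1
    # Sum of squared bin counts is largest when points pile into few bins.
    return sum(c * c for c in counts.values())


def estimate_rotation_deg(points, candidates):
    """Return the candidate angle whose correction aligns the points best."""
    return max(candidates, key=lambda angle: alignment_score(points, angle))


def rotate(points, angle_deg):
    """Rotate points counterclockwise about the origin."""
    a = math.radians(angle_deg)
    return [
        (x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a))
        for x, y in points
    ]


# Synthetic 5 x 5 grid of spot centers, 10 units apart, tilted by 3 degrees.
grid = [(10 * i, 10 * j) for i in range(5) for j in range(5)]
tilted = rotate(grid, 3)
estimated = estimate_rotation_deg(tilted, range(-5, 6))
```

In a hierarchical scheme of the kind listed above, the same search could in principle be run once on the whole image for global rotation and then on each subarray for local rotation, with a finer angle step at each level.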

Experimental results and comparisons

We implemented our algorithm using the Matlab R14 (The MathWorks, MA 2005) application development package and tested the program on real and synthetic microarray images exhibiting different global or local rotation. The real microarray images are in two-channel, 16-bit TIFF format, generated by a GenePix 4000 scanner, scanning spotted oligonucleotide arrays at the Center for Applied Genomics located in Newark, New Jersey, as shown in Figs. 11(a), (c), (e) and 12(a). The images in Fig. 12

Conclusions

In this paper, we have presented a novel hierarchical refinement algorithm for gridding by applying morphological reconstruction based image processing methods along with global rotation detection, non-parametric optimal threshold selection, subarray rotation detection and local fine-tuning. This method is suitable for performing accurate and fully automatic gridding in microarray image analysis even for images with alignment and rotation problems.

Because this algorithm uses non-parametric

Acknowledgements

The authors would like to thank Dr. Patricia Soteropoulos and Dr. Hui-Yun Wang for providing real microarray images for the current work, and Mr. Anthony Galante and Dr. Barry Cohen for valuable discussions and comments.

References (27)

  • A.P.G. Damiance et al., A dynamic model with adaptive pixel moving for microarray images segmentation, Real-Time Imaging: Special Issue on Imaging in Bioinformatics (2004)

  • V.M. Aris et al., Noise filtering and nonparametric analysis of microarray data underscores discriminating markers of oral, prostate, lung, ovarian and breast cancer, BMC Bioinform. (2004)

  • P. Bajcsy, Gridline: automatic grid alignment in DNA microarray scans, IEEE Trans. Image Process. (2004)

  • P. Baldi et al., Bioinformatics: The Machine Learning Approach (2001)

  • P.O. Brown et al., Exploring the new world of the genome with DNA microarrays, Nat. Genetics (1999)

  • J.F. Canny, A computational approach to edge detection, IEEE Trans. Pattern Anal. Mach. Intelligence (1986)

  • A.N. Jain et al., Fully automatic quantification of microarray image data, Genome Res. (2002)

  • M. Katzer et al., Methods for automatic microarray image segmentation, IEEE Trans. Nanobiosci. (2003)

  • A. Kuklin, Laboratory automation in microarray image processing, Am. Lab. (2000)

  • R. Lukac et al., A multichannel order-statistic technique for cDNA microarray image processing, IEEE Trans. Nanobiosci. (2004)

  • R. Lukac et al., Vector filtering for color imaging, IEEE Signal Processing Magazine: Special Issue on Color Image Processing (2005)

  • D. Marr et al., Theory of edge detection, Proc. Roy. Soc. London (1980)

  • R. Nagarajan, Intensity-based segmentation of microarrays images, IEEE Trans. Med. Imaging (2003)