Abstract
This paper addresses the problem of classifying the quality of microarray data spots, using concepts derived from supervised learning theory. After extracting spots from the microarray image, the proposed method computes several features that take into account shape, color and variability. The features are classified using support vector machines, a recent and widely employed statistical classification technique. The proposed method makes no assumptions about the problem and requires no a priori information. The proposed system has been tested on a real case, under several different parameter configurations. Experimental results show the effectiveness of the proposed approach, also in comparison with state-of-the-art methods.
Notes
The B-Course method used by the authors to infer the structure of the Naive Bayesian Networks merely represents a feature selection step.
Bleeding can be defined as the phenomenon in which a spot spreads so much that it mixes with its neighbors; it should be carefully avoided.
In classification, only the sign is used, not the magnitude.
Data, together with experiments’ descriptions, data specifications, figures, experts’ classifications and labels are available on the web site http://sigwww.cs.tut.fi/TICSP/SpotQuality/.
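The note above on using only the sign of the classifier output can be illustrated with a minimal sketch. It assumes a linear decision function f(x) = w · x + b; the weights, bias and two-dimensional feature vectors are hypothetical, hand-picked values and are not taken from the paper or its data.

```python
def decision_value(weights, bias, features):
    """Signed score f(x) = w . x + b of a linear classifier (hypothetical parameters)."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def classify(weights, bias, features):
    """Return +1 or -1 using only the sign of the decision value, not its magnitude."""
    return 1 if decision_value(weights, bias, features) >= 0 else -1


# Hypothetical parameters and spot feature vectors, chosen only for illustration.
weights = [0.8, -0.5]
bias = -0.1
near_boundary = [0.5, 0.4]      # small positive score
far_from_boundary = [3.0, 0.2]  # large positive score
other_side = [0.1, 1.0]         # negative score

labels = [classify(weights, bias, x)
          for x in (near_boundary, far_from_boundary, other_side)]
print(labels)
```

The first two vectors receive the same label even though their scores differ considerably in magnitude, which is the behavior the note describes.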
Acknowledgements
The authors would like to thank Dr. Sampsa Hautaniemi of Tampere University of Technology (Finland) for kindly supplying the microarray data and the features used for testing. The authors would also like to thank Dr. S. Barbi and Prof. A. Scarpa of the Department of Pathology of the University of Verona (Italy) for helpful discussions. Finally, the authors would like to thank G.E. Felis for carefully reading the paper.
Cite this article
Bicego, M., Del Rosario Martinez, M. & Murino, V. A supervised data-driven approach for microarray spot quality classification. Pattern Anal Applic 8, 181–187 (2005). https://doi.org/10.1007/s10044-005-0254-5
DOI: https://doi.org/10.1007/s10044-005-0254-5