Abstract
This paper proposes a non-parametric method for classifying thin-layer chromatographic (TLC) images from patterns represented in a dissimilarity space. Each pattern corresponds to a mixture-of-Gaussians (MoG) approximation of the intensity profile. The methodology comprises several phases: image processing and analysis steps that extract the chromatographic profiles, followed by a classification phase that discriminates between two groups, one corresponding to normal cases and the other to three pathological classes. We present an extensive study of several dissimilarity-based approaches, analysing the influence of the dissimilarity measure and of the prototype selection method on classification performance. The main conclusions are that the Match and Profile-difference dissimilarity measures give better results, and that a new prototype selection methodology achieves performance similar to, or even better than, that of conventional methods. Furthermore, the simplest classifiers, such as k-NN and linear discriminant classifiers (LDCs), perform well, with an overall classification error below 10% for the four-class problem.
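The idea of classifying profiles in a dissimilarity space can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the synthetic single-peak profiles, the mean-absolute-difference measure (a stand-in, not the Match or Profile-difference measures studied in the paper), the arbitrary choice of the first two samples of each class as prototypes (not the prototype selection method proposed in the paper), and the helper names `profile_dissimilarity`, `dissimilarity_space` and `loo_1nn_error`.

```python
import numpy as np

def profile_dissimilarity(a, b):
    """Illustrative stand-in measure: mean absolute difference of two profiles."""
    return float(np.mean(np.abs(a - b)))

def dissimilarity_space(profiles, prototypes):
    """Represent each profile by its dissimilarities to the prototypes."""
    return np.array([[profile_dissimilarity(p, r) for r in prototypes]
                     for p in profiles])

def loo_1nn_error(X, y):
    """Leave-one-out error of a 1-NN rule (Euclidean distance) on the rows of X."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample may not be its own neighbour
    predicted = y[np.argmin(d, axis=1)]
    return float(np.mean(predicted != y))

def gaussian_profile(x, centre, width=0.05):
    """Single Gaussian band used to synthesise a toy intensity profile."""
    return np.exp(-0.5 * ((x - centre) / width) ** 2)

# Synthetic two-class data (hypothetical): one band per profile, with the
# band position depending on the class, plus small jitter and noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
profiles, labels = [], []
for label, centre in {0: 0.3, 1: 0.7}.items():
    for _ in range(20):
        jitter = rng.normal(0.0, 0.02)
        noise = rng.normal(0.0, 0.01, x.size)
        profiles.append(gaussian_profile(x, centre + jitter) + noise)
        labels.append(label)
profiles = np.array(profiles)
labels = np.array(labels)

# Arbitrary prototypes: the first two samples of each class.
prototype_idx = np.array([0, 1, 20, 21])
D = dissimilarity_space(profiles, profiles[prototype_idx])
error = loo_1nn_error(D, labels)
print(f"dissimilarity space shape: {D.shape}, LOO 1-NN error: {error:.3f}")
```

Each row of `D` is the new feature vector of one profile, so any conventional classifier (k-NN, LDC and so on) can then be trained on `D` instead of on the raw profiles.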










Abbreviations
- TLC: Thin-layer chromatography
- MoG: Mixture of Gaussians
- LDS: Lysosomal storage disorders
- k-NN: k-Nearest neighbour
- SIMCA: Soft independent modelling of class analogy
- LDA: Linear discriminant analysis
- WMDP: Within-class most dissimilar prototype
- WLDP: Within-class less dissimilar prototype
- BMDP: Between-class most dissimilar prototype
- BLDP: Between-class less dissimilar prototype
- ROI: Image region of interest
- ML: Maximum likelihood
- EM: Expectation–maximization
- GM1: Gangliosidosis
- HUR: Hurler disorder
- MAN: Mannosidosis
- LDC: Linear discriminant classifier
- QDC: Quadratic discriminant classifier
- SVM: Support vector machines
- LOO: Leave-one-out
Cite this article
Sousa, A.V., Mendonça, A.M. & Campilho, A. Dissimilarity-based classification of chromatographic profiles. Pattern Anal Applic 11, 409–423 (2008). https://doi.org/10.1007/s10044-008-0113-2
DOI: https://doi.org/10.1007/s10044-008-0113-2