Dissimilarity-based classification of chromatographic profiles

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

This paper proposes a non-parametric method for classifying thin-layer chromatographic (TLC) images from patterns represented in a dissimilarity space. Each pattern corresponds to a mixture-of-Gaussians approximation of the intensity profile. The methodology comprises several phases: image processing and analysis steps to extract the chromatographic profiles, followed by a classification phase that discriminates between two groups, one corresponding to normal cases and the other to three pathological classes. We present an extensive study of several dissimilarity-based approaches, analysing the influence of the dissimilarity measure and the prototype selection method on classification performance. The main conclusions are that the Match and Profile-difference dissimilarity measures give better results, and that a new prototype selection methodology achieves performance similar to or even better than conventional methods. Furthermore, we conclude that the simplest classifiers, such as k-NN and linear discriminant classifiers (LDCs), perform well, with an overall classification error below 10% for the four-class problem.
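The general idea of classifying profiles in a dissimilarity space can be illustrated with a minimal Python sketch. It is not the authors' implementation: the mean-absolute-difference measure stands in for the Match and Profile-difference measures, whose definitions are not given in the abstract, the prototypes are picked by hand rather than by any of the prototype selection methods studied, and all function names, mixture parameters and the two toy class labels are hypothetical.

```python
import math

def mog_profile(components, n_points=50):
    """Sample a mixture-of-Gaussians curve on [0, 1].

    components: list of (weight, mean, std) tuples (illustrative values only).
    """
    xs = [i / (n_points - 1) for i in range(n_points)]
    norm = math.sqrt(2 * math.pi)
    return [
        sum(w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * norm)
            for w, m, s in components)
        for x in xs
    ]

def profile_dissimilarity(p, q):
    """Mean absolute difference between two sampled profiles (assumed measure)."""
    return sum(abs(a - b) for a, b in zip(p, q)) / len(p)

def to_dissimilarity_space(profile, prototypes):
    """Represent a profile by its dissimilarities to each prototype."""
    return [profile_dissimilarity(profile, r) for r in prototypes]

def knn_predict(x, train_vectors, train_labels, k=1):
    """Plain k-NN with Euclidean distance in the dissimilarity space."""
    order = sorted(range(len(train_vectors)),
                   key=lambda i: math.dist(x, train_vectors[i]))
    votes = {}
    for i in order[:k]:
        votes[train_labels[i]] = votes.get(train_labels[i], 0) + 1
    return max(votes, key=votes.get)

# Toy training set: single-peak profiles whose peak position depends on the class.
train = [
    (mog_profile([(1.0, 0.30, 0.05)]), "normal"),
    (mog_profile([(1.0, 0.32, 0.05)]), "normal"),
    (mog_profile([(1.0, 0.70, 0.05)]), "pathological"),
    (mog_profile([(1.0, 0.68, 0.05)]), "pathological"),
]

# One hand-picked prototype per class (a placeholder for prototype selection).
prototypes = [train[0][0], train[2][0]]

train_vectors = [to_dissimilarity_space(p, prototypes) for p, _ in train]
train_labels = [label for _, label in train]

query = mog_profile([(1.0, 0.31, 0.05)])
predicted = knn_predict(to_dissimilarity_space(query, prototypes),
                        train_vectors, train_labels)
```

Each profile ends up as a vector with one coordinate per prototype, so any conventional classifier can be trained on these vectors; the choice of measure and of prototypes, which the sketch fixes arbitrarily, is what the study varies.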

Abbreviations

TLC: Thin-layer chromatography
MoG: Mixture of Gaussians
LDS: Lysosomal storage disorders
k-NN: k-Nearest neighbour
SIMCA: Soft independent modelling of class analogy
LDA: Linear discriminant analysis
WMDP: Within-class most dissimilar prototype
WLDP: Within-class less dissimilar prototype
BMDP: Between-class most dissimilar prototype
BLDP: Between-class less dissimilar prototype
ROI: Image region of interest
ML: Maximum likelihood
EM: Expectation–maximization
GM1: Gangliosidosis
HUR: Hurler disorder
MAN: Mannosidosis
LDC: Linear discriminant classifier
QDC: Quadratic discriminant classifier
SVM: Support vector machines
LOO: Leave-one-out

Author information

Corresponding author

Correspondence to António V. Sousa.

About this article

Cite this article

Sousa, A.V., Mendonça, A.M. & Campilho, A. Dissimilarity-based classification of chromatographic profiles. Pattern Anal Applic 11, 409–423 (2008). https://doi.org/10.1007/s10044-008-0113-2

  • DOI: https://doi.org/10.1007/s10044-008-0113-2
