Abstract
Genetic defects in humans are uncovered by studying chromosomes, which carry the genetic information. Chromosomes are non-rigid objects and appear in different orientations when imaged. Before genetic defects can be identified, the chromosome images must be pre-processed so that the chromosomes are not touching, overlapping, or bent, and so that noise is discarded, because bends, overlaps, and touches make genetic abnormalities difficult to detect. There is therefore a need for an efficient technique that classifies segmented chromosomes into different types so that each type can then be pre-processed to correct its orientation. This work presents a hybrid classification technique, based on correlation-based feature selection and classification via regression, that classifies segmented chromosomes into five categories: straight, overlapping, bent, touching, or noise. Performance was evaluated on 1592 segmented chromosomes from the Advanced Digital Imaging Research data set, and an overall accuracy of 94.78 % was obtained for the five-class problem. The proposed classifier was compared with Bayes Net, Naïve Bayes, Radial Basis Feed Forward Network, and k-nearest-neighbour classifiers. Based on this categorization, different pre-processing techniques will be applied to correct the orientation of the chromosomes.
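The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions: absolute Pearson correlation stands in for the correlation measure inside the CFS merit, ordinary least squares stands in for the base regressor (the abstract does not name one), and the feature matrix is synthetic toy data rather than geometric features from the chromosome data set. All function and variable names are hypothetical.

```python
import numpy as np

CLASSES = ["straight", "overlapping", "bent", "touching", "noise"]


def cfs_merit(X, Y, subset):
    """CFS merit k*r_cf / sqrt(k + k*(k-1)*r_ff) for a list of feature indices."""
    k = len(subset)
    # Feature-class correlation: mean |Pearson r| against each one-hot class column
    # (an assumed simplification of the measure used in CFS).
    r_cf = np.mean([abs(np.corrcoef(X[:, f], Y[:, c])[0, 1])
                    for f in subset for c in range(Y.shape[1])])
    # Feature-feature correlation: mean |Pearson r| over all pairs in the subset.
    pairs = [abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
             for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = np.mean(pairs) if pairs else 0.0
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(X, Y):
    """Greedy forward search: add the feature that most improves the merit."""
    selected, best = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining:
        merit, f = max((cfs_merit(X, Y, selected + [f]), f) for f in remaining)
        if merit <= best:
            break  # no candidate improves the merit any further
        selected.append(f)
        remaining.remove(f)
        best = merit
    return selected


def fit_cvr(X, Y):
    """Classification via regression: one least-squares regressor per class indicator."""
    A = np.hstack([X, np.ones((len(X), 1))])  # append an intercept column
    W, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return W


def predict_cvr(W, X):
    """Predict the class whose regressor gives the largest output."""
    A = np.hstack([X, np.ones((len(X), 1))])
    return np.argmax(A @ W, axis=1)


# Synthetic toy data (NOT chromosome measurements): 4 informative features,
# 1 near-duplicate of feature 0, and 2 pure-noise features.
rng = np.random.default_rng(0)
n = 500
labels = np.arange(n) % len(CLASSES)
centers = 3.0 * rng.normal(size=(len(CLASSES), 4))
informative = centers[labels] + 0.5 * rng.normal(size=(n, 4))
redundant = informative[:, [0]] + 0.05 * rng.normal(size=(n, 1))
X = np.hstack([informative, redundant, rng.normal(size=(n, 2))])
Y = np.eye(len(CLASSES))[labels]  # one-hot class indicators

# Deterministic stratified split: every fifth block of five rows is held out.
is_test = (np.arange(n) // 5) % 5 == 0
train_idx, test_idx = np.where(~is_test)[0], np.where(is_test)[0]

selected = cfs_forward_select(X[train_idx], Y[train_idx])
W = fit_cvr(X[train_idx][:, selected], Y[train_idx])
pred = predict_cvr(W, X[test_idx][:, selected])
accuracy = float(np.mean(pred == labels[test_idx]))
print("selected features:", selected)
print("held-out accuracy:", accuracy)
```

The merit rewards features that correlate with the class labels and penalises features that correlate with each other, which is why the search tends to skip a near-duplicate once its counterpart is selected. The accuracy printed here reflects only the toy data and says nothing about the 94.78 % reported in the abstract.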
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11517-016-1553-2/MediaObjects/11517_2016_1553_Fig1_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11517-016-1553-2/MediaObjects/11517_2016_1553_Fig2_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11517-016-1553-2/MediaObjects/11517_2016_1553_Fig3_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11517-016-1553-2/MediaObjects/11517_2016_1553_Fig4_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11517-016-1553-2/MediaObjects/11517_2016_1553_Fig5_HTML.gif)
Ethics declarations
Conflict of interest
The authors have no conflict of interest.
About this article
Cite this article
Arora, T., Dhir, R. Correlation-based feature selection and classification via regression of segmented chromosomes using geometric features. Med Biol Eng Comput 55, 733–745 (2017). https://doi.org/10.1007/s11517-016-1553-2
DOI: https://doi.org/10.1007/s11517-016-1553-2