Abstract
The performance of a classifier depends on the quality of the feature vectors extracted from the dataset. This paper presents a novel method for feature extraction from genome sequences that combines Chaos Game Representation (CGR) with the Hurst exponent. CGR maps genome sequences into fractal images, and the Hurst exponent acts as a quantifier for those images. The suitability of the new feature vector is tested by classifying eight categories of eukaryotic genomes obtained from NCBI. The classification results indicate that applying the Hurst exponent to the CGR images of genome sequences extracts signature features representative of the underlying sequences, which supports Hurst CGR (HCGR) as a new feature for genome classification.
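The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the Hurst exponent is applied to the CGR image, so the sketch assumes a rescaled-range (R/S) estimate taken over the x and y coordinate series of the CGR points. The corner layout (A, C, G, T on the unit square) and the names `cgr_points`, `hurst_rs`, and `hcgr_features` are likewise assumptions made for illustration.

```python
import numpy as np

# Assumed corner layout on the unit square (one common CGR convention).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return an (n, 2) array of CGR coordinates for a DNA string."""
    pos = np.array([0.5, 0.5])  # start at the centre of the square
    pts = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        # Move halfway from the current point to the corner of this base.
        pos = (pos + np.array(CORNERS[base])) / 2.0
        pts.append(pos)
    return np.array(pts)


def hurst_rs(series, min_window=8):
    """Estimate the Hurst exponent of a 1-D series by R/S analysis."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    log_w, log_rs = [], []
    w = min_window
    while w <= n // 2:
        rs_vals = []
        for start in range(0, n - w + 1, w):
            chunk = x[start:start + w]
            dev = np.cumsum(chunk - chunk.mean())  # cumulative deviation
            r = dev.max() - dev.min()              # range of the deviation
            s = chunk.std()                        # standard deviation
            if s > 0:
                rs_vals.append(r / s)
        if rs_vals:
            log_w.append(np.log(w))
            log_rs.append(np.log(np.mean(rs_vals)))
        w *= 2
    if len(log_w) < 2:
        raise ValueError("series too short for R/S analysis")
    # The slope of log(R/S) against log(window size) is the estimate.
    slope, _ = np.polyfit(log_w, log_rs, 1)
    return float(slope)


def hcgr_features(seq):
    """Hypothetical two-element feature vector: Hurst of x and y series."""
    pts = cgr_points(seq)
    return np.array([hurst_rs(pts[:, 0]), hurst_rs(pts[:, 1])])
```

A vector such as the one returned by `hcgr_features` could then be passed to any standard classifier. The feature actually used in the paper may be constructed differently.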
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nair, V.V., Mallya, A., Sebastian, B., Elizabeth, I., Nair, A.S. (2011). Hurst CGR (HCGR) – A Novel Feature Extraction Method from Chaos Game Representation of Genomes. In: Abraham, A., Lloret Mauri, J., Buford, J.F., Suzuki, J., Thampi, S.M. (eds) Advances in Computing and Communications. ACC 2011. Communications in Computer and Information Science, vol 190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22709-7_31
DOI: https://doi.org/10.1007/978-3-642-22709-7_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22708-0
Online ISBN: 978-3-642-22709-7
eBook Packages: Computer Science, Computer Science (R0)