Abstract
Recent experiments have shown that some types of RNA can control gene expression and phenotype by themselves, in addition to their traditional role in enabling protein synthesis. Roughly speaking, RNAs can be divided into two classes: mRNAs, which are translated into proteins, and non-coding RNAs (ncRNAs), which play several important cellular roles other than protein coding. In recent years, many computational methods based on different theories and models have been proposed to distinguish mRNAs from ncRNAs. In particular, the Self-Organizing Map (SOM), a neural network model, is time-efficient to train and straightforward to implement, and the number of classes used to cluster the input data can easily be increased. In this work, we propose a method for identifying non-coding RNAs using Self-Organizing Maps, named SOM-PORTRAIT. We implemented the method and applied it to a data set containing assembled ESTs of the Paracoccidioides brasiliensis fungus transcriptome. The results were promising, with the advantage that SOM-PORTRAIT requires much less time and memory than methods based on the Support Vector Machine (SVM) paradigm, such as PORTRAIT.
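To make the SOM idea concrete, the following is a minimal sketch of generic SOM training in plain Python. It is not the SOM-PORTRAIT implementation: the abstract does not specify the sequence features, map size, or training parameters, so the function names (`train_som`, `best_matching_unit`), the 2-D toy vectors, the exponential decay schedule, and all default values are assumptions chosen only for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return (row, col) of the unit whose weight vector is closest to x."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(x, w))  # squared Euclidean distance
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows, cols, epochs=30, lr0=0.5, seed=0):
    """Train a rows x cols rectangular SOM on vectors scaled to [0, 1].

    All hyperparameters here are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0
    # One weight vector per map unit, initialised uniformly in [0, 1).
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius shrink as training proceeds.
        decay = math.exp(-3.0 * epoch / epochs)
        lr = lr0 * decay
        sigma = max(sigma0 * decay, 0.3)
        samples = list(data)
        rng.shuffle(samples)
        for x in samples:
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighbourhood centred on the winning unit.
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
    return weights


if __name__ == "__main__":
    # Hypothetical, already-normalised feature vectors for four sequences.
    toy = [[0.1, 0.2], [0.15, 0.1], [0.9, 0.8], [0.85, 0.95]]
    som = train_som(toy, rows=2, cols=2)
    for x in toy:
        print(x, "->", best_matching_unit(som, x))
```

In a classifier built this way, each sequence would be assigned to its best-matching unit, and units could be labelled from training sequences of known class; enlarging the map (`rows`, `cols`) is one way to allow more clusters without changing the training procedure.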
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Silva, T.C., Berger, P.A., Arrial, R.T., Togawa, R.C., Brigido, M.M., Walter, M.E.M.T. (2009). SOM-PORTRAIT: Identifying Non-coding RNAs Using Self-Organizing Maps. In: Guimarães, K.S., Panchenko, A., Przytycka, T.M. (eds) Advances in Bioinformatics and Computational Biology. BSB 2009. Lecture Notes in Computer Science, vol 5676. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03223-3_7
DOI: https://doi.org/10.1007/978-3-642-03223-3_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03222-6
Online ISBN: 978-3-642-03223-3
eBook Packages: Computer Science, Computer Science (R0)