Abstract
Oligo kernels for biological sequence classification have high discriminative power. A new parameterization of the K-mer oligo kernel is presented in which every oligomer of length K is weighted individually. Choosing these parameters for the task at hand increases classification performance and reveals which features are discriminative. To adapt the many kernel parameters based on cross-validation, the covariance matrix adaptation evolution strategy (CMA-ES) is proposed. It is applied to optimize the trimer oligo kernel for detecting prokaryotic translation initiation sites. The resulting kernel yields higher classification rates, and the adapted parameters reveal how important particular triplets are for classification, for example those occurring in the Shine-Dalgarno sequence.
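To make the idea of individually weighted oligomers concrete, the following is a minimal sketch, not the authors' implementation. It assumes one common formulation of the oligo kernel, in which each occurrence of a K-mer is smoothed by a Gaussian over sequence positions, and it assumes that the per-oligomer "weight" is a smoothing parameter sigma for that K-mer. The function names, the `sigma` dictionary, and the default value of 1.0 are all hypothetical choices made for illustration.

```python
import math


def kmer_positions(seq, k):
    """Map each k-mer in seq to the list of positions where it starts."""
    positions = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)
    return positions


def weighted_oligo_kernel(s, t, k, sigma, default_sigma=1.0):
    """Oligo-style kernel with one smoothing parameter per k-mer.

    Assumed form (illustrative, see lead-in):
        sum over k-mers w of  sqrt(pi) * sigma_w *
        sum over positions p in s, q in t of exp(-(p - q)^2 / (4 * sigma_w^2))

    sigma is a dict from k-mer to a positive smoothing parameter;
    k-mers missing from the dict use default_sigma.
    """
    pos_s = kmer_positions(s, k)
    pos_t = kmer_positions(t, k)
    total = 0.0
    for kmer, ps in pos_s.items():
        qs = pos_t.get(kmer)
        if not qs:
            continue  # k-mer absent from t contributes nothing
        sg = sigma.get(kmer, default_sigma)
        inner = sum(
            math.exp(-((p - q) ** 2) / (4.0 * sg ** 2))
            for p in ps
            for q in qs
        )
        total += math.sqrt(math.pi) * sg * inner
    return total


# Usage: compare two short sequences with a trimer kernel (K = 3),
# giving the triplet "AGG" a wider positional tolerance than the rest.
value = weighted_oligo_kernel("TAAGGAGGT", "AGGAGGTTA", 3, {"AGG": 2.0})
```

Under these assumptions, a larger sigma for a triplet lets its occurrences match across larger positional offsets, while a very small sigma demands near-exact positional agreement. An optimizer such as CMA-ES could then search over the vector of sigma values, using cross-validated classification error as the objective; that outer loop is omitted here.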
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mersch, B., Glasmachers, T., Meinicke, P., Igel, C. (2006). Evolutionary Optimization of Sequence Kernels for Detection of Bacterial Gene Starts. In: Kollias, S., Stafylopatis, A., Duch, W., Oja, E. (eds) Artificial Neural Networks – ICANN 2006. ICANN 2006. Lecture Notes in Computer Science, vol 4132. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11840930_86
DOI: https://doi.org/10.1007/11840930_86
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38871-5
Online ISBN: 978-3-540-38873-9
eBook Packages: Computer Science (R0)