Abstract
Rapid progress in proteomics has increased interest in the full characterization of glycoproteins, for which tandem mass spectrometry is a useful technique. A common problem with current bioinformatics tools for the automated interpretation of tandem mass spectra of glycoproteins is that they often return many candidate oligosaccharide structures with very close scores. We propose an alternative approach in which stochastic context-free graph grammars are used to model oligosaccharide structures. The stochastic model takes the structures of known glycans in a library as input and uses them to train the probability parameters of the grammar. After training, the method uses the learned rules to predict the structure of a glycan given the composition of an unknown glycoprotein. Preliminary results show that integrating this modelling with the automated interpretation software GlycoMaster can elucidate oligosaccharide structures from tandem mass spectra very accurately. This paper describes the stochastic graph grammars used to model glycoproteins.
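To make the train-then-predict idea concrete, the following is a minimal sketch under strong simplifying assumptions, not the method of the paper. It treats each glycan as a rooted tree rather than a graph, estimates rule probabilities by relative frequency over a toy library, and uses the result to separate two candidate structures that share the same composition. The residue labels, the toy library, and every function name (`productions`, `train`, `log_score`, `composition`) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Assumption: a glycan is a rooted tree written as (residue, [child subtrees]).
# The paper uses graph grammars; a tree grammar is swapped in here for brevity.

def leaf(residue):
    """Build a node with no children."""
    return (residue, [])

def productions(tree):
    """Yield one (parent, sorted child residues) rule per node of the tree."""
    residue, children = tree
    yield residue, tuple(sorted(child[0] for child in children))
    for child in children:
        yield from productions(child)

def train(library):
    """Estimate rule probabilities by relative frequency per parent residue."""
    counts = defaultdict(Counter)
    for tree in library:
        for parent, kids in productions(tree):
            counts[parent][kids] += 1
    return {
        parent: {kids: n / sum(rules.values()) for kids, n in rules.items()}
        for parent, rules in counts.items()
    }

def log_score(tree, probs, floor=1e-6):
    """Sum log rule probabilities; rules never seen in training get `floor`."""
    return sum(
        math.log(probs.get(parent, {}).get(kids, floor))
        for parent, kids in productions(tree)
    )

def composition(tree):
    """Count how many times each residue occurs in the tree."""
    return Counter(parent for parent, _ in productions(tree))

# Hypothetical toy library: two branched structures and one shorter linear one.
branched = ("GlcNAc", [("GlcNAc", [("Man", [leaf("Man"), leaf("Man")])])])
short_linear = ("GlcNAc", [("GlcNAc", [("Man", [leaf("Man")])])])
library = [branched, branched, short_linear]
probs = train(library)

# Two candidates with identical composition but different topology.
linear = ("GlcNAc", [("GlcNAc", [("Man", [("Man", [leaf("Man")])])])])
candidates = {"branched": branched, "linear": linear}
ranking = sorted(candidates, key=lambda k: log_score(candidates[k], probs), reverse=True)
print(ranking)
```

In this sketch the two candidates cannot be told apart by composition alone, but the learned rule probabilities favour the topology seen more often in the library. A real system would combine such a prior with the spectrum-matching score rather than use it on its own.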
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shan, B. (2005). Stochastic Context-Free Graph Grammars for Glycoprotein Modelling. In: Domaratzki, M., Okhotin, A., Salomaa, K., Yu, S. (eds) Implementation and Application of Automata. CIAA 2004. Lecture Notes in Computer Science, vol 3317. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30500-2_23
DOI: https://doi.org/10.1007/978-3-540-30500-2_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24318-2
Online ISBN: 978-3-540-30500-2
eBook Packages: Computer Science, Computer Science (R0)