Abstract
Searching top-down spectra against a protein database is a mainstream method for intact protein identification. Ranking true Protein-Spectrum Matches (PrSMs) above their false counterparts is a practical way to improve protein identification results. In this paper, we propose a novel model called RPML (Rerank PrSMs based on Machine Learning) to rerank PrSMs in top-down proteomics. Experimental results on real data sets show that RPML can distinguish more correct PrSMs from incorrect ones. The source code of the algorithm is available at https://github.com/dqiong/spectra_protein_match_rerank.
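The general idea of learning-based reranking can be illustrated with a minimal sketch. It is not the RPML model itself: the abstract does not specify the learner or the features, so the logistic regression, the two features (fraction of matched fragments and a scaled mass error), the labels, and all names below are assumptions chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, features):
    """Probability that a PrSM is correct; weights[0] is the bias term."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], features))
    return sigmoid(z)

def train_logistic(rows, labels, lr=0.1, epochs=500):
    """Fit logistic-regression weights with plain batch gradient descent."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict(weights, x) - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

def rerank(prsms, weights):
    """Return the PrSMs sorted by predicted probability, best first."""
    return sorted(prsms, key=lambda p: predict(weights, p["features"]), reverse=True)

# Hypothetical training set: label 1 = assumed-correct PrSM, 0 = assumed-incorrect.
# Features: (fraction of matched fragments, scaled mass error). Both are assumptions.
train_rows = [(0.9, 0.1), (0.8, 0.3), (0.7, 0.2), (0.2, 0.9), (0.3, 0.7), (0.1, 0.8)]
train_labels = [1, 1, 1, 0, 0, 0]
weights = train_logistic(train_rows, train_labels)

# Hypothetical candidate PrSMs, listed in their original search-engine order.
candidates = [
    {"id": "b", "features": (0.2, 0.8)},
    {"id": "a", "features": (0.8, 0.2)},
    {"id": "c", "features": (0.5, 0.5)},
]
ranked = rerank(candidates, weights)
print([p["id"] for p in ranked])
```

In this toy setting the learned score replaces the original ordering, so a PrSM with many matched fragments and a small mass error moves ahead of one that the search engine happened to list first.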
Acknowledgements
This work was partially supported by the Natural Science Foundation of China (Nos. 61572094, 61502071), the Fundamental Research Funds for the Central Universities (Nos. DUT2017TB02, DUT14QY07) and the Science-Technology Foundation for Youth of Guizhou Province (No. KY[2017]250).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Duan, Q., Liang, H., Sheng, C., Wu, J., Xu, B., He, Z. (2018). RPML: A Learning-Based Approach for Reranking Protein-Spectrum Matches. In: Huang, DS., Bevilacqua, V., Premaratne, P., Gupta, P. (eds) Intelligent Computing Theories and Application. ICIC 2018. Lecture Notes in Computer Science, vol 10954. Springer, Cham. https://doi.org/10.1007/978-3-319-95930-6_54
DOI: https://doi.org/10.1007/978-3-319-95930-6_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95929-0
Online ISBN: 978-3-319-95930-6
eBook Packages: Computer Science, Computer Science (R0)