RPML: A Learning-Based Approach for Reranking Protein-Spectrum Matches

  • Conference paper
Intelligent Computing Theories and Application (ICIC 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10954)

Abstract

Searching top-down spectra against a protein database is a mainstream method for intact protein identification. Ranking true Protein-Spectrum Matches (PrSMs) above their false counterparts is a feasible way to improve protein identification results. In this paper, we propose a novel model called RPML (Rerank PrSMs based on Machine Learning) to rerank PrSMs in top-down proteomics. Experimental results on real data sets show that RPML can distinguish more correct PrSMs from incorrect ones. The source code of the algorithm is available at https://github.com/dqiong/spectra_protein_match_rerank.

Acknowledgements

This work was partially supported by the Natural Science Foundation of China (Nos. 61572094, 61502071), the Fundamental Research Funds for the Central Universities (Nos. DUT2017TB02, DUT14QY07) and the Science-Technology Foundation for Youth of Guizhou Province (No. KY[2017]250).

Author information

Corresponding authors

Correspondence to Bo Xu or Zengyou He.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Duan, Q., Liang, H., Sheng, C., Wu, J., Xu, B., He, Z. (2018). RPML: A Learning-Based Approach for Reranking Protein-Spectrum Matches. In: Huang, DS., Bevilacqua, V., Premaratne, P., Gupta, P. (eds) Intelligent Computing Theories and Application. ICIC 2018. Lecture Notes in Computer Science, vol 10954. Springer, Cham. https://doi.org/10.1007/978-3-319-95930-6_54

  • DOI: https://doi.org/10.1007/978-3-319-95930-6_54

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-95929-0

  • Online ISBN: 978-3-319-95930-6

  • eBook Packages: Computer Science, Computer Science (R0)
