Beating the Noise: New Statistical Methods for Detecting Signals in MALDI-TOF Spectra Below Noise Level

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4216)

Abstract

Background: The computer-assisted detection of small molecules by mass spectrometry provides a snapshot of thousands of peptides, protein fragments and proteins in biological samples. This new analytical technology has the potential to identify disease-associated proteomic patterns in blood serum. However, the presently available bioinformatic tools are not sensitive enough to identify clinically important low-abundance proteins, such as hormones or tumor markers, that occur only at low blood concentrations.

Aim: Find, analyze and compare serum proteome patterns in groups of human subjects that differ in properties such as disease status, using a new workflow that enhances sensitivity and specificity.

Problems: Mass data acquired from high-throughput platforms are frequently blurred and noisy. This complicates the reliable identification of peaks in general, and of very small peaks below noise level in particular. However, this limitation holds only when a single spectrum or a few spectra are available. If the algorithm has access to a large number of spectra (e.g. N > 1000), new possibilities arise, one of them being a statistical approach.
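A short worked calculation may help show why a large N changes the picture. It is a textbook averaging argument under assumptions that the abstract does not state: at a fixed m/z position every spectrum carries the same peak height s, the noise terms are independent with zero mean and standard deviation σ, and the spectra are aligned. It is not the authors' model; the symbols s, σ and SNR are introduced here only for illustration.

```latex
\begin{align*}
x_i &= s + \varepsilon_i, \qquad
  \mathbb{E}[\varepsilon_i] = 0, \quad
  \operatorname{Var}(\varepsilon_i) = \sigma^2, \quad i = 1, \dots, N \\
\bar{x} &= \frac{1}{N} \sum_{i=1}^{N} x_i, \qquad
  \mathbb{E}[\bar{x}] = s, \quad
  \operatorname{Var}(\bar{x}) = \frac{\sigma^2}{N} \\
\mathrm{SNR}_N &= \frac{s}{\sigma / \sqrt{N}} = \sqrt{N}\,\frac{s}{\sigma} \\
\text{Example: } s &= 0.5\,\sigma,\ N = 1000
  \;\Rightarrow\; \mathrm{SNR}_{1000} = 0.5 \sqrt{1000} \approx 15.8
\end{align*}
```

Under these assumptions, a peak at half the single-spectrum noise level stands roughly 16 standard errors above zero once 1000 spectra are pooled. In real data, misalignment, correlated noise, or a peak present in only some of the spectra would reduce this gain.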

Approach: Apply signal preprocessing steps, followed by statistical analyses of the blurred data and of the region below the typical noise threshold, to identify signals usually hidden below this “barrier”.
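The two stages named above (preprocessing, then statistics across many spectra) can be sketched as follows. This is a minimal illustration under stated assumptions, not the workflow developed in the paper: the running-median baseline, the per-channel z-score, the threshold of 5.0, the window of 51 points, and all function names (`remove_baseline`, `channel_z_scores`, `detect_channels`, `simulate_spectra`) are hypothetical choices made for this example. The simulated data use a flat offset as the baseline and one weak peak at half the noise level.

```python
import numpy as np

def remove_baseline(spectrum, window=51):
    """Subtract a running-median baseline (window must be odd).

    Illustrative stand-in for a preprocessing step; reflect padding keeps
    the output the same length as the input.
    """
    half = window // 2
    padded = np.pad(spectrum, half, mode="reflect")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    return spectrum - np.median(windows, axis=1)

def channel_z_scores(spectra):
    """Per-channel mean across spectra divided by its standard error."""
    n_spectra = spectra.shape[0]
    mean = spectra.mean(axis=0)
    sem = spectra.std(axis=0, ddof=1) / np.sqrt(n_spectra)
    return mean / sem

def detect_channels(raw_spectra, z_threshold=5.0, window=51):
    """Return indices of channels whose pooled z-score exceeds the threshold."""
    corrected = np.array([remove_baseline(s, window) for s in raw_spectra])
    z = channel_z_scores(corrected)
    return np.flatnonzero(z > z_threshold), z

def simulate_spectra(n_spectra=1500, n_points=500, peak_pos=250,
                     peak_height=0.5, peak_width=3.0, offset=10.0, seed=0):
    """Simulated data (assumption): flat offset, unit-variance Gaussian noise,
    and one Gaussian peak whose height is half the noise standard deviation."""
    rng = np.random.default_rng(seed)
    x = np.arange(n_points)
    peak = peak_height * np.exp(-0.5 * ((x - peak_pos) / peak_width) ** 2)
    noise = rng.normal(0.0, 1.0, size=(n_spectra, n_points))
    return offset + peak + noise

raw = simulate_spectra()
detected, z = detect_channels(raw)
print("channels above threshold:", detected)
```

In a single simulated spectrum the peak is indistinguishable from noise, while the pooled z-score isolates a narrow band of channels around position 250. A running median is unbiased here only because the simulated baseline is flat; a sloped or curved baseline would need a more careful estimator, especially near the spectrum edges.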

Results: A new analysis workflow has been developed that accurately identifies and analyzes peaks and determines their parameters even below noise level, where other tools cannot detect them. A comparison with commercial software has clearly demonstrated this gain in sensitivity. These additional peaks can be used in subsequent steps to build better peak patterns for proteomic pattern analysis. We believe that this new approach will foster the identification of new biomarkers that have not been detectable by most currently available algorithms.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Conrad, T.O.F. et al. (2006). Beating the Noise: New Statistical Methods for Detecting Signals in MALDI-TOF Spectra Below Noise Level. In: Berthold, M.R., Glen, R.C., Fischer, I. (eds) Computational Life Sciences II. CompLife 2006. Lecture Notes in Computer Science, vol 4216. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11875741_12

  • DOI: https://doi.org/10.1007/11875741_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45767-1

  • Online ISBN: 978-3-540-45768-8

  • eBook Packages: Computer Science (R0)