Abstract
Background: The computer-assisted detection of small molecules by mass spectrometry provides a snapshot of thousands of peptides, protein fragments and proteins in biological samples. This new analytical technology has the potential to identify disease-associated proteomic patterns in blood serum. However, the bioinformatic tools presently available are not sensitive enough to identify clinically important low-abundance proteins, such as hormones or tumor markers, that occur only at low concentrations in blood.
Aim: Find, analyze and compare serum proteome patterns across groups of human subjects that differ in properties such as disease status, using a new workflow that enhances sensitivity and specificity.
Problems: Mass spectrometry data acquired from high-throughput platforms are frequently blurred and noisy. This complicates the reliable identification of peaks in general, and of very small peaks below the noise level in particular. However, this limitation holds only when a single spectrum or a few spectra are available. If the algorithm has access to a large number of spectra (e.g. N > 1000), new possibilities arise, one of them being a statistical approach.
Approach: Apply signal preprocessing steps, followed by statistical analyses of the blurred data and of the region below the typical noise threshold, to identify signals usually hidden below this “barrier”.
Results: A new analysis workflow has been developed that accurately identifies and analyzes peaks and determines their parameters, even below the noise level, where other tools cannot detect them. A comparison with commercial software clearly demonstrated this gain in sensitivity. These additional peaks can be used in subsequent steps to build better peak patterns for proteomic pattern analysis. We believe that this new approach will foster the identification of new biomarkers that have not been detectable by most currently available algorithms.
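The abstract does not say which statistical method the workflow uses. As a minimal sketch of why a large number of spectra helps at all, the following Python example assumes the simplest possible case: pre-aligned spectra, independent Gaussian noise with a known standard deviation, and plain point-wise averaging. All constants and function names are hypothetical and chosen only for illustration; this is not the authors' algorithm. Averaging N spectra shrinks the noise by a factor of sqrt(N), so a peak far below the single-spectrum noise level can rise above a threshold set on the averaged spectrum.

```python
import math
import random

# All numbers below are illustrative assumptions, not values from the paper.
N_SPECTRA = 2000    # number of spectra (the abstract suggests N > 1000)
N_POINTS = 200      # data points per spectrum
PEAK_POS = 100      # index of the hidden peak
PEAK_HEIGHT = 0.3   # peak height, well below the single-spectrum noise level
PEAK_WIDTH = 3.0    # Gaussian peak width, in points
NOISE_SD = 1.0      # noise standard deviation in a single spectrum (assumed known)


def simulate_spectra(rng):
    """Return N_SPECTRA noisy copies of one weak Gaussian peak."""
    signal = [
        PEAK_HEIGHT * math.exp(-0.5 * ((i - PEAK_POS) / PEAK_WIDTH) ** 2)
        for i in range(N_POINTS)
    ]
    return [
        [s + rng.gauss(0.0, NOISE_SD) for s in signal]
        for _ in range(N_SPECTRA)
    ]


def mean_spectrum(spectra):
    """Point-wise average over all spectra."""
    return [sum(column) / len(spectra) for column in zip(*spectra)]


def detect_peaks(mean, z=5.0):
    """Indices of local maxima exceeding z standard errors of the mean."""
    threshold = z * NOISE_SD / math.sqrt(N_SPECTRA)
    return [
        i
        for i in range(1, len(mean) - 1)
        if mean[i] > threshold and mean[i - 1] <= mean[i] >= mean[i + 1]
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
spectra = simulate_spectra(rng)
mean = mean_spectrum(spectra)
peaks = detect_peaks(mean)

print("single-spectrum signal-to-noise:", PEAK_HEIGHT / NOISE_SD)
print("averaged signal-to-noise:", PEAK_HEIGHT / (NOISE_SD / math.sqrt(N_SPECTRA)))
print("detected peak indices:", peaks)
```

Real spectra would additionally need baseline correction, alignment and a noise estimate taken from the data, which this sketch deliberately leaves out.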
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Conrad, T.O.F. et al. (2006). Beating the Noise: New Statistical Methods for Detecting Signals in MALDI-TOF Spectra Below Noise Level. In: Berthold, M.R., Glen, R.C., Fischer, I. (eds) Computational Life Sciences II. CompLife 2006. Lecture Notes in Computer Science, vol 4216. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11875741_12
DOI: https://doi.org/10.1007/11875741_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45767-1
Online ISBN: 978-3-540-45768-8
eBook Packages: Computer Science, Computer Science (R0)