Abstract
Pearson correlation is one of the standard measures for comparisons in biomedical analyses, yet much of its potential remains unused. Substantial value is added by transferring Pearson correlation into the framework of adaptive similarity measures and by exploiting properties of its mathematical derivatives. This opens access to optimization-based data models applicable to tasks of attribute characterization, clustering, classification, and visualization. Modern high-throughput measuring equipment creates high demand for the analysis of extensive biomedical data, including spectra and high-resolution gel-electrophoretic images. In this study, cDNA arrays are considered as the data sources of interest. Recent computational methods are presented for the characterization and analysis of these high-dimensional data sets.
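To make the role of the derivatives concrete, the following minimal sketch computes the Pearson correlation r(x, w) of two vectors and its gradient with respect to w, using the standard closed form for centered vectors. The function names `pearson`, `pearson_grad`, and `ascent_step` are illustrative assumptions and are not taken from the chapter; the gradient step is only one simple way such a derivative could drive an optimization-based model.

```python
import math

def _centered(v):
    """Return v with its mean subtracted."""
    m = sum(v) / len(v)
    return [a - m for a in v]

def pearson(x, w):
    """Pearson correlation r(x, w) of two equal-length sequences."""
    xc, wc = _centered(x), _centered(w)
    num = sum(a * b for a, b in zip(xc, wc))
    den = math.sqrt(sum(a * a for a in xc) * sum(b * b for b in wc))
    return num / den

def pearson_grad(x, w):
    """Gradient of r(x, w) with respect to the components of w.

    With centered vectors xc, wc:
        dr/dw_k = xc_k / (|xc| |wc|) - r * wc_k / |wc|^2
    """
    xc, wc = _centered(x), _centered(w)
    sxx = sum(a * a for a in xc)
    sww = sum(b * b for b in wc)
    norm = math.sqrt(sxx * sww)
    r = sum(a * b for a, b in zip(xc, wc)) / norm
    return [a / norm - r * b / sww for a, b in zip(xc, wc)]

def ascent_step(x, w, rate=0.1):
    """One gradient-ascent step moving w toward higher correlation with x.

    The step size `rate` is an arbitrary illustrative choice.
    """
    g = pearson_grad(x, w)
    return [b + rate * gk for b, gk in zip(w, g)]

if __name__ == "__main__":
    x = [1.0, 2.0, 4.0, 3.0]
    w = [2.0, 1.0, 3.0, 5.0]
    print("r before:", pearson(x, w))
    print("r after one step:", pearson(x, ascent_step(x, w)))
```

Because the correlation is invariant to shifting and scaling w, the gradient components sum to zero and the gradient is orthogonal to the centered w; a small step therefore rotates w toward the profile of x rather than merely rescaling it.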
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Strickert, M., Schleif, F.-M., Villmann, T., Seiffert, U. (2009). Unleashing Pearson Correlation for Faithful Analysis of Biomedical Data. In: Biehl, M., Hammer, B., Verleysen, M., Villmann, T. (eds) Similarity-Based Clustering. Lecture Notes in Computer Science, vol 5400. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01805-3_5
DOI: https://doi.org/10.1007/978-3-642-01805-3_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01804-6
Online ISBN: 978-3-642-01805-3
eBook Packages: Computer Science (R0)