Abstract
Biomedical experts are increasingly confronted with what is often called Big Data, an important subclass of high-dimensional data. High-dimensional data analysis can help to find relationships between records and dimensions. However, as data complexity grows, experts are less and less able to cope with it. Mapping high-dimensional data to a smaller number of relevant dimensions is a major challenge due to the curse of dimensionality. Irrelevant, redundant, and conflicting dimensions impair both the effectiveness and the efficiency of analysis. Furthermore, the possible mappings from high- to low-dimensional spaces are ambiguous. For example, the similarity between patients may change depending on which combination of relevant dimensions (subspace) is considered. We show the potential of subspace analysis for the interpretation of high-dimensional medical data. Specifically, we analyze relationships between patients, sets of patient attributes, and outcomes of a vaccination treatment by means of a subspace clustering approach. We present an analysis workflow and discuss future directions for high-dimensional (medical) data analysis and visual exploration.
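The claim that patient similarity depends on the chosen subspace can be illustrated with a minimal sketch. This is not the subspace clustering approach used in the paper: the three patient records, the attribute names (`age`, `bmi`, `crp`, `glucose`), and the values (assumed to be normalized to [0, 1]) are all invented for illustration. The sketch only compares Euclidean nearest neighbors in two different subspaces.

```python
import math

# Hypothetical toy records; attribute names and values are invented
# for illustration and assumed to be normalized to the range [0, 1].
patients = {
    "A": {"age": 0.20, "bmi": 0.30, "crp": 0.90, "glucose": 0.10},
    "B": {"age": 0.25, "bmi": 0.35, "crp": 0.10, "glucose": 0.80},
    "C": {"age": 0.90, "bmi": 0.80, "crp": 0.85, "glucose": 0.15},
}


def subspace_distance(p, q, dims):
    """Euclidean distance between patients p and q, using only the dimensions in dims."""
    return math.sqrt(sum((patients[p][d] - patients[q][d]) ** 2 for d in dims))


def nearest_neighbor(p, dims):
    """Return the patient closest to p when only the subspace dims is considered."""
    others = [q for q in patients if q != p]
    return min(others, key=lambda q: subspace_distance(p, q, dims))


# The same patient has a different nearest neighbor in each subspace.
nn_demographic = nearest_neighbor("A", ["age", "bmi"])
nn_lab = nearest_neighbor("A", ["crp", "glucose"])
print("Nearest to A in {age, bmi}:", nn_demographic)
print("Nearest to A in {crp, glucose}:", nn_lab)
```

In this toy data, patient A resembles B on the first pair of attributes and C on the second, so any single low-dimensional mapping would hide one of the two relationships.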
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Hund, M. et al. (2015). Analysis of Patient Groups and Immunization Results Based on Subspace Clustering. In: Guo, Y., Friston, K., Aldo, F., Hill, S., Peng, H. (eds) Brain Informatics and Health. BIH 2015. Lecture Notes in Computer Science, vol 9250. Springer, Cham. https://doi.org/10.1007/978-3-319-23344-4_35
DOI: https://doi.org/10.1007/978-3-319-23344-4_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23343-7
Online ISBN: 978-3-319-23344-4
eBook Packages: Computer Science, Computer Science (R0)