Computerized Medical Imaging and Graphics
Computer-aided prognosis: Predicting patient and disease outcome via quantitative fusion of multi-scale, multi-modal data☆
Introduction
Most researchers agree that cancer is a complex disease which we do not yet fully understand. Predictive, preventive, and personalized medicine (PPP) has the potential to transform clinical practice, decreasing morbidity due to diseases such as cancer through the integration of multi-scale, multi-modal, and heterogeneous data to determine the probability of an individual contracting certain diseases and/or responding to a specific treatment regimen [3]. In the clinic, two patients with diseases that look very similar often have vastly different outcomes under the same treatment [4], [5]. A part of this difference is undoubtedly patient specific, but a part must also be a result of our limited understanding of the relationship between disease progression and clinical presentation.
An understanding of the interplay between different hierarchies of biological information, from proteins, tissue, and metabolites to imaging, will provide conceptual insights and practical innovations that will profoundly transform people's lives [3], [5], [6]. There is a consensus among clinicians and researchers that a more quantitative approach, using computerized imaging techniques to better understand tumor morphology, combined with the classification of disease into more meaningful molecular subtypes, will lead to better patient care and more effective therapeutics [5], [7], [8]. With the advent of digital pathology [5], [6], [9], multi-functional imaging, mass spectrometry, immunohistochemical, and fluorescent in situ hybridization (FISH) techniques, the acquisition of multiple, orthogonal sources of genomic, proteomic, multi-parametric radiological, and histological information for disease characterization is becoming routine at several institutions [10], [11]. Computerized image analysis and high dimensional data fusion methods will likely constitute an important piece of the prognostic tool-set, enabling physicians to predict which patients may be susceptible to a particular disease as well as to predict disease outcome and survival. These tools will also have important implications in theragnostics [12], [13], [14], the ability to predict how an individual may react to various treatments, thereby (1) providing guidance for developing customized therapeutic drugs and (2) enabling development of preventive treatments for individuals based on their potential health problems. A theragnostic profile that synthesizes various biomarker and imaging tests from different levels of the biological hierarchy (genomic, proteomic, metabolic) could be used to characterize an individual patient and her/his drug treatment outcome.
If multiple sensors or sources are used in the inference process, they could in principle be fused at one of three levels in the hierarchy: (1) raw data-level fusion, (2) feature-level fusion, or (3) decision-level fusion [15], [16]. Several classifier ensemble or multiple classifier schemes have been previously proposed to associate and correlate data at the decision level (combination of decisions (COD)) [17], [18], [19], [20], [21], [22], [23], [24], a much easier task compared to data integration at the raw-data or feature level (combination of features (COF)). Traditional decision fusion based approaches have focused on combining binary decisions, ranks, or probabilistic classifier outputs obtained via classification of each of the k individual data sources Fα(c), α ∈ {1, 2, …, k}, via a Bayesian framework [25], Dempster–Shafer evidence theory [26], fuzzy set theory, or classical decision ensemble schemes, e.g. Adaboost [19], Support Vector Machines (SVM) [18], or Bagging [17]. At a given data scale (e.g. radiological images such as MRI and CT), several researchers [27], [28], [29], [30], [31], [32], [33], [34], [35] have developed techniques for combining imaging data sources (assuming the registration problem has been solved) by simply concatenating the individual image modality attributes FMRI(c) and FCT(c) at every spatial location c to create a combined feature vector [FMRI(c), FCT(c)] which can be input to a classifier. However, when the individual modalities are heterogeneous (image and non-image based) and of different dimensions, e.g. a 256 dimensional vectorial spectral signal FMRS(c) and a scalar image intensity value FMRI(c), a simple concatenation [FMRI(c), FMRS(c)] will not provide a meaningful data fusion solution.
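The contrast between homogeneous and heterogeneous feature-level fusion can be made concrete with a small sketch (a toy illustration on synthetic values, not any particular dataset): concatenating two scalar image channels is well-posed, but naively concatenating a scalar intensity with a 256-dimensional spectral signature lets the larger modality dominate any distance-based classifier.

```python
import numpy as np

# Toy per-voxel features at N spatial locations c (synthetic values).
n_voxels = 4
f_mri = np.random.rand(n_voxels, 1)   # F_MRI(c): scalar MRI intensity
f_ct = np.random.rand(n_voxels, 1)    # F_CT(c): scalar CT intensity

# Homogeneous COF fusion: concatenate attributes at each location c
# to form [F_MRI(c), F_CT(c)], which can be fed to a classifier.
fused = np.hstack([f_mri, f_ct])      # shape (N, 2)

# Heterogeneous case: a 256-dimensional spectral signature F_MRS(c).
f_mrs = np.random.rand(n_voxels, 256)
naive = np.hstack([f_mri, f_mrs])     # shape (N, 257)

# In the concatenated space the single MRI dimension contributes at most
# 1 of 257 squared-distance terms between voxels, so a distance-based
# classifier effectively ignores the imaging channel.
```
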
Thus, a significant challenge in integrating heterogeneous imaging and non-imaging biological data has been the lack of a quantifiable knowledge representation framework to reconcile cross-modal, cross-dimensional differences in feature values.
While no general theory yet exists for domain data fusion, most researchers agree that heterogeneous data needs to be represented in a way that allows direct comparison of the different channels, an important prerequisite to fusion or classification. Bruno et al. [36] recently designed a multimodal dissimilarity space for retrieval of video documents. Lanckriet et al. [37] and Lewis et al. [38] both presented kernel based frameworks for representing heterogeneous data relating to protein sequences and then used the data representation in conjunction with an SVM classifier [18] for protein structure prediction. Mandic et al. [39] recently proposed a sequential data fusion approach for combining wind measurements via the representation of directional signals within the field of complex numbers. Coppock and Mazlack [40] extended Gower's metric [41] for nominal and ordinal data integration within an agglomerative hierarchical clustering algorithm to cluster mixed data.
In spite of the challenges, data fusion at the feature level aims to retrieve the interesting characteristics of the phenomenon being studied [39]. Kernel-based formulations have been used to combine multiple related datasets (such as gene expression, protein sequence, and protein–protein interaction data) for function prediction in yeast [37] as well as for heterogeneous data fusion in the study of Alzheimer's disease [42]. However, the selection and tuning of the kernels used in multi-kernel learning (MKL) play an important role in the performance of the approach. This selection proves to be non-trivial when considering completely heterogeneous, multi-scale data such as molecular protein- and gene-expression signatures and imaging and metabolic phenotypes. Additionally, these methods typically employ the same kernel or metric across modalities for estimating object similarity. Thus, while the Euclidean kernel might be appropriate for image intensities, it might not be appropriate for all feature spaces (e.g. time series spectra or gene expression vectors) [43].
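The modality-specific kernel issue can be illustrated with a minimal numpy sketch (the data, kernel choices, and equal weights are all hypothetical): each modality gets a similarity measure suited to its feature space, and the kernels are then combined as a convex sum, the weighting being precisely what MKL must tune.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Gaussian kernel on Euclidean distances -- reasonable for intensities.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def correlation_kernel(X):
    # Pearson-correlation similarity -- often better suited to
    # gene-expression vectors than Euclidean distance.
    Xc = X - X.mean(axis=1, keepdims=True)
    Xc = Xc / np.linalg.norm(Xc, axis=1, keepdims=True)
    return Xc @ Xc.T

# Hypothetical cohort: 5 patients, an imaging feature vector and an
# expression profile per patient.
rng = np.random.default_rng(0)
imaging = rng.random((5, 3))
expression = rng.random((5, 50))

# MKL-style fusion: a convex combination of modality-specific kernels.
# The weights are fixed here for illustration; in practice they are
# learned jointly with the classifier.
weights = (0.5, 0.5)
K = weights[0] * rbf_kernel(imaging) + weights[1] * correlation_kernel(expression)
```

The fused kernel K can then be handed to any kernel classifier (e.g. an SVM) exactly as a single-modality kernel would be.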
Recently, approaches involving the use of dimensionality reduction (DR) methods for representing high dimensional data in terms of embedding vectors in a reduced dimensional space have been proposed. Applications have included the fusion of data of heterogeneous dimensionality (e.g. scalar imaging (MRI) and vectorial magnetic resonance spectroscopy (MRS) information) [44], [45], [46] by attempting to reduce the dimensionality of the higher dimensional data source to that of the lower dimensional modality via principal component analysis (PCA), independent component analysis (ICA), or a linear combination model (LCM) [47]. However, these strategies often lead to non-optimal fusion solutions due to (a) use of linear DR schemes, (b) dimensionality reduction of only the non-imaging data channel, and (c) large scaling differences between the different modalities. Yu and Tresp proposed a generalized PCA model for representing real-world image painting data [48]. Recently, manifold learning (ML) methods such as isometric mapping (Isomap) [49] and locally linear embedding (LLE) [50] have become popular for mapping high dimensional information into a low dimensional representation for the purpose of visualization or classification. While these non-linear DR (NLDR) methods enjoy advantages over traditional linear DR methods such as PCA [51] and LCM [52], in that they are able to discover non-linear relationships in the data [53], [54], they are notoriously sensitive to the choice of embedding parameters [49], [50].
Researchers have since been developing novel methods for overcoming the difficulties in obtaining an appropriate manifold representation of the data. Samko et al. [55] developed an estimator for the optimal neighborhood size for Isomap. However, in cases of varying neighborhood densities, an optimal neighborhood size may not exist on a global scale. Others have developed adaptive methods that select neighbors based on additional constraints such as local tangents [56], [57], intrinsic dimensionality [58], and geodesic distances estimated within a neighborhood [59]. These constraints aim to create a graph that does not contain spurious neighbors, but they also introduce an additional degree of freedom that the user must specify when constructing the manifold.
Along with other groups [60], [61], [62], the Rutgers Laboratory for Computational Imaging and Bioinformatics (LCIB) group has been working on developing NLDR schemes that have been shown to be more resistant to some of the failings of LLE [50] and Isomap [49]. C-Embed [54], [63], [64], [65] is a consensus NLDR scheme that combines multiple low dimensional projections of the data to obtain a more robust low dimensional data representation, one which, unlike LLE and Isomap, is not sensitive to careful selection of the neighborhood parameter (κ). These schemes [11], [63], [65], [66], [67], [68], [69] allow each of the k individual high dimensional heterogeneous modalities to be non-linearly transformed into the common format of low dimensional embedding vectors, thereby enabling direct, data-level fusion of structural, functional, metabolic, architectural, genomic, and proteomic information while overcoming the differences in scale, size, and dimensionality of the individual feature spaces. This integrated representation of multiple modalities in the transformed space can be used to train meta-classifiers for studying and predicting biological activity.
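The consensus idea behind schemes like C-Embed can be sketched in a few lines of numpy. This is a simplified stand-in, not the published algorithm: the base embeddings here come from classical MDS on random feature subsets rather than from varying the neighborhood parameter κ of LLE or Isomap, and all names are illustrative. The core pattern survives the simplification: each base embedding yields a pairwise-distance matrix, the scale-normalized matrices are averaged, and the consensus matrix is re-embedded.

```python
import numpy as np

def pairwise_dist(X):
    # Euclidean distances between all rows of X.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.sqrt(np.maximum(sq, 0.0))

def classical_mds(D, n_dims=2):
    # Embed a pairwise-distance matrix D via classical MDS.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J                 # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:n_dims]       # top eigenpairs
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

def consensus_embedding(X, n_dims=2, n_base=10, seed=0):
    # Average the scale-normalized pairwise distances over several base
    # embeddings, then re-embed the consensus distance matrix.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    acc = np.zeros((n, n))
    for _ in range(n_base):
        cols = rng.choice(d, size=max(n_dims, d // 2), replace=False)
        Y = classical_mds(pairwise_dist(X[:, cols]), n_dims)  # base embedding
        Dk = pairwise_dist(Y)
        acc += Dk / Dk.max()                    # normalize away modality scale
    return classical_mds(acc / n_base, n_dims)

# Two heterogeneous "modalities" of different dimensionality, mapped to the
# same low dimensional format and then fused directly at the data level.
rng = np.random.default_rng(1)
emb_a = consensus_embedding(rng.random((20, 10)))    # e.g. image features
emb_b = consensus_embedding(rng.random((20, 256)))   # e.g. spectral features
joint = np.hstack([emb_a, emb_b])
```

Because both modalities now live in embedding spaces of the same (small) dimensionality and scale, the concatenation in the last line is meaningful in a way that concatenating the raw 10- and 256-dimensional vectors would not be.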
While a diagnostic marker distinguishes diseased from normal tissue, a prognostic marker identifies subgroups of patients associated with different disease outcomes. With increasing early detection of diseases via improved diagnostic imaging methodologies [21], [64], [65], [69], [70], [71], [72], [73], it has become important to predict biologic behavior and disease “aggressiveness”. Clinically applicable prognostic markers are urgently needed to assist in the selection of optimal therapy. In the context of prostate cancer (PCa), well established prognostic markers include histologic grade, prostate specific antigen (PSA), margin positivity, pathologic stage, intra-glandular tumor extent, and DNA ploidy [74], [75], [76]. Other recently promising prognostic indicators include the tumor suppressor gene p53, the cell proliferation marker Ki-67, Oncoantigen 519, microsatellite instability, angiogenesis and tumor vascularity (TVC), vascular endothelial growth factor (VEGF), and E-cadherin [76], [77]. None of these factors, however, has individually proven accurate enough to serve routinely as a prognostic marker [77], [78]. The problem is that in 50% of cases [79], and by some estimates 80% [80], men with early detected PCa present a homogeneous pattern with respect to most standard prognostic variables (PSA < 10, T1c, Gleason score < 7). In this growing group of patients, the traditional markers seem to lose their efficacy and the subsequent therapy decision is complicated. Gao et al. [81] suggest that only a combination of multiple prognostic markers will prove superior to any individual marker. Graefen et al. [82] and Stephenson et al. [83], [84], [85] have suggested that better prognostic accuracy can be obtained by combining the individual markers via a machine classifier such as an artificial neural network.
Graphs are an effective means of representing the spatial arrangement of structures: they define a large set of topological features that can be quantified via computable metrics. The use of spatial-relation features for quantifying cellular arrangement was proposed in the early 1990s [86], [87], but did not find application to biomedical imagery until recently [88], [89], [90], [91], [92], [93], [94]. However, with recent evidence demonstrating that for certain classes of tumors, tumor–host interactions correlate with clinical outcome [95], graph algorithms clearly have a role to play in modeling the tumor–host network and hence in predicting disease outcome.
Table 1 lists common spatial, graph based features that one can extract from the Voronoi Diagram (VD), Delaunay Triangulation (DT), and the Minimum Spanning Tree (MST) [96], [97], [98]. Additionally, a number of features based on nuclear statistics can be similarly extracted. Using the nuclear centroids in a tissue region (Fig. 1(a)) as vertices, the DT graph (Fig. 1(b)), a unique triangulation of the centroids, and the MST (Fig. 1(c)), a graph that connects all centroids with the minimum possible total edge length, can be constructed. These features quantify important biological information, such as the proliferation and structural arrangement of the cells in the tissue, which is closely tied to cancerous activity. Our hypothesis is that the genetic descriptors that define clinically relevant classes of cancer are reflected in the visual characteristics of the cellular morphology and tissue architecture, and that these characteristics can be measured by image analysis techniques. We believe that image-based classifiers of disease developed via comprehensive analysis of quantitative image-based information present in tissue histology will have a strong correlation with gene-expression based prognostic classification.
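As a concrete illustration, the MST portion of such features can be computed directly from nuclear centroids. The sketch below (pure numpy, with toy coordinates and illustrative feature names, not the exact Table 1 definitions) builds the tree with Prim's algorithm and summarizes its edge lengths.

```python
import numpy as np

def mst_edges(points):
    # Prim's algorithm: return the edge lengths of the Euclidean minimum
    # spanning tree over a set of 2-D nuclear centroids.
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    # best[i] = distance from vertex i to the closest tree vertex so far
    best = np.linalg.norm(pts - pts[0], axis=1)
    edges = []
    for _ in range(n - 1):
        best[in_tree] = np.inf          # never re-add a tree vertex
        j = int(np.argmin(best))        # cheapest attachment to the tree
        edges.append(best[j])
        in_tree[j] = True
        best = np.minimum(best, np.linalg.norm(pts - pts[j], axis=1))
    return np.array(edges)

def mst_features(points):
    # Summary statistics of MST edge lengths -- a small subset of the
    # architectural features discussed in the text (names illustrative).
    e = mst_edges(points)
    return {"mst_mean": e.mean(), "mst_std": e.std(),
            "mst_minmax_ratio": e.min() / e.max()}

# Toy centroids standing in for an upstream nuclear segmentation step.
feats = mst_features([(0, 0), (2, 0), (1, 2), (3, 1)])
```

Intuitively, the mean edge length captures nuclear density while the standard deviation and min/max ratio capture how regularly the nuclei are spaced, which is the kind of architectural signal the text ties to cancerous activity.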
At LCIB at Rutgers University, we have been developing an array of computerized image analysis, high dimensional data analysis, and fusion tools for quantitatively integrating the molecular features of a tumor (as measured by gene expression profiling or mass spectrometry) [54], [99], its cellular architecture and microenvironment (as captured in histological imaging) [6], [9], its 3-D tissue architecture [100], and its metabolic features (as seen by metabolic or functional imaging modalities such as Magnetic Resonance Spectroscopy (MRS)) [21], [64], [65], [69], [70], [71], [72], [73]. In this paper, we briefly describe four representative and ongoing projects at LCIB in the context of predicting the outcome of breast and prostate cancer patients, involving computerized image and data analysis and the fusion of quantitative measurements from digitized histopathology and protein expression features obtained via mass spectrometry. Preliminary data pertaining to these projects is also presented.
Section snippets
Image-based risk score for ER+ breast cancers
The current gold standard for achieving a quantitative and reproducible prognosis in estrogen receptor-positive breast cancers (ER+ BC) is via the Oncotype DX (Genomic Health, Inc.) molecular assay, which produces a Recurrence Score (RS) between 0 and 100, where a high RS corresponds to a poor outcome and vice versa. In [101], we presented Image-based Risk Score (IbRiS), a novel CAP scheme that uses only quantitatively derived information (architectural features derived from spatial arrangement
Lymphocytic infiltration and outcome in HER2+ breast cancers
The identification of phenotypic changes in BC histopathology with respect to corresponding molecular changes is of significant clinical importance in predicting BC outcome. One such example is the presence of lymphocytic infiltration (LI) in BC histopathology, which has been correlated with nodal metastasis and distant recurrence in human epidermal growth factor amplified (HER2+) breast cancers.
In [103], [104], we introduced a computerized image analysis system for detecting and grading the
Automated Gleason grading on prostate cancer histopathology
PCa is diagnosed in over 200,000 people and causes 27,000 deaths in the US annually. However, the five-year survival rate for patients diagnosed at an early stage of tumor development is very high [106], [107]. If PCa is found on a needle biopsy, the tumor is then assigned a Gleason grade (1–5) [6], [9]. Gleason grade 1 tissue is highly differentiated and non-infiltrative while grade 5 is poorly differentiated and highly infiltrative. Gleason grading is predominantly based on tissue
Integrated proteomic, histological signatures for predicting prostate cancer recurrence
Following radical prostatectomy (RP), there remains a substantial risk of disease recurrence (estimated at 25–40%) [109]. Studies have identified infiltration beyond the surgical margin and high Gleason score as possible predictors of prostate cancer recurrence. However, owing to inter-observer variability in Gleason grade determination, cancers identified with the same Gleason grade could have significantly different outcomes [110]. Discovery of a predictive biomarker for outcome following RP
Concluding remarks
In this paper we briefly described some of the primary challenges in the quantitative fusion of multi-scale, multi-modal data for building prognostic meta-classifiers for predicting treatment response and patient outcome. We also described some of the ongoing efforts at the Laboratory for Computational Imaging and Bioinformatics (LCIB) at Rutgers University to address some of these computational challenges in personalized therapy and highlighted ongoing projects in computer-aided prognosis of
Acknowledgments
This work was supported by the Wallace H. Coulter Foundation, the National Cancer Institute under Grants R01CA136535, R01CA140772, R03CA143991, the Cancer Institute of New Jersey, and the Department of Defense (W81XWH-08-1-0145).
References (112)
- Pairwise probabilistic models for Markov random fields: detecting prostate cancer from digitized whole-mount histopathology. Med Image Anal (2010).
- Fast and robust registration of PET and MR images of human brain. Neuroimage (2004).
- Brain tissue segmentation based on DTI data. Neuroimage (2007).
- Image fusion of fluid-attenuated inversion recovery magnetic resonance imaging sequences for surgical image guidance. Surg Neurol (2007).
- Representation and fusion of heterogeneous fuzzy information in the 3D space for model-based structural recognition – application to 3D brain imaging. Artif Intell (2003).
- The use of multivariate MR imaging intensities versus metabolic data from MR spectroscopic imaging for brain tumour classification. J Magn Reson (2005).
- Selection of the optimal parameter value for the Isomap algorithm. Pattern Recogn Lett (2006).
- Using locally estimated geodesic distance to optimize neighborhood graph for isometric data embedding. Pattern Recogn (2008).
- Can predictive models for prostate cancer patients derived in the United States of America be utilized in European patients? A validation study of the Partin tables. Eur Urol (2003).
- Prognostic value of graph theory-based tissue architecture analysis in carcinomas of the tongue. Lab Invest (2000).
- Computer-aided prognosis: predicting patient and disease outcome via multi-modal image analysis. IEEE Int Symp Biomed Imaging (ISBI).
- Hierarchical normalized cuts: unsupervised segmentation of vascular biomarkers from ovarian cancer tissue microarrays. Med Image Comput Comput Assist Interv.
- Integrated diagnostics: a conceptual framework with examples. Clin Chem Lab Med.
- Digital pathology image analysis: opportunities and challenges. Imaging Med.
- A comprehensive multi-attribute manifold learning scheme-based computer aided diagnostic system for breast MRI.
- A boosted Bayesian multi-resolution classifier for prostate cancer detection from digitized needle biopsies. IEEE Trans Biomed Eng.
- Graph embedding to improve supervised classification: detecting prostate cancer.
- An illustration of the potential for mapping MRI/MRS parameters with genetic over-expression profiles in human prostate cancer. Magma.
- Identification of a microRNA panel for clear-cell kidney cancer. Urology.
- Towards improved cancer diagnosis and prognosis using analysis of gene expression data and computer aided imaging. Exp Biol Med (Maywood).
- Toward theragnostics. Crit Care Med.
- Wisdom of theragnostics, other changes. MLO Med Lab Obs.
- Mapping translational research in personalized therapeutics: from molecular markers to health policy. Pharmacogenomics.
- An architectural selection framework for data fusion in sensor platforms.
- Perspectives on the fusion of image and non-image data.
- Bagging predictors. Mach Learn.
- Tutorial on support vector machines for pattern recognition. Data Min Knowl Discov.
- Experiments with a new boosting algorithm. Mach Learn.
- Optimally combining 3D texture features for automated segmentation of prostatic adenocarcinoma from high resolution MR images.
- Automated detection of prostatic adenocarcinoma from high resolution ex vivo MRI. IEEE Trans Med Imaging.
- Comparing classification performance of feature ensembles: detecting prostate cancer from high resolution MRI.
- Image fusion for dynamic contrast enhanced magnetic resonance imaging. Biomed Eng Online.
- Optimized approach to decision fusion of heterogeneous data for breast cancer diagnosis. Med Phys.
- Pattern classification and scene analysis.
- An analysis of pathology knowledge and decision making for the development of artificial intelligence-based consulting systems. Anal Quant Cytol Histol.
- Content based image retrieval for MR image studies of brain tumors. Conf Proc IEEE Eng Med Biol Soc.
- Multiclassifier fusion in human brain MR segmentation: modelling convergence. Med Image Comput Comput Assist Interv Int Conf.
- Combined X-ray and magnetic resonance imaging facility: application to image-guided stereotactic and functional neurosurgery. Neurosurgery.
- Information fusion in biomedical image analysis: combination of data vs. combination of interpretations. Inf Process Med Imaging.
- PET and brain tumor image fusion. Cancer J.
- Design of multimodal dissimilarity spaces for retrieval of video documents. IEEE Trans Pattern Anal Mach Intell.
- Kernel-based data fusion and its application to protein function prediction in yeast. Pac Symp Biocomput.
- Support vector machine learning from heterogeneous data: an empirical analysis using protein sequence and structure. Struct Bioinform.
- Sequential data fusion via vector spaces: fusion of heterogeneous data in the complex domain. J VLSI Signal Process.
- Functional topography: multidimensional scaling and functional connectivity in the brain. Cereb Cortex.
- Heterogeneous data fusion for Alzheimer's disease study.
- Evaluating distance functions for clustering tandem repeats. Genome Inform.
- Combination of feature-reduced MR spectroscopic and MR imaging data for improved brain tumor classification. NMR Biomed.
- A chemometric approach for brain tumor classification using magnetic resonance imaging and spectroscopy. Anal Chem.