Survival analysis of microarray expression data by transformation models
Background
A major goal of many microarray studies is to relate gene expression to biological conditions such as disease status, tumor type, drug treatment effects and time to events of interest. Survival analysis is concerned with the relationship between covariates and the time to events of interest. Because the event times or survival times available for the biological samples in microarray studies are often incomplete or censored, the data require more sophisticated methods of analysis. For example,
Transformation models
Suppose T is the survival time, which is possibly censored. The Cox proportional hazards (PH) model is widely used in survival analysis. It assumes that the hazard rate at time t, given covariate z, has the following form:
$$\lambda(t \mid z) = \lambda_0(t)\exp(\beta^{\top} z),$$
where $\lambda_0(t)$ is an unspecified baseline hazard function and $\beta$ is the vector of regression coefficients. This form implies that the hazard rate changes proportionally when the covariate z or any of its components changes. However, the PH assumption may be violated in practice. As an illustration, the Cox–Snell residual plot for proportional hazards model fitting is shown in Fig. 1 for
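To make the proportionality assumption concrete, the following minimal Python sketch compares hazard ratios over time under two models. Under the PH form, hazard = baseline hazard × exp(β'z), the ratio between two covariate values is the same at every time point. Under a proportional odds model, which is commonly cited as another member of the linear transformation model family (an assumption here, since this excerpt does not define the family), the ratio drifts toward 1 as time increases. The Weibull baseline hazard, the baseline odds function O0(t) = t, the coefficient value 0.7 and all function names are hypothetical choices made only for illustration; they are not taken from the paper.

```python
import math


def linear_predictor(beta, z):
    """Inner product beta'z for plain Python lists."""
    return sum(b * x for b, x in zip(beta, z))


def weibull_baseline_hazard(t, shape=1.5, scale=2.0):
    # Illustrative baseline hazard (assumption, not from the paper).
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def cox_hazard(t, z, beta, baseline_hazard):
    """Hazard under the Cox PH model: lambda0(t) * exp(beta'z)."""
    return baseline_hazard(t) * math.exp(linear_predictor(beta, z))


def proportional_odds_hazard(t, z, beta):
    """Hazard under a proportional odds model.

    Assumes the failure odds are O0(t) * exp(beta'z) with the illustrative
    baseline odds O0(t) = t, so S(t | z) = 1 / (1 + t * exp(beta'z)) and the
    hazard is exp(beta'z) / (1 + t * exp(beta'z)).
    """
    risk = math.exp(linear_predictor(beta, z))
    return risk / (1.0 + t * risk)


def hazard_ratios(hazard, times, z_high, z_low):
    """Ratio hazard(t | z_high) / hazard(t | z_low) at each time point."""
    return [hazard(t, z_high) / hazard(t, z_low) for t in times]


beta = [0.7]                  # hypothetical regression coefficient
times = [0.5, 1.0, 2.0, 4.0]  # arbitrary evaluation times

ph_ratios = hazard_ratios(
    lambda t, z: cox_hazard(t, z, beta, weibull_baseline_hazard),
    times, [1.0], [0.0],
)
po_ratios = hazard_ratios(
    lambda t, z: proportional_odds_hazard(t, z, beta),
    times, [1.0], [0.0],
)

print("PH hazard ratios:", [round(r, 4) for r in ph_ratios])
print("PO hazard ratios:", [round(r, 4) for r in po_ratios])
```

A time-varying hazard ratio of the second kind is one way the PH assumption can fail, which is the situation a residual diagnostic such as the Cox–Snell plot is meant to reveal.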
Example: methods and results
We illustrate the benefits of transformation models in analyzing microarray data using the lung cancer dataset from Beer et al. (2002). The data consist of expression levels of 4966 genes for 83 patients (patients with missing values are excluded). The patients were classified according to the progression of the disease: sixty-four patients were classified as stage I and nineteen as stage III. For each of the 83 patients, the survival time as well as the censoring
Discussion
We have applied transformation regression models to microarray survival data. Our preliminary exploration shows that microarray data may not satisfy the proportional hazards assumption. Therefore, proportional hazards modelling approaches may not be valid for analyzing such microarray data. Our results indicate that analyses based on transformation models have better prediction capabilities than those based on the Cox proportional hazards model alone for the microarray dataset we analyzed.
Acknowledgements
The second author is sponsored by NIH grant HG00008 and by China Bairen funding. We gratefully acknowledge the encouragement and generous help of Dr. Zhiliang Ying and Dr. Zhezhen Jin. We also thank the anonymous referees for their helpful comments.
References (13)
- et al. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat. Med. (2002)
- et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B (1995)
- et al. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc. Natl. Acad. Sci. U.S.A. (2001)
- et al. An analysis of transformations (with discussions). J. Roy. Statist. Soc. Ser. B (1964)
- et al. Semiparametric analysis of transformation models with censored data. Biometrika (2002)
- et al. Analysis of transformation models with censored data. Biometrika (1995)