Abstract
Advances in biotechnology have made data collection cheaper and of higher quality, so investigators can often measure many disease-related features simultaneously. When multiple potential biomarkers are available for constructing a diagnostic tool for a disease, an effective approach is to combine them into a single indicator. For continuous variables, linear combinations are popular because they are easy to interpret. Su and Liu (J Am Stat Assoc 88(424):1350–1355, 1993) derived the linear combination that maximizes the area under the receiver operating characteristic (ROC) curve when the biomarkers are assumed to be jointly normal. In many investigations, however, only a limited, clinically relevant portion of the ROC curve is of interest rather than the whole curve. The goal of this study is to find the linear combination that maximizes the partial area under the ROC curve (pAUC) over a pre-specified range. To seek an analytic solution, we derive the first derivative of the pAUC under the normality assumption. Its explicit form is so complicated that further validation through the Hessian matrix is difficult. Moreover, we find that the pAUC maximizer may not be unique and that local maximizers exist in some cases. Consequently, existing algorithms, whose solutions depend on the initial point, are inadequate for our needs. We therefore propose a new algorithm that uses several initial points simultaneously. Intensive numerical studies show the adequacy of the proposed algorithm, and real examples are provided for illustration.
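The idea of searching from several initial points can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details the abstract does not give. It assumes that the pre-specified range is an interval [t0, t1] of false-positive rates, that the diseased and healthy populations are multivariate normal with known means and covariances, and that the initial points are random directions on the unit sphere plus the Su and Liu direction (the latter is an addition made here for illustration). All function names, parameter values, the quadrature rule, and the choice of the Nelder-Mead optimizer are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Gauss-Legendre nodes and weights on [-1, 1]; the nodes are interior points,
# so norm.ppf is never evaluated at exactly 0 or 1.
_NODES, _WEIGHTS = np.polynomial.legendre.leggauss(200)


def binormal_pauc(a, mu_d, mu_h, sig_d, sig_h, t0, t1):
    """pAUC of the score a'X for false-positive rates in [t0, t1] (assumed range)."""
    a = np.asarray(a, dtype=float)
    a = a / np.linalg.norm(a)          # pAUC is unchanged by positive rescaling of a
    shift = a @ (mu_d - mu_h)          # mean difference of the combined score
    sd_d = np.sqrt(a @ sig_d @ a)      # score standard deviation, diseased group
    sd_h = np.sqrt(a @ sig_h @ a)      # score standard deviation, healthy group
    t = 0.5 * (t1 - t0) * _NODES + 0.5 * (t1 + t0)   # map nodes to [t0, t1]
    roc = norm.cdf((shift + sd_h * norm.ppf(t)) / sd_d)  # binormal ROC curve
    return 0.5 * (t1 - t0) * np.sum(_WEIGHTS * roc)


def su_liu_direction(mu_d, mu_h, sig_d, sig_h):
    """Full-AUC maximizer under normality (Su and Liu 1993), up to positive scale."""
    return np.linalg.solve(sig_d + sig_h, mu_d - mu_h)


def multistart_pauc(mu_d, mu_h, sig_d, sig_h, t0, t1, n_starts=20, seed=0):
    """Run a local optimizer from several initial directions and keep the best."""
    rng = np.random.default_rng(seed)
    # Normalized Gaussian vectors are uniformly distributed on the unit sphere.
    starts = rng.standard_normal((n_starts, len(mu_d)))
    starts /= np.linalg.norm(starts, axis=1, keepdims=True)
    # Illustrative extra start: the full-AUC maximizer.
    starts = np.vstack([starts, su_liu_direction(mu_d, mu_h, sig_d, sig_h)])

    def neg_pauc(a):
        return -binormal_pauc(a, mu_d, mu_h, sig_d, sig_h, t0, t1)

    best_a, best_val = None, -np.inf
    for a0 in starts:
        res = minimize(neg_pauc, a0, method="Nelder-Mead")
        if -res.fun > best_val:
            best_a, best_val = res.x / np.linalg.norm(res.x), -res.fun
    return best_a, best_val


# Hypothetical two-marker setting, chosen only for illustration.
mu_h = np.zeros(2)
mu_d = np.array([1.0, 0.5])
sig_h = np.eye(2)
sig_d = np.array([[1.0, 0.3], [0.3, 2.0]])

best_a, best_val = multistart_pauc(mu_d, mu_h, sig_d, sig_h, t0=0.0, t1=0.1)
```

Because each start can converge to a different local maximizer, comparing the values reached from all starts is what guards against an initial-point dependent answer; the number of starts needed in practice is problem dependent and is not addressed by this sketch.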
References
Bamber D (1975) The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J Math Psychol 12(4): 387–415
Bast R Jr (1993) Perspectives on the future of cancer markers. Clin Chem 39(11): 2444–2451
Dodd L, Pepe M (2003) Partial AUC estimation and regression. Biometrics 59(3): 614–623
Elkin PL, Froehling DA, Wahner-Roedler DL, Brown SH, Bailey KR (2012) Comparison of natural language processing biosurveillance methods for identifying influenza from encounter notes. Ann Intern Med 156(1): 11–18
Estrela da Silva J, Marques de Sá JP, Jossinet J (2000) Classification of breast tissue by electrical impedance spectroscopy. Med Biol Eng Comput 38(1): 26–30
He Y, Escobar M (2008) Nonparametric statistical inference method for partial areas under receiver operating characteristic curves, with application to genomic studies. Stat Med 27(25): 5291–5308
Komori O, Eguchi S (2010) A boosting method for maximizing the partial area under the ROC curve. BMC Bioinform 11: 314. doi:10.1186/1471-2105-11-314
Li C-R, Liao C-T, Liu J-P (2008) A non-inferiority test for diagnostic accuracy based on the paired partial areas under ROC curves. Stat Med 27(10): 1762–1776
Lionberger RA, Raw AS, Kim SH, Zhang X, Yu LX (2012) Use of partial AUC to demonstrate bioequivalence of zolpidem tartrate extended release formulations. Pharm Res. doi:10.1007/s11095-011-0662-8
Liu Z, Hyslop T (2010) Partial AUC for differentiated gene detection. In: Proceedings of the 2010 IEEE international conference on bioinformatics and bioengineering (BIBE ’10). IEEE Computer Society, Washington, DC, USA, pp 310–311
Liu A, Schisterman E, Zhu Y (2005) On linear combinations of biomarkers to improve diagnostic accuracy. Stat Med 24(1): 37–47
Marsaglia G (1972) Choosing a point from the surface of a sphere. Ann Math Stat 43(2): 645–646
Marshall R (1989) The predictive value of simple rules for combining two diagnostic tests. Biometrics 45(4): 1213–1222
McClish D (1989) Analyzing a portion of the ROC curve. Med Decis Making 9(3): 190–195
Muller M (1959) A note on a method for generating points uniformly on n-dimensional spheres. Commun ACM 2(4): 19–20
Pepe M, Thompson M (2000) Combining diagnostic test results to increase accuracy. Biostatistics 1(2): 123–140
Pepe M, Longton G, Anderson GL, Schummer M (2003) Selecting differentially expressed genes from microarray experiments. Biometrics 59(1): 133–142
Ricamato MT, Tortorella F (2011) Partial AUC maximization in a linear combination of dichotomizers. Pattern Recognit. http://dx.doi.org/10.1016/j.patcog.2011.03.022
Su J, Liu J (1993) Linear combinations of multiple diagnostic markers. J Am Stat Assoc 88(424): 1350–1355
Thompson M, Zucchini W (1989) On the statistical analysis of ROC curves. Stat Med 8(10): 1277–1290
Tian L (2010) Confidence interval estimation of partial area under curve based on combined biomarkers. Comput Stat Data Anal 54(2): 466–472
Wang Z, Chang Y-CI (2011) Marker selection via maximizing the partial area under the ROC curve of linear risk scores. Biostatistics 12(2): 369–385
Weng CG, Poon J (2008) A new evaluation measure for imbalanced datasets. In: Roddick JF, Li J, Christen P, Kennedy PJ (eds) Proceedings of the seventh australasian data mining conference (AusDM 2008), Glenelg, South Australia. CRPIT, 87. ACS. pp 27–32
Woolas R, Conaway M, Xu F, Jacobs I, Yu Y, Daly L, Davies A, O’Briant K, Berchuck A, Soper JT, Clarke-Pearson DL, Rodriguez G, Oram DH, Bast RC Jr (1995) Combinations of multiple serum markers are superior to individual assays for discriminating malignant from benign pelvic masses. Gynecol Oncol 59(1): 111–116
Cite this article
Hsu, MJ., Hsueh, HM. The linear combinations of biomarkers which maximize the partial area under the ROC curves. Comput Stat 28, 647–666 (2013). https://doi.org/10.1007/s00180-012-0321-5
DOI: https://doi.org/10.1007/s00180-012-0321-5