The linear combinations of biomarkers which maximize the partial area under the ROC curves

  • Original Paper
  • Computational Statistics

Abstract

Recent progress in biotechnology has made data collection cheaper and of higher quality, and investigators can now often measure many disease-related features simultaneously. When multiple potential biomarkers are available for constructing a diagnostic tool for a disease, an effective approach is to combine them into a single indicator. For continuous-scaled variables, linear combinations are popular because they are easy to interpret. Su and Liu (J Am Stat Assoc 88(424):1350–1355, 1993) derived the best linear combination under the criterion of the area under the receiver operating characteristic (ROC) curve, assuming joint normality of the biomarkers. In many investigations, however, only a limited, clinically relevant portion of the ROC curve is of interest rather than the whole curve. The goal of this study is to find the linear combination that maximizes the partial area under the ROC curve (pAUC) over a pre-specified range. To seek an analytic solution, we derive the first derivative of the pAUC under the normality assumption. Its explicit form is so complicated that further validation through the Hessian matrix is difficult. Moreover, we find that the pAUC maximizer may not be unique and that local maximizers exist in some cases. Consequently, existing algorithms return solutions that depend on the initial point and are inadequate for our needs. We therefore propose a new algorithm that uses several initial points at once. Extensive numerical studies demonstrate the adequacy of the proposed algorithm, and real examples are provided for illustration.
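
As a rough illustration of the multiple-initial-point idea, the following Python sketch evaluates the pAUC of a linear score under a binormal model and runs a simple local search from several random starting directions. It is not the authors' algorithm. The local search (random-perturbation hill climbing), the midpoint-rule integration, the function names (`make_pauc`, `hill_climb`, `multi_start`) and all parameter values are assumptions made for this example. Because the pAUC is unchanged when the coefficient vector is multiplied by a positive constant, the search is restricted to the unit sphere.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist()  # standard normal: PHI.cdf and PHI.inv_cdf


def make_pauc(mu0, S0, mu1, S1, t0, n_grid=100):
    """Return f(a) = pAUC of the score a'x over false-positive rates in (0, t0).

    Binormal model (an assumption of this sketch): non-diseased ~ N(mu0, S0),
    diseased ~ N(mu1, S1), and larger scores indicate disease, so that
    ROC(t) = Phi((a'(mu1 - mu0) + sd0 * Phi^{-1}(t)) / sd1).
    """
    h = t0 / n_grid
    # Midpoint-rule nodes avoid t = 0, where Phi^{-1} is undefined.
    z = [PHI.inv_cdf((k + 0.5) * h) for k in range(n_grid)]
    diff = [m1 - m0 for m0, m1 in zip(mu0, mu1)]
    p = len(diff)

    def quad_form(a, S):
        return sum(a[i] * S[i][j] * a[j] for i in range(p) for j in range(p))

    def pauc(a):
        delta = sum(ai * di for ai, di in zip(a, diff))
        sd0 = math.sqrt(quad_form(a, S0))
        sd1 = math.sqrt(quad_form(a, S1))
        return h * sum(PHI.cdf((delta + sd0 * zk) / sd1) for zk in z)

    return pauc


def normalize(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def hill_climb(f, a, rng, n_iter=200, step=0.5):
    """Generic local search: perturb a, re-project to the sphere, keep improvements."""
    best = f(a)
    for _ in range(n_iter):
        cand = normalize([x + step * rng.gauss(0.0, 1.0) for x in a])
        val = f(cand)
        if val > best:
            a, best, step = cand, val, step * 1.2  # success: try larger moves
        else:
            step *= 0.9  # failure: shrink the perturbation
    return a, best


def multi_start(f, p, n_starts=8, seed=0):
    """Run the local search from several random unit vectors.

    Returns the best (direction, value) pair and the list of all runs, so that
    runs ending at different local maximizers can be inspected.
    """
    rng = random.Random(seed)
    runs = []
    for _ in range(n_starts):
        # A normalized Gaussian vector is uniformly distributed on the sphere.
        start = normalize([rng.gauss(0.0, 1.0) for _ in range(p)])
        runs.append(hill_climb(f, start, rng))
    return max(runs, key=lambda r: r[1]), runs


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    f = make_pauc([0.0, 0.0], identity, [1.0, 1.0], identity, t0=0.2)
    (a_best, value), runs = multi_start(f, p=2)
    print("best direction:", a_best, "pAUC:", value)
    print("values from all starts:", [round(v, 4) for _, v in runs])
```

With equal covariance matrices, as in the usage example, the ROC curve at every false-positive rate increases with the standardized mean difference, so all starts should end near the same direction. Comparing the values returned from the different starts is a simple way to notice cases where they disagree, which would point to local maximizers.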

References

  • Bamber D (1975) The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J Math Psychol 12(4): 387–415

  • Bast R Jr (1993) Perspectives on the future of cancer markers. Clin Chem 39(11): 2444–2451

  • Dodd L, Pepe M (2003) Partial AUC estimation and regression. Biometrics 59(3): 614–623

  • Elkin PL, Froehling DA, Wahner-Roedler DL, Brown SH, Bailey KR (2012) Comparison of natural language processing biosurveillance methods for identifying influenza from encounter notes. Ann Intern Med 156(1): 11–18

  • Estrela da Silva J, Marques de Sá JP, Jossinet J (2000) Classification of breast tissue by electrical impedance spectroscopy. Med Biolog Eng Comput 38(1): 26–30

  • He Y, Escobar M (2008) Nonparametric statistical inference method for partial areas under receiver operating characteristic curves, with application to genomic studies. Stat Med 27(25): 5291–5308

  • Komori O, Eguchi S (2010) A boosting method for maximizing the partial area under the ROC curve. BMC Bioinform 11: 314. doi:10.1186/1471-2105-11-314

  • Li C-R, Liao C-T, Liu J-P (2008) A non-inferiority test for diagnostic accuracy based on the paired partial areas under ROC curves. Stat Med 27(10): 1762–1776

  • Lionberger RA, Raw AS, Kim SH, Zhang X, Yu LX (2012) Use of partial AUC to demonstrate bioequivalence of zolpidem tartrate extended release formulations. Pharma Res. doi:10.1007/s11095-011-0662-8

  • Liu Z, Hyslop T (2010) Partial AUC for differentiated gene detection. In: Proceedings of the 2010 IEEE international conference on bioinformatics and bioengineering (BIBE ’10). IEEE Computer Society, Washington, DC, USA, pp 310–311

  • Liu A, Schisterman E, Zhu Y (2005) On linear combinations of biomarkers to improve diagnostic accuracy. Stat Med 24(1): 37–47

  • Marsaglia G (1972) Choosing a point from the surface of a sphere. Ann Math Stat 43(2): 645–646

  • Marshall R (1989) The predictive value of simple rules for combining two diagnostic tests. Biometrics 45(4): 1213–1222

  • McClish D (1989) Analyzing a portion of the ROC curve. Med Decis Making 9(3): 190–195

  • Muller M (1959) A note on a method for generating points uniformly on n-dimensional spheres. Commun ACM 2(4): 19–20

  • Pepe M, Thompson M (2000) Combining diagnostic test results to increase accuracy. Biostatistics 1(2): 123–140

  • Pepe M, Longton G, Anderson GL, Schummer M (2003) Selecting differentially expressed genes from microarray experiments. Biometrics 59(1): 133–142

  • Ricamato MT, Tortorella F (2011) Partial AUC maximization in a linear combination of dichotomizers. Pattern Recognit. doi:10.1016/j.patcog.2011.03.022

  • Su J, Liu J (1993) Linear combinations of multiple diagnostic markers. J Am Stat Assoc 88(424): 1350–1355

  • Thompson M, Zucchini W (1989) On the statistical analysis of ROC curves. Stat Med 8(10): 1277–1290

  • Tian L (2010) Confidence interval estimation of partial area under curve based on combined biomarkers. Comput Stat Data Anal 54(2): 466–472

  • Wang Z, Chang Y-CI (2011) Marker selection via maximizing the partial area under the ROC curve of linear risk scores. Biostatistics 12(2): 369–385

  • Weng CG, Poon J (2008) A new evaluation measure for imbalanced datasets. In: Roddick JF, Li J, Christen P, Kennedy PJ (eds) Proceedings of the seventh australasian data mining conference (AusDM 2008), Glenelg, South Australia. CRPIT, 87. ACS. pp 27–32

  • Woolas R, Conaway M, Xu F, Jacobs I, Yu Y, Daly L, Davies A, O’Briant K, Berchuck A, Soper JT, Clarke-Pearson DL, Rodriguez G, Oram DH, Bast RC Jr. (1995) Combinations of multiple serum markers are superior to individual assays for discriminating malignant from benign pelvic masses. Gynecol Oncol 59(1): 111–116

Author information

Correspondence to Huey-Miin Hsueh.

About this article

Cite this article

Hsu, MJ., Hsueh, HM. The linear combinations of biomarkers which maximize the partial area under the ROC curves. Comput Stat 28, 647–666 (2013). https://doi.org/10.1007/s00180-012-0321-5

  • DOI: https://doi.org/10.1007/s00180-012-0321-5
