Assessing the diagnostic power of variables measured with a detection limit

Jia, Bochao; Chang, Yuan-chin Ivan; Wang, Zhanfeng

doi:10.1007/s00180-015-0628-0

Original Paper
Published: 23 November 2015

Volume 31, pages 1287–1303, (2016)
Computational Statistics

Bochao Jia¹,², Yuan-chin Ivan Chang³ & Zhanfeng Wang¹

Abstract

Limits of detection (LoD) arise in many practical situations because of technical and instrumental limitations. Reports in the literature show that conventional methods for evaluating the diagnostic power of variables can be seriously biased when the LoD is ignored. The area under the receiver operating characteristic (ROC) curve (AUC) can be estimated consistently if the distributions of the variables are known, but in practice such information is usually not available. Moreover, it has been proved that, without distributional assumptions, the estimated AUC of a variable with a LoD is usually biased no matter which replacement strategy is used. Similar studies of the partial area under the ROC curve (pAUC) are lacking, and because this measure is often preferred in practice, it is of interest to examine whether the estimate of the pAUC of a variable measured with a LoD behaves in the same way. In this study, we find that for some LoD scenarios a consistent estimate of the pAUC can be constructed even without distributional assumptions. When a consistent estimate cannot be obtained, the bias can be negligible in practical situations, and the proposed estimator can be a good approximation of the pAUC. Numerical studies using simulated data sets and real data examples are reported.

Acknowledgments

The authors are grateful to the Editor, the Associate Editor, and the anonymous referees for comments and suggestions that led to improvements in the paper. Wang’s work is supported by the State Key Program of the National Natural Science Foundation of China (No. 11231010) and the National Natural Science Foundation of China (No. 11471302).

Author information

Authors and Affiliations

Department of Statistics and Finance, University of Science and Technology of China, Hefei, 230026, China
Bochao Jia & Zhanfeng Wang
Department of Biostatistics, University of Florida, Gainesville, FL, 32610, USA
Bochao Jia
Institute of Statistical Science, Academia Sinica, Taipei, 11529, Taiwan
Yuan-chin Ivan Chang

Corresponding author

Correspondence to Zhanfeng Wang.

Appendix

Proof of Theorem 2.1

From (2) and (5), we know that

$$\begin{aligned} pAUC(u)= & {} P\left( Y\ge X, X\ge S_{\bar{D}}^{-1}(u)\right) =P\left( Y\ge X,X\ge c\right) \\ pAUC^*(u)= & {} P\left( \tilde{Y}\ge \tilde{X}, \tilde{X} \ge S_{\bar{D*}}^{-1}(u)\right) =P\left( \tilde{Y}\ge \tilde{X}, \tilde{X}\ge c^*\right) , \end{aligned}$$

where $c=S_{\bar{D}}^{-1}(u)$ and $c^*=S_{\bar{D*}}^{-1}(u)$ are the $(1-u)$ quantiles of $X$ and $\tilde{X}$, respectively. By the law of total probability, we have

$$\begin{aligned} pAUC(u)= & {} P\left( Y\ge X,X\ge c|Y\ge d,X\ge d\right) P\left( Y\ge d,X\ge d\right) \nonumber \\&+\,P\left( Y\ge X,X\ge c|Y\ge d,X\le d\right) P\left( Y\ge d,X\le d\right) \nonumber \\&+\,P\left( Y\ge X,X\ge c|Y\le d,X\ge d\right) P\left( Y\le d,X\ge d\right) \nonumber \\&+\,P\left( Y\ge X,X\ge c|Y\le d,X\le d\right) P\left( Y\le d,X\le d\right) \nonumber \\:= & {} {I}+{J}+{K}+{L}, \end{aligned}$$

(8)

$$\begin{aligned} {pAUC^*(u)}= & {} P\left( Y\ge X,X\ge c^*|Y\ge d,X\ge d\right) P\left( Y\ge d,X\ge d\right) \nonumber \\&+\,P\left( Y\ge \widetilde{X},\widetilde{X}\ge c^*|Y\ge d,X\le d\right) P\left( Y\ge d,X\le d\right) \nonumber \\&+\,P\left( \widetilde{Y}\ge X,X\ge c^*|Y\le d,X\ge d\right) P\left( Y\le d,X\ge d\right) \nonumber \\&+\,P\left( \widetilde{Y}\ge \widetilde{X},\widetilde{X}\ge c^*|Y\le d,X\le d\right) P\left( Y\le d,X\le d\right) \nonumber \\:= & {} I^*+J^*+K^*+L^*. \end{aligned}$$

(9)

By the definitions of $\widetilde{X}$ and $\widetilde{Y}$, we have $\widetilde{X}=r$ when $X<d$ and $\widetilde{Y}=r$ when $Y<d$. The components of (9) can then be rewritten as

$$\begin{aligned}&J^*=P\left( Y\ge r,r\ge c^*|Y\ge d,X\le d\right) P\left( Y\ge d,X\le d\right) ,\nonumber \\&K^*=P\left( X\le r,X\ge c^*|Y\le d,X\ge d\right) P\left( Y\le d,X\ge d\right) ,\nonumber \\&L^*=P\left( \widetilde{Y}\ge \widetilde{X},r\ge c^*|Y\le d,X\le d\right) P\left( Y\le d,X\le d\right) . \end{aligned}$$

(10)

(1) When $S_{\bar{D}}(d)>u$ and $S_{\bar{D*}}(d) > u$, neither $c=S_{\bar{D}}^{-1}(u)$ nor $c^*=S_{\bar{D*}}^{-1}(u)$ is smaller than the lower bound $d$. It then follows from the definition of a quantile and the continuity of $X$ that $c=c^*\ge d$. Thus, by the continuity of $X$ and $Y$ and since $r\le d$, we have $I={I^*}$, $J={J^*}=0$, $K={K^*}=0$, $L={L^*}=0$, which implies $pAUC^*(u)={pAUC(u)}$.
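
To spell out why the cross terms vanish in case (1), the following worked step may help. It assumes the strict inequality $c^*>d$, which follows from $S_{\bar{D*}}(d)>u$ when the survival function is continuous; this is a reading of the argument rather than a statement taken from the paper.

$$\begin{aligned} J= & {} P\left( Y\ge X,\,X\ge c,\,Y\ge d,\,X\le d\right) \le P\left( c\le X\le d\right) =0,\\ J^*= & {} P\left( Y\ge r,\,r\ge c^*,\,Y\ge d,\,X\le d\right) =0 \quad \text{because } r\le d<c^*. \end{aligned}$$

The first line uses $c\ge d$ and the continuity of $X$, so that the event $c\le X\le d$ has probability zero. The same reasoning applies to $K$, $K^*$, $L$ and $L^*$.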

(2) When $S_{\bar{D}}(d)>u$ and $S_{\bar{D*}}(d)<u$, the inequality $c>d>c^*$ holds. Thus, $I^*>I$, $J=0$, ${J^*}\ge 0$, $K^*=K=0$, $L=0$, ${L^*}\ge 0$. Hence, we get $pAUC^*(u)-{pAUC(u)}>0$. Similarly, if $S_{\bar{D}}(d)<u$ and $S_{\bar{D*}}(d)>u$, then $c<d<c^*$ and $pAUC^*(u)-{pAUC(u)}<0$. In conclusion, in case (2), $|pAUC^*(u)-{pAUC(u)}|>0$.

(3) When $S_{\bar{D}}(d)<u$ and $S_{\bar{D*}}(d)<u$, we have $c<d$ and $c^*<d$. This case can be split into two parts: (i) $c^*\le c<d$ and (ii) $c<c^*<d$.

If (i) holds, then $I=I^*,~J<J^*,~K=K^*=0,~L<L^*$, which indicates that $pAUC^*(u)-{pAUC(u)}>0$. If (ii) holds, then $I={I^*},~J\le J^*,~K={K^*}=0$ and $L<{L^*}$, so again $pAUC^*(u)-{pAUC(u)}>0$. Hence, the conclusion holds. $\square $
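
The case analysis above can be checked numerically on simulated data. The sketch below is only an illustration under stated assumptions: the helper names `empirical_pauc` and `censor_below`, the normal distributions, and the values of $d$, $r$ and $u$ are all hypothetical choices, not taken from the paper. It uses the plug-in version of $P(Y\ge X, X\ge c)$ with $c$ the empirical $(1-u)$ quantile of the non-diseased sample.

```python
import numpy as np

def empirical_pauc(x, y, u):
    """Plug-in estimate of pAUC(u) = P(Y >= X, X >= c), c the (1-u) quantile of X."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c = np.quantile(x, 1.0 - u)
    kept = x[x >= c]  # non-diseased values at or above the threshold
    hits = np.sum(y[:, None] >= kept[None, :])
    return hits / (x.size * y.size)

def censor_below(v, d, r):
    """Replace values below the lower detection limit d by r (with r <= d)."""
    return np.where(v < d, r, v)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 2000)  # non-diseased sample (assumed distribution)
y = rng.normal(1.0, 1.0, 2000)  # diseased sample (assumed distribution)
d, r = -0.5, -1.0               # assumed detection limit and replacement value

xt, yt = censor_below(x, d, r), censor_below(y, d, r)

# Case (1): the threshold lies above d, so replacement leaves the estimate unchanged.
u_small = 0.1
unchanged = empirical_pauc(x, y, u_small) == empirical_pauc(xt, yt, u_small)

# Case (3): the threshold lies below d, so the replaced data overestimate pAUC.
u_large = 0.9
gap = empirical_pauc(xt, yt, u_large) - empirical_pauc(x, y, u_large)

print(unchanged, gap > 0)
```

With these settings roughly 69% of the non-diseased values exceed $d$, so $u=0.1$ falls under case (1) and $u=0.9$ under case (3).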

Proof of Theorem 2.2

As in the proof of Theorem 2.1, we have

$$\begin{aligned} pAUC(u)= & {} P\left( Y\ge X, X\ge S_{\bar{D}}^{-1}(u)\right) =P\left( Y\ge X,X\ge c\right) \\ pAUC^*(u)= & {} P\left( \tilde{Y}\ge \tilde{X}, \tilde{X} \ge S_{\bar{D*}}^{-1}(u)\right) =P\left( \tilde{Y}\ge \tilde{X}, \tilde{X}\ge c^*\right) , \end{aligned}$$

where $c=S_{\bar{D}}^{-1}(u)$ and $c^*=S_{\bar{D*}}^{-1}(u)$ are the $(1-u)$ quantiles of $X$ and $\tilde{X}$, respectively. By the law of total probability, we have

$$\begin{aligned} pAUC(u)= & {} P\left( Y\ge X,X\ge c|Y\ge d,X\ge d\right) P\left( Y\ge d,X\ge d\right) \nonumber \\&+\,P\left( Y\ge X,X\ge c|Y\ge d,X\le d\right) P\left( Y\ge d,X\le d\right) \nonumber \\&+\,P\left( Y\ge X,X\ge c|Y\le d,X\ge d\right) P\left( Y\le d,X\ge d\right) \nonumber \\&+\,P\left( Y\ge X,X\ge c|Y\le d,X\le d\right) P\left( Y\le d,X\le d\right) \nonumber \\:= & {} {I}+{J}+{K}+{L}, \end{aligned}$$

(11)

$$\begin{aligned} {pAUC^*(u)}= & {} P\left( \widetilde{Y}\ge \widetilde{X},\widetilde{X}\ge c^*|Y\ge d,X\ge d\right) P\left( Y\ge d,X\ge d\right) \nonumber \\&+\,P\left( \widetilde{Y}\ge X,X\ge c^*|Y\ge d,X\le d\right) P\left( Y\ge d,X\le d\right) \nonumber \\&+\,P\left( Y\ge \widetilde{X},\widetilde{X}\ge c^*|Y\le d,X\ge d\right) P\left( Y\le d,X\ge d\right) \nonumber \\&+\,P\left( Y\ge X, X\ge c^*|Y\le d,X\le d\right) P\left( Y\le d,X\le d\right) \nonumber \\:= & {} I^*+J^*+K^*+L^*. \end{aligned}$$

(12)

By the definitions of $\widetilde{X}$ and $\widetilde{Y}$, we have $\widetilde{X}=r$ when $X>d$ and $\widetilde{Y}=r$ when $Y>d$. The components of (12) can then be rewritten as

$$\begin{aligned}&I^*=P\left( r\ge r,r\ge c^*|Y\ge d,X\ge d\right) P\left( Y\ge d,X\ge d\right) ,\\&J^*=P\left( r\ge X,X\ge c^*|Y\ge d,X\le d\right) P\left( Y\ge d,X\le d\right) ,\\&K^*=P\left( Y\ge r,r\ge c^*|Y\le d,X\ge d\right) P\left( Y\le d,X\ge d\right) . \end{aligned}$$

(1) When $S_{\bar{D}}(d)<u$, $c=S_{\bar{D}}^{-1}(u)$ is smaller than $d$, and hence $c^*=c$. Therefore, ${I^*}=P(Y\ge d,X\ge d)$, ${J^*}=P(X\ge c|Y\ge d,X\le d)P(Y\ge d,X\le d)$, ${K^*}=0$ and ${L^*}=P(Y\ge X, X\ge c|Y\le d,X\le d)P(Y\le d,X\le d)$, which shows that ${pAUC^*(u)}$ is a constant that does not depend on the choice of the replacement value $r$. Moreover, by Eq. (11), ${pAUC^*(u)}-pAUC(u)= P(Y\ge d,X\ge d)-P(Y\ge X,X\ge c|Y\ge d,X\ge d)P(Y\ge d,X\ge d) =P(Y<X,Y\ge d,X\ge d)$.

(2) When $S_{\bar{D}}(d)>u$, we have $c^*=r$, ${I^*}=P(Y\ge d,X\ge d)$, ${J^*}=0$, ${K^*}=0$ and ${L^*}=P(Y\ge X, X\ge r|Y\le d,X\le d)P(Y\le d,X\le d)=0$. Thus ${pAUC^*(u)}=P(Y\ge d,X\ge d)$ is a constant that does not vary with the replacement value $r$. Hence, the conclusion holds. $\square $
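
Both cases of this proof can be illustrated on simulated data with an upper detection limit. The sketch below is a hedged example, not the authors' code: `empirical_pauc` and `censor_above` are hypothetical helper names, and the distributions, $d$, $u$ and the two replacement values are assumptions chosen for illustration (with $r>d$).

```python
import numpy as np

def empirical_pauc(x, y, u):
    """Plug-in estimate of pAUC(u) = P(Y >= X, X >= c), c the (1-u) quantile of X."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c = np.quantile(x, 1.0 - u)
    kept = x[x >= c]  # non-diseased values at or above the threshold
    return np.sum(y[:, None] >= kept[None, :]) / (x.size * y.size)

def censor_above(v, d, r):
    """Replace values above the upper detection limit d by r (with r >= d)."""
    return np.where(v > d, r, v)

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 2000)  # non-diseased sample (assumed distribution)
y = rng.normal(1.0, 1.0, 2000)  # diseased sample (assumed distribution)
d = 0.5                         # assumed upper detection limit
n_pairs = x.size * y.size

# Case (2): more than a fraction u of X exceeds d, so c* = r and the estimate
# reduces to the share of pairs with both values above d, whatever r is.
star_r1 = empirical_pauc(censor_above(x, d, 1.0), censor_above(y, d, 1.0), 0.1)
star_r2 = empirical_pauc(censor_above(x, d, 3.0), censor_above(y, d, 3.0), 0.1)
both_above = np.mean(x > d) * np.mean(y > d)

# Case (1): the threshold lies below d, and the bias equals the share of pairs
# with Y < X among pairs where both values exceed d.
u = 0.6
diff = (empirical_pauc(censor_above(x, d, 1.0), censor_above(y, d, 1.0), u)
        - empirical_pauc(x, y, u))
xa, ya = x[x > d], y[y > d]
expected = np.sum(ya[:, None] < xa[None, :]) / n_pairs

print(star_r1 == star_r2, np.isclose(star_r1, both_above), np.isclose(diff, expected))
```

Because the two samples are drawn independently here, the share of pairs with both values above $d$ factorizes into the product of the two sample proportions, which is what `both_above` computes.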

About this article

Cite this article

Jia, B., Chang, Yc.I. & Wang, Z. Assessing the diagnostic power of variables measured with a detection limit. Comput Stat 31, 1287–1303 (2016). https://doi.org/10.1007/s00180-015-0628-0

Received: 30 June 2014
Accepted: 19 October 2015
Published: 23 November 2015
Issue Date: December 2016
DOI: https://doi.org/10.1007/s00180-015-0628-0
