Assessing the diagnostic power of variables measured with a detection limit

  • Original Paper
  • Computational Statistics

Abstract

A limit of detection (LoD) arises in many practical situations because of technical and instrument limitations. Reports in the literature show that, in general, conventional methods for evaluating the diagnostic power of variables can be seriously biased when the LoD is ignored. Although the area under the receiver operating characteristic (ROC) curve (AUC) can be estimated consistently if the distributions of the variables are known, such information is usually not available in practice. Moreover, it has been proved that, without distributional assumptions, the estimate of the AUC of a variable with a LoD is usually biased no matter which replacement strategy is used. However, similar studies of the partial area under the ROC curve (pAUC) are lacking, and because this measure is usually preferred in practice, it is of interest to examine whether the estimate of the pAUC of a variable measured with a LoD behaves in the same way. In this study, we find that for some LoD scenarios, even without distributional assumptions, a consistent estimate of the pAUC can be constructed. When a consistent estimate of the pAUC cannot be obtained, the bias can be negligible in practical situations, and the proposed estimator can be a good approximation of the pAUC. Numerical studies using simulated data sets and real data examples are reported.

Acknowledgments

The authors are grateful to the Editor, the Associate Editor, and the anonymous referees for comments and suggestions that led to improvements in the paper. Wang’s work is supported by the State Key Program of the National Natural Science Foundation of China (No. 11231010) and the National Natural Science Foundation of China (No. 11471302).

Author information

Corresponding author

Correspondence to Zhanfeng Wang.

Appendix

Proof of Theorem 2.1

From (2) and (5), we know that

$$\begin{aligned}
pAUC(u) &= P\left( Y\ge X, X\ge S_{\bar{D}}^{-1}(u)\right) =P\left( Y\ge X,X\ge c\right) , \\
pAUC^*(u) &= P\left( \tilde{Y}\ge \tilde{X}, \tilde{X} \ge S_{\bar{D*}}^{-1}(u)\right) =P\left( \tilde{Y}\ge \tilde{X}, \tilde{X}\ge c^*\right) ,
\end{aligned}$$

where \(c=S_{\bar{D}}^{-1}(u)\) and \(c^*=S_{\bar{D*}}^{-1}(u)\) are the \((1-u)\) quantiles of X and \(\tilde{X}\), respectively. By the law of total probability, we have

$$\begin{aligned}
pAUC(u) &= P\left( Y\ge X,X\ge c|Y\ge d,X\ge d\right) P\left( Y\ge d,X\ge d\right) \\
&\quad +\,P\left( Y\ge X,X\ge c|Y\ge d,X\le d\right) P\left( Y\ge d,X\le d\right) \\
&\quad +\,P\left( Y\ge X,X\ge c|Y\le d,X\ge d\right) P\left( Y\le d,X\ge d\right) \\
&\quad +\,P\left( Y\ge X,X\ge c|Y\le d,X\le d\right) P\left( Y\le d,X\le d\right) \\
&:= I+J+K+L,
\end{aligned} \tag{8}$$

$$\begin{aligned}
pAUC^*(u) &= P\left( Y\ge X,X\ge c^*|Y\ge d,X\ge d\right) P\left( Y\ge d,X\ge d\right) \\
&\quad +\,P\left( Y\ge \widetilde{X},\widetilde{X}\ge c^*|Y\ge d,X\le d\right) P\left( Y\ge d,X\le d\right) \\
&\quad +\,P\left( \widetilde{Y}\ge X,X\ge c^*|Y\le d,X\ge d\right) P\left( Y\le d,X\ge d\right) \\
&\quad +\,P\left( \widetilde{Y}\ge \widetilde{X},\widetilde{X}\ge c^*|Y\le d,X\le d\right) P\left( Y\le d,X\le d\right) \\
&:= I^*+J^*+K^*+L^*.
\end{aligned} \tag{9}$$
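
As a numerical illustration of the decomposition in (8), the following Python sketch computes a plug-in empirical version of \(pAUC(u)=P(Y\ge X, X\ge c)\) and splits it into the four joint probabilities I, J, K and L. The function names (`pair_hits`, `empirical_pauc`, `four_terms`), the simulated normal samples, and the values of u and d are assumptions made for illustration only. The plug-in estimate, which uses NumPy's default interpolated sample quantile for c, is not necessarily the estimator proposed in the paper.

```python
import numpy as np

def pair_hits(x, y, u):
    """Boolean matrix whose (j, i) entry is True when y_j >= x_i and x_i >= c,
    where c is the empirical (1 - u) quantile of x (non-diseased sample)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c = np.quantile(x, 1.0 - u)
    return (y[:, None] >= x[None, :]) & (x[None, :] >= c)

def empirical_pauc(x, y, u):
    """Fraction of (x_i, y_j) pairs with y_j >= x_i and x_i >= c."""
    return pair_hits(x, y, u).mean()

def four_terms(x, y, u, d):
    """Split the empirical pAUC by whether X and Y lie above or below d.
    Each term is a joint probability, i.e. the conditional probability
    times the probability of the region, as in (8)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    hits = pair_hits(x, y, u)
    y_hi = (y >= d)[:, None]
    x_hi = (x >= d)[None, :]
    n_pairs = hits.size
    I = (hits & y_hi & x_hi).sum() / n_pairs
    J = (hits & y_hi & ~x_hi).sum() / n_pairs
    K = (hits & ~y_hi & x_hi).sum() / n_pairs
    L = (hits & ~y_hi & ~x_hi).sum() / n_pairs
    return I, J, K, L

# Illustrative data (assumed): X ~ N(0, 1) non-diseased, Y ~ N(1, 1) diseased.
rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=500)
y = rng.normal(1.0, 1.0, size=500)
u, d = 0.7, 0.0

I, J, K, L = four_terms(x, y, u, d)
total = empirical_pauc(x, y, u)
print(f"I={I:.4f} J={J:.4f} K={K:.4f} L={L:.4f}")
print(f"I+J+K+L={I + J + K + L:.4f}  pAUC={total:.4f}")
```

In this sketch the term K is always zero, because \(Y<d\le X\) rules out \(Y\ge X\); this matches the role K plays in the cases of the proof.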

By the definitions of \(\widetilde{X}\) and \(\widetilde{Y}\), we know that \(\widetilde{X}=r\) when \(X<d\) and \(\widetilde{Y}=r\) when \(Y<d\). The components of (9) can then be rewritten as

$$\begin{aligned}
J^* &= P\left( Y\ge r,r\ge c^*|Y\ge d,X\le d\right) P\left( Y\ge d,X\le d\right) , \\
K^* &= P\left( X\le r,X\ge c^*|Y\le d,X\ge d\right) P\left( Y\le d,X\ge d\right) , \\
L^* &= P\left( \widetilde{Y}\ge \widetilde{X},r\ge c^*|Y\le d,X\le d\right) P\left( Y\le d,X\le d\right) .
\end{aligned} \tag{10}$$

(1) When \(S_{\bar{D}}(d)>u\) and \(S_{\bar{D*}}(d) > u\), both \(c=S_{\bar{D}}^{-1}(u)\) and \(c^*=S_{\bar{D*}}^{-1}(u)\) are no smaller than the lower bound d. It then follows from the definition of the quantile and the continuity of X that \(c=c^*\ge d\). Thus, from the continuity of X and Y and \(r\le d\), we have \(I={I^*}\), \(J={J^*}=0\), \(K={K^*}=0\), \(L={L^*}=0\), which implies \(pAUC^*(u)={pAUC(u)}\).

(2) When \(S_{\bar{D}}(d)>u\) and \(S_{\bar{D*}}(d)<u\), the inequality \(c>d>c^*\) holds. Thus, \(I^*>I\), \(J=0\), \({J^*}\ge 0\), \(K^*=K=0\), \(L=0\), \({L^*}\ge 0\). Hence, we get \(pAUC^*(u)-{pAUC(u)}>0\). Similarly, if \(S_{\bar{D}}(d)<u\) and \(S_{\bar{D*}}(d)>u\), then \(c<d<c^*\) and \(pAUC^*(u)-{pAUC(u)}<0\). In conclusion, under situation (2), \(|pAUC^*(u)-{pAUC(u)}|>0\).

(3) When \(S_{\bar{D}}(d)<u\) and \(S_{\bar{D*}}(d)<u\), we have \(c<d\) and \(c^*<d\). This situation can be split into two cases: (i) \(c^*\le c<d\) and (ii) \(c<c^*<d\).

If (i) holds, then \(I=I^*,~J<J^*,~K=K^*=0,~L<L^*\), which implies that \(pAUC^*(u)-{pAUC(u)}>0\). If (ii) holds, then \(I={I^*},~J\le J^*,~K={K^*}=0\) and \(L<{L^*}\), so again \(pAUC^*(u)-{pAUC(u)}>0\). Hence, the conclusion holds. \(\square \)
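
Cases (1) and (3)(i) of this proof can be checked by simulation. The Python sketch below applies a lower detection limit d to both samples, replaces values below d by r (with \(r\le d\)), and compares a plug-in empirical pAUC before and after replacement. The helper names (`empirical_pauc`, `replace_below`), the normal distributions, and the values of d, u and r are assumptions chosen for illustration; they do not come from the paper, and the plug-in estimate may differ from the estimator proposed there.

```python
import numpy as np

def empirical_pauc(x, y, u):
    """Fraction of (x_i, y_j) pairs with y_j >= x_i and x_i >= c,
    where c is the empirical (1 - u) quantile of x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c = np.quantile(x, 1.0 - u)
    return ((y[:, None] >= x[None, :]) & (x[None, :] >= c)).mean()

def replace_below(v, d, r):
    """Lower LoD: values below d are recorded as the replacement value r."""
    v = np.asarray(v, dtype=float)
    return np.where(v < d, r, v)

# Illustrative data (assumed): X ~ N(0, 1) non-diseased, Y ~ N(1, 1) diseased.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=2000)
y = rng.normal(1.0, 1.0, size=2000)
d = -0.5  # P(X > d) is roughly 0.69

# Case (1): u below P(X > d), so the quantile c lies above d.
u_small = 0.2
full_small = empirical_pauc(x, y, u_small)
censored_small = {
    r: empirical_pauc(replace_below(x, d, r), replace_below(y, d, r), u_small)
    for r in (d, d - 0.5, d - 5.0)
}
for r, value in censored_small.items():
    print(f"u={u_small}, r={r:+.1f}: pAUC*={value:.4f}  pAUC={full_small:.4f}")

# Case (3)(i): u above P(X > d), so c < d, and here c* = r <= c.
u_large = 0.9
r_low = d - 5.0
full_large = empirical_pauc(x, y, u_large)
censored_large = empirical_pauc(
    replace_below(x, d, r_low), replace_below(y, d, r_low), u_large
)
print(f"u={u_large}, r={r_low:+.1f}: pAUC*={censored_large:.4f}  pAUC={full_large:.4f}")
```

Under these assumptions the first comparison shows no difference for any of the replacement values, while the second shows an upward bias, in line with the signs derived above.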

Proof of Theorem 2.2

As in the proof of Theorem 2.1, we have

$$\begin{aligned}
pAUC(u) &= P\left( Y\ge X, X\ge S_{\bar{D}}^{-1}(u)\right) =P\left( Y\ge X,X\ge c\right) , \\
pAUC^*(u) &= P\left( \tilde{Y}\ge \tilde{X}, \tilde{X} \ge S_{\bar{D*}}^{-1}(u)\right) =P\left( \tilde{Y}\ge \tilde{X}, \tilde{X}\ge c^*\right) ,
\end{aligned}$$

where \(c=S_{\bar{D}}^{-1}(u)\) and \(c^*=S_{\bar{D*}}^{-1}(u)\) are the \((1-u)\) quantiles of X and \(\tilde{X}\), respectively. By the law of total probability, we have

$$\begin{aligned}
pAUC(u) &= P\left( Y\ge X,X\ge c|Y\ge d,X\ge d\right) P\left( Y\ge d,X\ge d\right) \\
&\quad +\,P\left( Y\ge X,X\ge c|Y\ge d,X\le d\right) P\left( Y\ge d,X\le d\right) \\
&\quad +\,P\left( Y\ge X,X\ge c|Y\le d,X\ge d\right) P\left( Y\le d,X\ge d\right) \\
&\quad +\,P\left( Y\ge X,X\ge c|Y\le d,X\le d\right) P\left( Y\le d,X\le d\right) \\
&:= I+J+K+L,
\end{aligned} \tag{11}$$

$$\begin{aligned}
pAUC^*(u) &= P\left( \widetilde{Y}\ge \widetilde{X},\widetilde{X}\ge c^*|Y\ge d,X\ge d\right) P\left( Y\ge d,X\ge d\right) \\
&\quad +\,P\left( \widetilde{Y}\ge X,X\ge c^*|Y\ge d,X\le d\right) P\left( Y\ge d,X\le d\right) \\
&\quad +\,P\left( Y\ge \widetilde{X},\widetilde{X}\ge c^*|Y\le d,X\ge d\right) P\left( Y\le d,X\ge d\right) \\
&\quad +\,P\left( Y\ge X, X\ge c^*|Y\le d,X\le d\right) P\left( Y\le d,X\le d\right) \\
&:= I^*+J^*+K^*+L^*.
\end{aligned} \tag{12}$$

By the definitions of \(\widetilde{X}\) and \(\widetilde{Y}\), we know that \(\widetilde{X}=r\) when \(X>d\) and \(\widetilde{Y}=r\) when \(Y>d\). The components of (12) can then be rewritten as

$$\begin{aligned}&I^*=P\left( r\ge r,r\ge c^*|Y\ge d,X\ge d\right) P\left( Y\ge d,X\ge d\right) ,\\&J^*=P\left( r\ge X,X\ge c^*|Y\ge d,X\le d\right) P\left( Y\ge d,X\le d\right) ,\\&K^*=P\left( Y\ge r,r\ge c^*|Y\le d,X\ge d\right) P\left( Y\le d,X\ge d\right) . \end{aligned}$$

(1) When \(S_{\bar{D}}(d)<u\), \(c=S_{\bar{D}}^{-1}(u)\) is smaller than d, and hence \(c^*=c\). Therefore, \({I^*}=P(Y\ge d,X\ge d)\), \({J^*}=P(X\ge c|Y\ge d,X\le d)P(Y\ge d,X\le d)\), \({K^*}=0\) and \({L^*}=P(Y\ge X, X\ge c|Y\le d,X\le d)P(Y\le d,X\le d)\), which shows that \({pAUC^*(u)}\) does not depend on the choice of the replacement value r and is a constant. Moreover, comparing with Eq. (11) term by term gives \(J=J^*\) (since \(Y\ge d\ge X\) implies \(Y\ge X\)), \(K=K^*=0\) and \(L=L^*\), so that \({pAUC^*(u)}-pAUC(u)= P(Y\ge d,X\ge d)-P(Y\ge X,X\ge c|Y\ge d,X\ge d)P(Y\ge d,X\ge d) =P(Y<X,Y\ge d,X\ge d)\).

(2) When \(S_{\bar{D}}(d)>u\), \(c^*=r\) and \({I^*}=P(Y\ge d,X\ge d)\), \({J^*}=0\), \({K^*}=0\) and \({L^*}=P(Y\ge X, X\ge r|Y\le d,X\le d)P(Y\le d,X\le d)=0\). So \({pAUC^*(u)}=P(Y\ge d,X\ge d)\) is a constant that does not vary with the replacement value r. Hence, the conclusion holds. \(\square \)
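
Both cases of this proof can also be illustrated numerically. The Python sketch below applies an upper detection limit d, replaces values above d by r (with \(r\ge d\)), and checks that a plug-in empirical \(pAUC^*(u)\) does not change with r. It also compares the bias in case (1) with the empirical version of \(P(Y<X,Y\ge d,X\ge d)\), and the value in case (2) with the empirical version of \(P(Y\ge d,X\ge d)\). The helper names (`empirical_pauc`, `replace_above`), the normal distributions, and the values of d, u and r are illustrative assumptions rather than settings from the paper.

```python
import numpy as np

def empirical_pauc(x, y, u):
    """Fraction of (x_i, y_j) pairs with y_j >= x_i and x_i >= c,
    where c is the empirical (1 - u) quantile of x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c = np.quantile(x, 1.0 - u)
    return ((y[:, None] >= x[None, :]) & (x[None, :] >= c)).mean()

def replace_above(v, d, r):
    """Upper LoD: values above d are recorded as the replacement value r."""
    v = np.asarray(v, dtype=float)
    return np.where(v > d, r, v)

# Illustrative data (assumed): X ~ N(0, 1) non-diseased, Y ~ N(1, 1) diseased.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=2000)
y = rng.normal(1.0, 1.0, size=2000)
d = 1.0  # P(X > d) is roughly 0.16

# Case (1): u above P(X > d), so c < d and c* = c.
u_large = 0.3
full = empirical_pauc(x, y, u_large)
stars = [
    empirical_pauc(replace_above(x, d, r), replace_above(y, d, r), u_large)
    for r in (d, d + 1.0, d + 50.0)
]
# Empirical version of P(Y < X, Y >= d, X >= d), the bias derived in case (1).
gap = ((y[:, None] < x[None, :]) & (y[:, None] >= d) & (x[None, :] >= d)).mean()
print("pAUC* for three r values:", [round(float(s), 4) for s in stars])
print(f"pAUC* - pAUC = {stars[0] - full:.4f}, P(Y<X, Y>=d, X>=d) = {gap:.4f}")

# Case (2): u below P(X > d), so c* = r and pAUC* = P(Y >= d, X >= d).
u_small = 0.1
r = d + 1.0
star_small = empirical_pauc(replace_above(x, d, r), replace_above(y, d, r), u_small)
both_above = (x > d).mean() * (y > d).mean()
print(f"case (2): pAUC* = {star_small:.4f}, P(Y>=d, X>=d) = {both_above:.4f}")
```

Because the two samples are drawn independently here, the empirical joint probability in case (2) is the product of the two sample proportions above d.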

Cite this article

Jia, B., Chang, Yc.I. & Wang, Z. Assessing the diagnostic power of variables measured with a detection limit. Comput Stat 31, 1287–1303 (2016). https://doi.org/10.1007/s00180-015-0628-0

  • DOI: https://doi.org/10.1007/s00180-015-0628-0
