
Unsupervised learning of pharmacokinetic responses


Abstract

Pharmacokinetics (PK) is a branch of pharmacology dedicated to the study of the time course of drug concentrations, from absorption to excretion from the body. PK dynamic models are often based on homogeneous, multi-compartment assumptions, which make it possible to identify the PK parameters and further predict the time evolution of drug concentration for a given subject. One key characteristic of these time series is their high variability among patients, which may hamper their correct stratification. In the present work, we address this variability by estimating the PK parameters and simultaneously clustering the corresponding subjects using the time series. We propose an expectation-maximization algorithm that clusters subjects based on their PK drug responses in an unsupervised way, collapsing clusters that are closer than a given threshold. Experimental results show that the proposed algorithm converges fast and leads to meaningful results in synthetic and real scenarios.



Acknowledgements

The authors would like to express their appreciation to Paulo Mateus for many useful inputs and valuable comments. Special thanks go to the anonymous reviewers, who significantly contributed to the quality of this manuscript with their valuable and well-aimed comments. This work was supported by national funds through FCT, Fundação para a Ciência e a Tecnologia, under contracts LAETA (UID/EMS/50022/2013) and IT (UID/EEA/50008/2013), and by projects InteleGen (PTDC/DTP-FTO/1747/2012), PERSEIDS (PTDC/EMS-SIS/0642/2014) and the internal IT project QBigData. SV acknowledges support by Program Investigador FCT (IF/00653/2012) from FCT, co-funded by the European Social Fund (ESF) through the Operational Program Human Potential (POPH). This work was partially supported by the European Union’s Horizon 2020 research and innovation program under grant agreement No. 633974 (SOUND project).

Author information


Corresponding author

Correspondence to Alexandra M. Carvalho.

Appendices

Appendix 1: Experimental setup

As described in Sect. 3.1, the M-step maximizes the likelihood over the parameters \(\beta _{1\ell }\) and \(\beta _{2\ell }\), for each cluster \(\ell \), resorting to a numerical method; indeed, obtaining an analytical solution of the transcendental system of equations that yields the maxima is intractable. Next, we detail a simple optimization, and the stopping criteria, for the coordinate descent method used in the experiments presented in Sect. 4.

1.1 Optimization

To improve the convergence rate of the coordinate descent method, we interleaved the (one-dimensional) Newton iterations for the variables \(\beta _{1\ell }\) and \(\beta _{2\ell }\), thus solving Eqs. (4) and (5) simultaneously. With this modification, the method converged significantly faster, without deteriorating the results. Recall that \(\text {Newton}(b,h)\) is an iterative method to find a root of a function h given an initial approximation b, such that, in the d-th iteration of Newton’s method, we have

$$\begin{aligned} b^{(0)}=b\text { and }b^{(d+1)}=b^{(d)}-\frac{h(b^{(d)})}{h'(b^{(d)})}. \end{aligned}$$
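For concreteness, a minimal Python sketch of this routine follows; passing the derivative explicitly as dh is an interface choice of ours, and the tolerance and iteration cap anticipate the stopping criterion detailed below.

```python
# A minimal sketch of Newton(b, h); the derivative h' is passed
# explicitly as dh (an interface choice of ours). The tolerance and
# iteration cap follow the stopping criterion detailed below.
def newton(b, h, dh, tol=1e-10, max_iter=10**4):
    for _ in range(max_iter):
        b_next = b - h(b) / dh(b)    # b^{(d+1)} = b^{(d)} - h(b^{(d)})/h'(b^{(d)})
        if abs(b_next - b) < tol:    # |b^{(d)} - b^{(d-1)}| < 10^{-10}
            return b_next
        b = b_next
    raise RuntimeError("Newton's method did not converge")
```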

We can recast the M-step for maximizing over \(\beta _{1\ell }\) and \(\beta _{2\ell }\) as the problem of simultaneously finding \(b_1\) such that \(h_1(b_1,b_2)=0\), given \(b_2\) fixed, and \(b_2\) such that \(h_2(b_1,b_2)=0\), given \(b_1\) fixed. A simple approach to this problem is to interleave the iterations of Newton’s method as \(b_1^{(0)}, b_2^{(0)}, b_1^{(1)}, b_2^{(1)},\dots \) so that

$$\begin{aligned} b_1^{(d+1)}=b_1^{(d)}-\frac{h_{1}(b_{1}^{(d)},b_{2}^{(d)})}{\frac{\partial h_{1}}{\partial b_1}(b_{1}^{(d)},b_{2}^{(d)})} \text { and } b_2^{(d+1)}=b_2^{(d)}-\frac{h_{2}(b_{1}^{(d)},b_{2}^{(d)})}{\frac{\partial h_{2}}{\partial b_2}(b_{1}^{(d)},b_{2}^{(d)})}. \end{aligned}$$

We took this approach where

$$\begin{aligned} b_1^{(0)}=\beta ^{(k)}_{1\ell }\text { and }b_2^{(0)}=\beta ^{(k)}_{2\ell }. \end{aligned}$$

Note that k indexes the k-th iteration of the EM algorithm, whereas d (in the above case, \(d=0\)) indexes the d-th iteration of Newton’s method. Thus, the previous EM estimates of \(\beta _{1\ell }\) and \(\beta _{2\ell }\) serve as the initial approximations for the roots in Newton’s method. In this case, we let

$$\begin{aligned} h_1(b_1,b_2)=h^{(k)}_{1\ell }(b_1,b_2)\text { and }h_2(b_1,b_2)=h^{(k)}_{2\ell }(b_1,b_2), \end{aligned}$$

with \(h^{(k)}_{1\ell }(b_1,b_2)\) defined as

$$\begin{aligned} \displaystyle \sum _{i=1}^N\sum _{j=1}^n\Big (-\frac{\alpha ^{(k+1)}_\ell }{v^{(k)}_\ell }X_{i\ell }^{(k)}t_je^{-b_1t_j} \left( y_{ij}-\alpha _\ell ^{(k+1)}(e^{-b_1t_j}-e^{-b_2t_j})\right) \Big ), \end{aligned}$$

and \(h^{(k)}_{2\ell }(b_1,b_2)\) defined as

$$\begin{aligned} \displaystyle \sum _{i=1}^N\sum _{j=1}^n\Big (\frac{\alpha _\ell ^{(k+1)}}{v^{(k)}_\ell }X_{i\ell }^{(k)}t_je^{-b_2t_j} \left( y_{ij}-\alpha _\ell ^{(k+1)}(e^{-b_1t_j}-e^{-b_2t_j})\right) \Big ). \end{aligned}$$

Observe that the functions \(h^{(k)}_{1\ell }\) and \(h^{(k)}_{2\ell }\) as defined in Sect. 3.1 have only one argument, while here they have two, for the purpose of applying the envisaged optimization. With this optimization, we start with \(b^{(0)}_1=\beta ^{(k)}_{1\ell }\) and \(b^{(0)}_2=\beta ^{(k)}_{2\ell }\), then iterate Newton’s method until convergence in, say, d iterations, and finally update \(\beta _{1\ell }\) and \(\beta _{2\ell }\) as \(\beta ^{(k+1)}_{1\ell }=b^{(d)}_1\) and \(\beta ^{(k+1)}_{2\ell }=b^{(d)}_2\).
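For illustration, a vectorised NumPy sketch of these two functions, for a single cluster \(\ell \) at EM iteration k, might read as follows; the argument names are ours: alpha stands for \(\alpha ^{(k+1)}_\ell \), v for \(v^{(k)}_\ell \), X for the vector of responsibilities \(X^{(k)}_{i\ell }\), t for the n sampling times, and y for the \(N\times n\) matrix of observations.

```python
import numpy as np

def h1(b1, b2, alpha, v, X, t, y):
    # Residuals y_ij - alpha_l (e^{-b1 t_j} - e^{-b2 t_j}); y is N x n, t has length n.
    resid = y - alpha * (np.exp(-b1 * t) - np.exp(-b2 * t))
    # Sum over subjects i and time points j, weighted by X_{il} and t_j e^{-b1 t_j}.
    return np.sum(-(alpha / v) * X[:, None] * t * np.exp(-b1 * t) * resid)

def h2(b1, b2, alpha, v, X, t, y):
    resid = y - alpha * (np.exp(-b1 * t) - np.exp(-b2 * t))
    return np.sum((alpha / v) * X[:, None] * t * np.exp(-b2 * t) * resid)
```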

1.2 Stopping criterion

In our implementation, the \(\text {Newton}(b,h)\) method is stopped at iteration d if

$$\begin{aligned} \mid b^{(d)}-b^{(d-1)} \mid <10^{-10}. \end{aligned}$$

Moreover, if the number of iterations exceeds \(10^4\), the method is aborted (with a warning to the user). In practice, however, this iteration limit was never reached in the experiments performed in Sect. 4.

In our particular setting, as we compute both \(\beta _{1\ell }\) and \(\beta _{2\ell }\) at each step, we stop when both approximations fulfill the previous criterion.
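Combining the interleaved iterations above with this criterion, a sketch could look as follows; dh1_db1 and dh2_db2 denote the partial derivatives \(\frac{\partial h_1}{\partial b_1}\) and \(\frac{\partial h_2}{\partial b_2}\), which we assume the caller supplies.

```python
# Sketch of the interleaved Newton iterations with the stopping rule
# above: iterate until BOTH approximations move less than 1e-10, and
# abort past 10^4 iterations. dh1_db1, dh2_db2 are assumed callables
# for the partial derivatives of h1 and h2.
def interleaved_newton(b1, b2, h1, h2, dh1_db1, dh2_db2,
                       tol=1e-10, max_iter=10**4):
    for _ in range(max_iter):
        # Both updates use the current pair (b1, b2), as defined above.
        b1_next = b1 - h1(b1, b2) / dh1_db1(b1, b2)
        b2_next = b2 - h2(b1, b2) / dh2_db2(b1, b2)
        if abs(b1_next - b1) < tol and abs(b2_next - b2) < tol:
            return b1_next, b2_next
        b1, b2 = b1_next, b2_next
    raise RuntimeError("iteration limit exceeded")
```

In practice, the data-dependent arguments of h1 and h2 would be bound beforehand, e.g., with functools.partial, so that only \((b_1, b_2)\) remain free.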

1.3 Knee analysis

For the clustered curves to have biological meaning (e.g., concentrations need to be positive, rates cannot become exponentially faster, etc.), we imposed that

$$\begin{aligned} \beta _{1\ell }, \beta _{2\ell }\in (0,5)\text { and }\beta _{1\ell }\le \beta _{2\ell }. \end{aligned}$$

If Newton’s method ends with \(\beta _{1\ell }>\beta _{2\ell }\) then we swap \(\beta _{1\ell }\) with \(\beta _{2\ell }\); in practice, this never happened in the experiments performed in Sect. 4. Nonetheless, if \(\beta _{1\ell }\notin (0,5)\) or \(\beta _{2\ell }\notin (0,5)\), we perform a knee analysis to elicit a value within the allowed range.

In a general setting of \(\text {Newton}(b,h)\), assume that the method converges at iteration d and \(b^{(d)}\) is outside some allowed range. It may well be the case that \(b^{(d)}\) is out of range while the values of h within the allowed range are very close to \(h(b^{(d)})\). In this case, the method does not converge inside the range because \(b^{(d)}>b^{(d-1)}\), but \(h(b^{(d)})\approx h(b^{(d-1)})\). Therefore, the method should also be stopped when

$$\begin{aligned} \mid h(b^{(d)})-h(b^{(d-1)}) \mid <10^{-10}, \end{aligned}$$

which is called a knee analysis.

In our particular case, we perform this knee analysis to avoid \(\beta _{1\ell } \notin (0,5)\) and \(\beta _{2\ell } \notin (0,5)\). If, even after performing this analysis, we have \(\beta _{1\ell } \notin (0,5)\), then we keep the previous parameter approximation, that is, \(\beta _{1\ell }^{(k+1)}=\beta _{1\ell }^{(k)}\), and the EM algorithm proceeds. The same analysis is carried out for \(\beta _{2\ell }\).
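A sketch of how the knee analysis plugs into the Newton routine, under the same assumed interface as before; returning None is our way of signalling that the caller should keep the previous EM estimate.

```python
# Newton iteration with knee analysis: also stop once consecutive
# function values are within tol (a "knee" in h); if the final root
# falls outside (lo, hi), return None so the caller keeps the previous
# EM estimate of the parameter.
def newton_knee(b, h, dh, lo=0.0, hi=5.0, tol=1e-10, max_iter=10**4):
    for _ in range(max_iter):
        b_next = b - h(b) / dh(b)
        if abs(b_next - b) < tol or abs(h(b_next) - h(b)) < tol:
            return b_next if lo < b_next < hi else None
        b = b_next
    raise RuntimeError("iteration limit exceeded")
```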

Appendix 2: Implementation details

In this appendix, we present implementation details related to the proposed EM algorithm.

1.1 Initial number of clusters

The initial number of clusters is a user-defined parameter. However, if this value is not given, we set \(M = \frac{N}{3}\).

1.2 Stopping criterion

Usually, an EM algorithm stops when the difference between consecutive values of the parameters falls below some threshold. In our case, the impact of the parameters \(\beta _{1\ell }\) and \(\beta _{2\ell }\) overwhelms that of all remaining parameters in the definition of the clusters; recall that \(\beta _{1\ell }\) and \(\beta _{2\ell }\) are exponents in Eq. (2). For this reason, only \(\beta _{1\ell }\) and \(\beta _{2\ell }\) are considered in the stopping criterion of the proposed EM algorithm. Specifically, the EM algorithm stops when

$$\begin{aligned} \frac{\mid \beta _{1\ell }^{(k+1)}-\beta _{1\ell }^{(k)}\mid }{\mid \beta _{1\ell }^{(k+1)}\mid }\le 10^{-6}\text { and } \frac{\mid \beta _{2\ell }^{(k+1)}-\beta _{2\ell }^{(k)}\mid }{\mid \beta _{2\ell }^{(k+1)}\mid }\le 10^{-6}, \end{aligned}$$

for all clusters \(\ell \in \{1,\dots ,M\}\).
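For instance, storing the pairs \((\beta _{1\ell }, \beta _{2\ell })\) of all M clusters in two \(M\times 2\) NumPy arrays (a layout of ours), the criterion can be checked as:

```python
import numpy as np

# Relative-change stopping rule for the EM algorithm: converged once
# every entry of beta changes by less than tol in relative terms.
def em_converged(beta_old, beta_new, tol=1e-6):
    rel_change = np.abs(beta_new - beta_old) / np.abs(beta_new)
    return bool(np.all(rel_change <= tol))
```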

1.3 Cluster thresholds

The thresholds for disregarding and merging clusters were defined (empirically) as \(\overline{W}=0.025\) and \(\overline{L}=1\).


Cite this article

Tomás, E., Vinga, S. & Carvalho, A.M. Unsupervised learning of pharmacokinetic responses. Comput Stat 32, 409–428 (2017). https://doi.org/10.1007/s00180-016-0707-x
