
Variable selection for generalized partially linear models with longitudinal data


Abstract

Variable selection and parameter estimation are of great significance in regression analysis, and a variety of approaches have been proposed to tackle these problems. Among them, penalty-based shrinkage approaches are the most popular because they carry out variable selection and parameter estimation simultaneously. However, little work is available on variable selection for generalized partially linear models (GPLMs) with longitudinal data. In this paper, we propose a variable selection procedure for GPLMs with longitudinal data. The inference is based on SCAD-penalized quadratic inference functions, obtained after approximating the non-parametric function in the model by B-splines. The proposed approach efficiently utilizes the within-cluster correlation information, which improves estimation efficiency, and it has low computational cost. With the tuning parameter chosen by BIC, the correct model is identified with probability tending to 1. The resulting estimator of the parametric component is asymptotically normal, and the estimator of the non-parametric function achieves the optimal convergence rate. The performance of the proposed methods is evaluated through extensive simulation studies, and a real data analysis shows that the proposed approach succeeds in excluding insignificant variables.


References

  1. Breiman L (1995) Better subset regression using the nonnegative garrote. Technometrics 37:373–384

  2. Tibshirani R (1996) Regression shrinkage and selection via the LASSO. J R Stat Soc Ser B 58:267–288

  3. Fu WJ (1998) Penalized regression: the bridge versus the LASSO. J Comput Graph Stat 7:397–416

  4. Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360

  5. Zou H (2006) The adaptive lasso and its oracle properties. J Am Stat Assoc 101:1418–1429

  6. Wang L, Li H, Huang JZ (2008) Variable selection in non-parametric varying coefficient models for analysis of repeated measurements. J Am Stat Assoc 103:1556–1569

  7. Xue L, Qu A, Zhou J (2010) Consistent model selection in marginal generalized additive models for correlated data. J Am Stat Assoc 105:1518–1530

  8. Tian RQ, Xue LG, Liu CL (2014) Penalized quadratic inference functions for semiparametric varying coefficient partially linear models with longitudinal data. J Multivar Anal 132:94–110

  9. Fan J, Li R (2004) New estimation and model selection procedures for semiparametric modeling in longitudinal data analysis. J Am Stat Assoc 99:710–723

  10. Fan J, Zhang W (2008) Statistical methods with varying coefficient models. Stat Interface 1:179–195

  11. Zhao PX, Xue LG (2009) Variable selection for semiparametric varying coefficient partially linear models. Stat Probab Lett 79:2148–2157

  12. Wang L, Xue L, Qu A, Liang H (2014) Estimation and model selection in generalized additive partial linear models for correlated data with diverging number of covariates. Ann Stat 42:592–624

  13. Liang KY, Zeger SL (1986) Longitudinal data analysis using generalized linear models. Biometrika 73:13–22

  14. Qu A, Lindsay BG, Li B (2000) Improving generalized estimating equations using quadratic inference functions. Biometrika 87:823–836

  15. Hansen LP (1982) Large sample properties of generalized method of moments estimators. Econometrica 50:1029–1054

  16. Qu A, Li R (2006) Quadratic inference functions for varying coefficient models with longitudinal data. Biometrics 62:379–391

  17. Bai Y, Zhu ZY, Fung WK (2008) Partially linear models for longitudinal data based on quadratic inference functions. Scand J Stat 35:104–118

  18. Zhang JH, Xue LG (2017) Quadratic inference functions for generalized partially linear models with longitudinal data. Chin J Appl Probab Stat 33:417–432

  19. Bai Y, Fung WK, Zhu ZY (2009) Penalized quadratic inference functions for single-index models with longitudinal data. J Multivar Anal 100:152–161

  20. Cho H, Qu A (2013) Model selection for correlated data with diverging number of parameters. Stat Sin 23:901–927

  21. Lin XH, Carroll RJ (2000) Non-parametric function estimation for clustered data when the predictor is measured without/with error. J Am Stat Assoc 95:520–534

  22. Lin XH, Carroll RJ (2001) Semiparametric regression for clustered data using generalized estimating equations. J Am Stat Assoc 96:1045–1056

  23. He XM, Fung WK, Zhu ZY (2005) Robust estimation in a generalized partially linear model for cluster data. J Am Stat Assoc 34:391–410

  24. Qin GY, Bai Y, Zhu ZY (2012) Robust empirical likelihood inference for generalized partially linear models with longitudinal data. J Multivar Anal 105:32–44

  25. Qu A, Song XK (2004) Assessing robustness of generalized estimating equations and quadratic inference functions. Biometrika 91:447–459

  26. Schumaker L (1981) Spline functions: basic theory. Wiley, New York

  27. Wang H, Li R, Tsai CL (2007) Tuning parameter selectors for the smoothly clipped absolute deviation method. Biometrika 94:553–568

  28. Wang HS, Xia YC (2009) Shrinkage estimation of the varying coefficient model. J Am Stat Assoc 104:747–757

  29. Li R, Liang H (2008) Variable selection in semiparametric regression modeling. Ann Stat 36:261–286

  30. Oman SD (2009) Easily simulated multivariate binary distributions with given positive and negative correlations. Comput Stat Data Anal 53:999–1005

  31. Zeger SL, Karim MR (1991) Generalized linear models with random effects: a Gibbs sampling approach. J Am Stat Assoc 86:79–86

  32. Diggle PJ, Liang KY, Zeger SL (1994) Analysis of longitudinal data. Oxford University Press, Oxford

  33. Chang XJ, Ma ZG, Yang Y, Zeng ZQ, Hauptmann AG (2017) Bi-level semantic representation analysis for multimedia event detection. IEEE Trans Cybern 47:1180–1197

  34. Galiautdinov R (2020) The math model of drone behavior in the hive, providing algorithmic architecture. Int J Softw Sci Comput Intell 12:15–33

  35. Zhang L (2019) Evaluating the effects of size and precision of training data on ANN training performance for the prediction of chaotic time series patterns. Int J Softw Sci Comput Intell 11:16–30


Acknowledgements

The research is funded by the National Natural Science Foundation of China (11571025) and the Beijing Natural Science Foundation (1182008).

Author information

Corresponding author

Correspondence to Jinghua Zhang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Proofs of the main results

For convenience and simplicity, let \(C\) denote a positive constant that may take different values at each appearance throughout this paper, and let \(\parallel A\parallel\) denote the modulus of the largest singular value of a matrix or vector \(A\).

Let \(\eta_{ij} = X_{ij}^{T} \beta + W_{ij}^{T} \gamma\); then \(\mu_{ij} = h\left( {\eta_{ij} } \right)\). Let \(\eta_{i} = (\eta_{i1} , \ldots ,\eta_{im} )^{T} ,\mu_{i} = (\mu_{i1} , \ldots ,\mu_{im} )^{T}\), and \(\theta = (\beta^{T} ,\gamma^{T} )^{T} ,Y_{i} = (Y_{i1} , \ldots ,Y_{im} )^{T} ,X_{i} = (X_{i1} , \ldots ,X_{im} )^{T}\).

Similarly, let \(P_{ij} = (X_{ij}^{T} ,W_{ij}^{T} )^{T}\), and \(W_{i} = \left( {W_{i1} , \ldots ,W_{im} } \right)^{T} ,P_{i} = (P_{i1} , \ldots ,P_{im} )^{T} = \left( {X_{i} ,W\left( {U_{i} } \right)} \right)\); then \(\eta_{ij} = P_{ij}^{T} \theta ,\eta_{i} = P_{i} \theta\), and \(\frac{{\partial \eta_{ij} }}{\partial \theta } = P_{ij} ,\frac{{\partial \eta_{i} }}{\partial \theta } = P_{i}^{T}\).

Let \(h^{\prime} \left( t \right) = \frac{dh\left( t \right)}{dt}\), then \(\frac{{\partial \mu_{ij} }}{\partial \theta } = h^{\prime} \left( {\eta_{ij} } \right)P_{ij}\). Let

$$H^{\prime} \left( {\eta_{i} } \right) \triangleq \left( {\begin{array}{*{20}c} {h^{\prime} \left( {\eta_{i1} } \right)} & {} & {} \\ {} & \ddots & {} \\ {} & {} & {h^{\prime} \left( {\eta_{im} } \right)} \\ \end{array} } \right),\quad H^{\prime \prime} \left( {\eta_{i} } \right) \triangleq \left( {\begin{array}{*{20}c} {h^{\prime \prime} \left( {\eta_{i1} } \right)} & {} & {} \\ {} & \ddots & {} \\ {} & {} & {h^{\prime \prime} \left( {\eta_{im} } \right)} \\ \end{array} } \right),$$

then

$$\dot{\mu }_{i} = \left( {\begin{array}{*{20}l} {\frac{{\partial \mu_{i1} }}{{\partial \beta_{1} }}} \hfill & \cdots \hfill & {\frac{{\partial \mu_{i1} }}{{\partial \gamma_{qL} }}} \hfill \\ \vdots \hfill & \cdots \hfill & \vdots \hfill \\ {\frac{{\partial \mu_{im} }}{{\partial \beta_{1} }}} \hfill & \cdots \hfill & {\frac{{\partial \mu_{im} }}{{\partial \gamma_{qL} }}} \hfill \\ \end{array} } \right) = \left( {\begin{array}{*{20}l} {\left( {\frac{{\partial \mu_{i1} }}{\partial \theta }} \right)^{T} } \hfill \\ \vdots \hfill \\ {\left( {\frac{{\partial \mu_{im} }}{\partial \theta }} \right)^{T} } \hfill \\ \end{array} } \right) = \left( {\begin{array}{*{20}l} {P_{i1}^{T} h^{\prime} \left( {\eta_{i1} } \right)} \hfill \\ \vdots \hfill \\ {P_{im}^{T} h^{\prime} \left( {\eta_{im} } \right)} \hfill \\ \end{array} } \right) = H^{\prime} \left( {\eta_{i} } \right)P_{i} .$$
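The proofs below use the penalized quadratic inference function \({\mathcal{Q}}_{n}^{p}\) without restating its definition. For orientation only, a hedged restatement consistent with the notation above and with the standard QIF construction of [14] (the precise form, including the normalization and the grouping of the spline coefficients \(\gamma_{k}\), is the one given in Sect. 3 of the main text) is

$${\mathcal{Q}}_{n} \left( \theta \right) = g_{n}^{T} \left( \theta \right)\varOmega_{n}^{ - 1} \left( \theta \right)g_{n} \left( \theta \right),\quad {\mathcal{Q}}_{n}^{p} \left( {\beta ,\gamma } \right) = {\mathcal{Q}}_{n} \left( {\beta ,\gamma } \right) + n\mathop \sum \limits_{l = 1}^{p} p_{{\lambda_{2} }} \left( {\left| {\beta_{l} } \right|} \right) + n\mathop \sum \limits_{k = 1}^{q} p_{{\lambda_{1} }} \left( {\parallel \gamma_{k} \parallel_{H} } \right),$$

where \(g_{n}\) is the extended score vector built from the basis matrices \(M_{1} , \ldots ,M_{s}\) of the working correlation and \(\varOmega_{n}\) is its sample covariance matrix.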

Proof of Theorem 1

Let \(\delta = n^{{ - \frac{1}{2}}}\), \(\beta = \beta_{0} + \delta D_{1}\), \(\gamma = \gamma_{0} + \delta D_{2}\), and \(D = (D_{1}^{T} ,D_{2}^{T} )^{T}\). We first show that, for any given \(\varepsilon > 0\), there exists a large constant \(C\) such that

$$P\left\{ {\mathop { \inf }\limits_{\parallel D\parallel = C} {\mathcal{Q}}_{n}^{p} \left( {\beta ,\gamma } \right) > {\mathcal{Q}}_{n}^{p} \left( {\beta_{0} ,\gamma_{0} } \right)} \right\} \ge 1 - \varepsilon .$$
(9.1)

Note that \(\beta_{0l} = 0\) for all \(l = p_{1} + 1, \cdots ,p\) and \(\gamma_{0k} = 0\) for all \(k = q_{1} + 1, \cdots ,q\). Together with assumption (A1) and \(p_{\lambda } \left( 0 \right) = 0\), we have

$$\begin{aligned} {\mathcal{Q}}_{n}^{p} \left(\theta \right) - {\mathcal{Q}}_{n}^{p} \left({\theta_{0}} \right) & \geqslant \left[{{\mathcal{Q}}_{n} \left(\theta \right) - {\mathcal{Q}}_{n} \left({\theta_{0}} \right)} \right] + n\mathop \sum \limits_{l = 1}^{{p_{1}}} \left[{p_{{\lambda_{2}}} \left({\left| {\beta_{l}} \right|} \right) - p_{{\lambda_{2}}} \left({\left| {\beta_{0l}} \right|} \right)} \right] \\ & \quad + n\mathop \sum \limits_{k = 1}^{{q_{1}}} \left[{p_{{\lambda_{1}}} \left({\parallel \gamma_{k} \parallel_{H}} \right) - p_{{\lambda_{1}}} \left({\parallel \gamma_{0k} \parallel_{H}} \right)} \right] \\ & \triangleq I_{1} + I_{2} + I_{3}. \\ \end{aligned}$$

By Taylor expansion and assumption (A4), we have

$$\begin{aligned} I_{2} & = n\mathop \sum \limits_{l = 1}^{{p_{1}}} \left[{\delta p_{{\lambda_{2}}}^{\prime} \left({\left| {\beta_{0l}} \right|} \right)\mathrm{sgn}\left({\beta_{0l}} \right)\left| {D_{1l}} \right| + \delta^{2} p_{{\lambda_{2}}}^{\prime \prime} \left({\left| {\beta_{0l}} \right|} \right)|D_{1l} |^{2} \left\{{1 + o\left(1 \right)} \right\}} \right] \\ & \leqslant \sqrt {p_{1}}\, a_{n} \parallel D\parallel O\left({n^{1/2}} \right) + b_{n} \parallel D\parallel^{2} O\left(1 \right) = \sqrt {p_{1}} \parallel D\parallel O\left({n^{- 1/2}} \right) + \parallel D\parallel^{2} o\left(1 \right). \\ \end{aligned}$$

Invoking the proof of Theorem 2 in [8],

$${\mathcal{Q}}_{n} \left( \theta \right) - {\mathcal{Q}}_{n} \left( {\theta_{0} } \right) = D^{T} \dot{g}_{N}^{T} \left( {\theta_{0} } \right)\varOmega_{n}^{ - 1} \left( {\theta_{0} } \right)\dot{g}_{N} \left( {\theta_{0} } \right)D + \parallel D\parallel^{2} o_{p} \left( 1 \right) + \parallel D\parallel O_{p} \left( 1 \right).$$

By choosing a sufficiently large \(C\), \(I_{1}\) dominates \(I_{2}\); similarly, \(I_{1}\) dominates \(I_{3}\). Thus (9.1) holds, i.e., with probability at least \(1 - \varepsilon\) there exists a local minimizer \(\hat{\theta }\) satisfying \(\parallel \hat{\theta } - \theta_{0} \parallel = O_{p} \left( \delta \right)\). Therefore \(\parallel \hat{\gamma } - \gamma_{0} \parallel = O_{p} \left( {n^{ - 1/2} } \right)\) and \(\parallel \hat{\beta } - \beta_{0} \parallel = O_{p} \left( {n^{ - 1/2} } \right)\). Following the proof of Theorem 1 of [8], we have

$$\parallel \hat{\alpha }_{k} \left( u \right) - \alpha_{0k} \left( u \right)\parallel^{2} = O_{p} \left( {n^{{ - \frac{2r}{2r + 1}}} } \right).$$

Thus, we complete the proof of Theorem 1.

Proof of Theorem 2

According to Theorem 1, in order to prove the first part of Theorem 2, we only need to show that, for any \(\gamma\) satisfying \(\parallel \gamma - \gamma_{0} \parallel = O_{p} \left( {n^{ - 1/2} } \right)\) and any \(\beta_{l}\) satisfying \(\parallel \beta_{l} - \beta_{0l} \parallel = O_{p} \left( {n^{ - 1/2} } \right),l = 1, \ldots ,p_{1}\), there exists \(\epsilon = Cn^{- 1/2}\) such that, as \(n \to \infty\), with probability tending to 1,

$$\frac{{\partial {\mathcal{Q}}_{n}^{p} \left({\beta,\gamma} \right)}}{{\partial \beta_{l}}} > 0\quad \text{for}\quad 0 < \beta_{l} < \epsilon,\ l = p_{1} + 1, \ldots,p,$$
(9.2)

and

$$\frac{{\partial {\mathcal{Q}}_{n}^{p} \left({\beta,\gamma} \right)}}{{\partial \beta_{l}}} < 0\quad \text{for}\quad - \epsilon < \beta_{l} < 0,\ l = p_{1} + 1, \ldots,p.$$
(9.3)

These imply that the PQIF \({\mathcal{Q}}_{n}^{p} \left( {\beta ,\gamma } \right)\) reaches its minimum at \(\beta_{l} = 0,l = p_{1} + 1, \ldots ,p\).

Following Lemmas 3 and 4 of [18], we have

$$\begin{aligned} \frac{{\partial {\mathcal{Q}}_{n}^{p} \left( {\beta ,\gamma } \right)}}{{\partial \beta_{l} }} & = \frac{{\partial g_{n}^{T} \left( {\beta ,\gamma } \right)}}{{\partial \beta_{l} }}\varOmega_{n}^{ - 1} \left( {\beta ,\gamma } \right)g_{n} \left( {\beta ,\gamma } \right) + O_{p} \left( 1 \right) + np_{\lambda }^{\prime } \left( {\left| {\beta_{l} } \right|} \right)\mathrm{sgn} \left( {\beta_{l} } \right) \\ & = - 2\mathop \sum \limits_{i = 1}^{n} \left( {\begin{array}{*{20}l} {\dot{\mu }_{i}^{T} A_{i}^{ - 1/2} M_{1} A_{i}^{ - 1/2} \frac{{\partial \mu_{i} }}{{\partial \beta_{l} }}} \hfill \\ \vdots \hfill \\ {\dot{\mu }_{i}^{T} A_{i}^{ - 1/2} M_{s} A_{i}^{ - 1/2} \frac{{\partial \mu_{i} }}{{\partial \beta_{l} }}} \hfill \\ \end{array} } \right)^{T} \varOmega_{n}^{ - 1} \left( {\beta ,\gamma } \right)g_{n} \left( {\beta ,\gamma } \right) + np_{\lambda }^{\prime } \left( {\left| {\beta_{l} } \right|} \right)\mathrm{sgn} \left( {\beta_{l} } \right) + O_{p} \left( 1 \right) \\ & = n^{1/2} \left[ {n^{1/2} \lambda \left\{ {\lambda^{ - 1} p_{\lambda }^{\prime } \left( {\left| {\beta_{l} } \right|} \right) \mathrm{sgn} \left( {\beta_{l} } \right)} \right\} + O_{p} \left( 1 \right)} \right]. \\ \end{aligned}$$
(9.4)
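For reference, (3.7) is the derivative of the SCAD penalty. Assuming the standard SCAD form of Fan and Li [4] with the usual constant \(a = 3.7\), it reads

$$p_{\lambda }^{\prime } \left( t \right) = \lambda \left\{ {I\left( {t \le \lambda } \right) + \frac{{\left( {a\lambda - t} \right)_{ + } }}{{\left( {a - 1} \right)\lambda }}I\left( {t > \lambda } \right)} \right\},\quad t > 0,$$

so that \(p_{\lambda }^{\prime } \left( t \right) = \lambda\) whenever \(0 < t \le \lambda\), which is what drives the limit used in the next step.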

According to (3.7), the expression for the derivative of the SCAD penalty, it is easy to see that \({ \lim }_{n \to \infty } { \liminf }_{{\beta_{l} \to 0}} \lambda^{ - 1} p'_{\lambda } \left( {\left| {\beta_{l} } \right|} \right) = 1\). Together with Assumption (A10), \(\lambda n^{1/2} > \lambda_{ \hbox{min} } n^{1/2} \to \infty\), it is clear that the sign of (9.4) is determined by the sign of \(\beta_{l}\). This implies that (9.2) and (9.3) hold. Thus we complete the proof of Theorem 2.

Proof of Theorem 3

Let \(\theta^{ *} = (\beta^{ *T} ,\gamma^{ *T} )^{T}\), and let \(P_{i}^{ *} = (X_{i}^{ *T} ,W_{i}^{ *T} )^{T} , i = 1, \ldots ,n\), denote the covariates corresponding to \(\theta^{ *}\). Let \({\dot{\mathcal{Q}}}_{1n} \left( {\beta ,\gamma } \right)\) and \({\dot{\mathcal{Q}}}_{2n} \left( {\beta ,\gamma } \right)\) denote the first derivatives of the PQIF \({\mathcal{Q}}_{n}^{p}\) with respect to \(\beta\) and \(\gamma\), respectively; that is,

$${\dot{\mathcal{Q}}}_{1n} \left( {\beta ,\gamma } \right) = \frac{{\partial {\mathcal{Q}}_{n}^{p} \left( {\beta ,\gamma } \right)}}{\partial \beta },{\dot{\mathcal{Q}}}_{2n} \left( {\beta ,\gamma } \right) = \frac{{\partial {\mathcal{Q}}_{n}^{p} \left( {\beta ,\gamma } \right)}}{\partial \gamma }.$$

By Theorems 1 and 2, \((\hat{\beta }^{ *T} ,{\mathbf{0}}^{T} )^{T}\) and \(\hat{\gamma }^{ *T}\) satisfy

$${\dot{\mathcal{Q}}}_{1n} (((\hat{\beta }^{ *T} ,{\mathbf{0}}^{T} )^{T} ,\hat{\gamma }^{ *T} )^{T} ) = {\mathbf{0}}^{T} \quad \text{and} \quad {\dot{\mathcal{Q}}}_{2n} (((\hat{\beta }^{ *T} ,{\mathbf{0}}^{T} )^{T} ,\hat{\gamma }^{ *T} )^{T} ) = {\mathbf{0}}^{T} .$$

By Taylor expansion, we have

$$\begin{aligned} {\mathcal{Q}}_{1n} |_{{((\hat{\beta }^{ *T} ,{\mathbf{0}}^{T} )^{T} ,\hat{\gamma }^{ *T} )^{T} }} & = {\mathcal{Q}}_{1n} |_{{((\beta_{0}^{ *T} ,{\mathbf{0}}^{T} )^{T} ,\gamma_{0}^{ *T} )^{T} }} + \frac{{\partial {\mathcal{Q}}_{1n} }}{\partial \beta }\Big|_{{\theta = \tilde{\theta }}} \left\{ {(\hat{\beta }^{ *T} ,{\mathbf{0}}^{T} )^{T} - (\beta_{0}^{ *T} ,{\mathbf{0}}^{T} )^{T} } \right\} \\ & \quad + \frac{{\partial {\mathcal{Q}}_{1n} }}{\partial \gamma }\Big|_{{\theta = \tilde{\theta }}} \left\{ {\hat{\gamma }^{ *} - \gamma_{0}^{ *} } \right\} + \mathop \sum \limits_{l = 1}^{{p_{1} }} np_{\lambda }^{\prime } \left( {\left| {\hat{\beta }_{l} } \right|} \right)\mathrm{sgn} \left( {\hat{\beta }_{l} } \right), \\ \end{aligned}$$

where \(\tilde{\theta }\) lies between \(((\beta_{0}^{ *T} ,{\mathbf{0}}^{T} )^{T} ,\gamma_{0}^{ *T} )^{T}\) and \(((\hat{\beta }^{ *T} ,{\mathbf{0}}^{T} )^{T} ,\hat{\gamma }^{ *T} )^{T}\). Applying a Taylor expansion to \(p_{\lambda }^{\prime } \left( {\left| {\hat{\beta }_{l} } \right|} \right)\), we obtain

$$p_{\lambda }^{\prime } \left( {\left| {\hat{\beta }_{l} } \right|} \right) = p_{\lambda }^{\prime } \left( {\left| {\beta_{0l} } \right|} \right) + \left\{ {p_{\lambda }^{\prime \prime } \left( {\left| {\beta_{0l} } \right|} \right) + o_{p} \left( 1 \right)} \right\}\left( {\hat{\beta }_{l} - \beta_{0l} } \right).$$

By assumption (A10), \(p''_{\lambda } \left( {\left| {\beta_{0l} } \right|} \right) = o_{p} \left( 1 \right)\). Note that \(p'_{\lambda } \left( {\left| {\beta_{0l} } \right|} \right) = 0\) as \(\lambda_{ \hbox{max} } \to 0\); therefore, by Lemma 4 of [18] and some calculation, we have

$$\begin{aligned} \frac{1}{n}{\mathcal{Q}}_{1n} |_{{((\beta_{0}^{ *T} ,{\mathbf{0}}^{T} )^{T} ,\gamma_{0}^{ *T} )^{T} }} & = - \frac{2}{{n^{2} }}\mathop \sum \limits_{i = 1}^{n} \mathop \sum \limits_{j = 1}^{n} \mathop \sum \limits_{k = 1}^{s} \mathop \sum \limits_{l = 1}^{s} \left\{ {X_{i}^{ *T} H^{\prime} \left( {\eta_{i} } \right)A_{i}^{ - 1/2} M_{k} A_{i}^{ - 1/2} H^{\prime} \left( {\eta_{i} } \right)P_{i}^{ *} } \right. \\ & \quad \left. { \cdot\, \varOmega_{kl}^{ - 1} P_{j}^{ *T} H^{\prime} \left( {\eta_{j} } \right)A_{j}^{ - 1/2} M_{l} A_{j}^{ - 1/2} \left( {Y_{j} - \mu_{0j} } \right)} \right\} + o_{p} \left( {n^{ - 1/2} } \right) \\ & = - \frac{2}{{n^{2} }}\mathop \sum \limits_{i = 1}^{n} \mathop \sum \limits_{j = 1}^{n} X_{i}^{ *T} H^{\prime} \left( {\eta_{i} } \right)\tau_{ij} \left( {Y_{j} - \mu_{0j} } \right) + o_{p} \left( {n^{ - 1/2} } \right) \\ & = - \frac{2}{{n^{2} }}\mathop \sum \limits_{i = 1}^{n} \mathop \sum \limits_{j = 1}^{n} \tilde{X}_{i}^{T} \tau_{ij} \left( {\tilde{R}\left( {U_{j} } \right) + \varepsilon_{j} } \right) + o_{p} \left( {n^{ - 1/2} } \right), \\ \end{aligned}$$

where \(\tilde{X}_{i} = H^{\prime} \left( {\eta_{i} } \right)X_{i}^{ *} ,\tilde{R}\left( {U_{i} } \right) = H^{\prime} \left( {\eta_{i} } \right)R\left( {U_{i} } \right),\varOmega_{kl}^{ - 1}\) is the \(\left( {l,k} \right)\) block of \(\varOmega^{ - 1}\) and

$$\tau_{ij} = \mathop \sum \limits_{k = 1}^{s} \mathop \sum \limits_{l = 1}^{s} A_{i}^{ - 1/2} M_{k} A_{i}^{ - 1/2} H^{\prime} \left( {\eta_{i} } \right)P_{i}^{ *} \varOmega_{kl}^{ - 1} P_{j}^{ *T} H^{\prime} \left( {\eta_{j} } \right)A_{j}^{ - 1/2} M_{l} A_{j}^{ - 1/2} .$$

Similarly, we have

$$\begin{aligned} & \frac{1}{n}\frac{{\partial {\mathcal{Q}}_{1n} }}{\partial \beta }|_{{\theta = \tilde{\theta }}} = - \frac{2}{{n^{2} }}\mathop \sum \limits_{i = 1}^{n} \mathop \sum \limits_{j = 1}^{n} \tilde{X}_{i}^{T} \tau_{ij} \tilde{X}_{j} + o_{p} \left( {n^{ - 1/2} } \right), \\ & \quad \frac{1}{n}\frac{{\partial {\mathcal{Q}}_{1n} }}{\partial \gamma }|_{{\theta = \tilde{\theta }}} = - \frac{2}{{n^{2} }}\mathop \sum \limits_{i = 1}^{n} \mathop \sum \limits_{j = 1}^{n} \tilde{X}_{i}^{T} \tau_{ij} \tilde{W}\left( {U_{j} } \right) + o_{p} \left( {n^{ - 1/2} } \right), \\ \end{aligned}$$

where \(\tilde{W}\left( {U_{j} } \right) = H^{\prime} \left( {\eta_{j} } \right)W\left( {U_{j} } \right),W\left( {U_{j} } \right) = (W_{j1} , \ldots ,W_{jm} )^{T} ,W_{ij} = B\left( {U_{ij} } \right)\). Hence

$$\begin{aligned} \frac{1}{n}{\mathcal{Q}}_{1n} |_{{((\beta_{0}^{ *T} ,{\mathbf{0}}^{T} )^{T} ,\gamma_{0}^{ *T} )^{T} }} & = - \frac{2}{{n^{2} }}\mathop \sum \limits_{i = 1}^{n} \mathop \sum \limits_{j = 1}^{n} \tilde{X}_{i}^{T} \tau_{ij} \left\{ {\tilde{X}_{j} \left( {\beta_{0}^{ *} - \hat{\beta }^{ *} } \right) + \tilde{W}\left( {U_{j} } \right)\left( {\gamma_{0}^{ *} - \hat{\gamma }^{ *} } \right) + \tilde{R}\left( {U_{j} } \right) + \varepsilon_{j} } \right\} + o_{p} \left( {\hat{\beta }^{ *} - \beta_{0}^{ *} } \right), \\ \frac{1}{n}{\mathcal{Q}}_{2n} |_{{((\beta_{0}^{ *T} ,{\mathbf{0}}^{T} )^{T} ,\gamma_{0}^{ *T} )^{T} }} & = - \frac{2}{{n^{2} }}\mathop \sum \limits_{i = 1}^{n} \mathop \sum \limits_{j = 1}^{n} \tilde{W}(U_{i} )^{T} \tau_{ij} \left\{ {\tilde{X}_{j} \left( {\beta_{0}^{ *} - \hat{\beta }^{ *} } \right) + \tilde{W}\left( {U_{j} } \right)\left( {\gamma_{0}^{ *} - \hat{\gamma }^{ *} } \right) + \tilde{R}\left( {U_{j} } \right) + \varepsilon_{j} } \right\} + o_{p} \left( {\hat{\gamma }^{ *} - \gamma_{0}^{ *} } \right). \\ \end{aligned}$$

Following the proof of Theorem 2 in [18], we obtain (4.3). Thus we complete the proof of Theorem 3.

About this article

Cite this article

Zhang, J., Xue, L. Variable selection for generalized partially linear models with longitudinal data. Evol. Intel. 15, 2473–2483 (2022). https://doi.org/10.1007/s12065-020-00521-6

