
Solving genetic heterogeneity in extended families by identifying sub-types of complex diseases

Original Paper, published in Computational Statistics

Abstract

The study of the genetic properties of a disease requires collecting information on the subjects in a set of pedigrees. The main focus of such studies is the detection of susceptibility genes. However, even with large pedigrees, the heterogeneity of phenotypes in complex diseases such as schizophrenia, bipolar disorder and autism makes the detection of susceptibility genes difficult. This is mainly due to genetic heterogeneity: many genes are involved in the disease. To reduce this heterogeneity, we propose sub-typing the disease and partitioning the population into more homogeneous sub-groups. We developed a probabilistic model based on Latent Class Analysis (LCA) that takes into account the familial dependence inside a pedigree, even for large pedigrees. It also accommodates individuals with missing and partially missing measurements. Model parameters are estimated by an EM algorithm, and the computations for the E step inside a pedigree are carried out using a pedigree peeling algorithm. When more than one model is fitted, we use model selection strategies such as cross-validation and/or BIC to choose a suitable model among a set of candidates. Moreover, we present a simulation based on a genetic disease class model and show that our model leads to better individual classification than the model that assumes independence among subjects. We also apply our model to a schizophrenia-bipolar pedigree data set from Eastern Quebec.
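To make the estimation procedure concrete, the following is a minimal sketch of EM for a latent class model in which subjects are assumed independent, that is, the baseline the abstract compares against, not the pedigree model with peeling developed in the paper. Binary symptom indicators, the handling of missing items by skipping them in the likelihood, and all names (`e_step`, `fit_lca`, `bic`, the simulated data) are assumptions made for illustration only.

```python
import numpy as np

def e_step(Y0, obs, weights, probs):
    """Posterior class probabilities per subject and the total log-likelihood."""
    # log P(class k) + sum over observed items of log P(y_ij | class k)
    log_joint = (np.log(weights)
                 + (Y0 * obs) @ np.log(probs).T
                 + ((1.0 - Y0) * obs) @ np.log1p(-probs).T)
    top = log_joint.max(axis=1, keepdims=True)  # stabilise the log-sum-exp
    log_norm = top + np.log(np.exp(log_joint - top).sum(axis=1, keepdims=True))
    return np.exp(log_joint - log_norm), float(log_norm.sum())

def fit_lca(Y, n_classes, n_iter=200, seed=0, eps=1e-6):
    """EM for binary items and independent subjects; np.nan marks a missing item."""
    rng = np.random.default_rng(seed)
    obs = (~np.isnan(Y)).astype(float)   # 1 where the item was measured
    Y0 = np.nan_to_num(Y, nan=0.0)       # missing entries are masked out by obs
    weights = np.full(n_classes, 1.0 / n_classes)
    probs = rng.uniform(0.25, 0.75, size=(n_classes, Y.shape[1]))
    history = []
    for _ in range(n_iter):
        post, loglik = e_step(Y0, obs, weights, probs)
        history.append(loglik)
        # M step: posterior-weighted class frequencies and item probabilities
        weights = post.mean(axis=0)
        numer = post.T @ (Y0 * obs)
        denom = np.maximum(post.T @ obs, eps)
        probs = np.clip(numer / denom, eps, 1.0 - eps)
    post, loglik = e_step(Y0, obs, weights, probs)
    return {"weights": weights, "probs": probs, "posterior": post,
            "loglik": loglik, "history": history}

def bic(loglik, n_classes, n_items, n_subjects):
    """Schwarz criterion; lower is better."""
    n_params = (n_classes - 1) + n_classes * n_items
    return -2.0 * loglik + n_params * np.log(n_subjects)

# Simulated example (hypothetical data): two classes, six symptoms, 10% missing.
rng = np.random.default_rng(1)
true_class = rng.integers(0, 2, size=400)
item_probs = np.where(true_class[:, None] == 1, 0.9, 0.1)
Y = (rng.random((400, 6)) < item_probs).astype(float)
Y[rng.random(Y.shape) < 0.1] = np.nan

fits = {k: fit_lca(Y, k) for k in (1, 2)}
bics = {k: bic(fits[k]["loglik"], k, Y.shape[1], Y.shape[0]) for k in fits}
best_k = min(bics, key=bics.get)
assigned = fits[best_k]["posterior"].argmax(axis=1)  # most probable class per subject
```

Under the independence assumption the E step factorises over subjects, which is what keeps this sketch short. Once relatives' class memberships are allowed to depend on one another, the posterior no longer factorises this way; the abstract states that a pedigree peeling algorithm is used for the E step in that setting.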



Author information

Correspondence to Arafat Tayeb.

Electronic Supplementary Material

Below is the Electronic Supplementary Material.

ESM 1 (PDF 233 kb)

About this article

Cite this article

Tayeb, A., Labbe, A., Bureau, A. et al. Solving genetic heterogeneity in extended families by identifying sub-types of complex diseases. Comput Stat 26, 539–560 (2011). https://doi.org/10.1007/s00180-010-0224-2

  • DOI: https://doi.org/10.1007/s00180-010-0224-2
