Adaptive feature selection framework for DNA methylation-based age prediction

  • Data analytics and machine learning
Soft Computing

Abstract

The aging process is one of the main unsolved problems of modern biology. It affects almost all living species and results from multiple interactions between genetic and environmental factors. Numerous studies have shown that DNA methylation (DNAm) changes are among the most stable biomarkers for predicting biological age, which has a complex relationship with chronological age. This makes the selection of age-related CpG sites important. Most feature selection methods proposed in this field are problem-dependent techniques for finding important age-related CpG sites. In this study, by contrast, we propose a general-purpose, problem-independent framework. This adaptive framework finds the best sequence of feature selection methods, and the number of features selected at each step, according to the dataset used. To evaluate the framework, we used two groups of DNA methylation datasets from healthy samples, one from blood tissue and one from non-blood tissues. We compared its results with four studies in terms of mean absolute deviation (MAD) and correlation (R²), separately on the blood and non-blood datasets. Our framework achieved a MAD of 3.9 years and 5.33 years on the blood and non-blood test datasets, respectively, and a correlation (R²) of 95.24% and 91.92% between chronological age and DNAm age on the blood and non-blood test datasets, respectively. The experimental results show that the proposed framework adaptively finds a feature selection method appropriate to the data, with acceptable performance compared with other studies.
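The idea of searching over sequences of feature selection methods and per-step feature counts can be illustrated with a minimal sketch. This is not the authors' implementation: the two filters (`corr_score`, `variance_score`), the toy predictor, the exhaustive search, and the synthetic data are all assumptions made for illustration, and R² is assumed here to mean the squared Pearson correlation.

```python
from itertools import permutations, product
from statistics import mean


def mad(y_true, y_pred):
    """Mean absolute deviation between true and predicted ages (years)."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def r_squared(x, y):
    """Assumed definition of R²: squared Pearson correlation."""
    return pearson(x, y) ** 2


def column(X, j):
    return [row[j] for row in X]


# Hypothetical filter scores standing in for real feature selection methods.
def corr_score(X, y, j):
    return abs(pearson(column(X, j), y))


def variance_score(X, y, j):
    col = column(X, j)
    m = mean(col)
    return mean((v - m) ** 2 for v in col)


FILTERS = {"corr": corr_score, "var": variance_score}


def apply_sequence(X, y, features, steps):
    """Apply (filter_name, k) steps in order, keeping the top-k features each time."""
    for name, k in steps:
        score = FILTERS[name]
        features = sorted(features, key=lambda j: score(X, y, j), reverse=True)[:k]
    return list(features)


def fit_predict(X_train, y_train, X_test, features):
    """Toy predictor: average of one-feature least-squares fits."""
    models = []
    for j in features:
        col = column(X_train, j)
        mx, my = mean(col), mean(y_train)
        sxx = sum((v - mx) ** 2 for v in col)
        sxy = sum((v - mx) * (t - my) for v, t in zip(col, y_train))
        slope = sxy / sxx if sxx else 0.0
        models.append((j, slope, my - slope * mx))
    return [mean(b + a * row[j] for j, a, b in models) for row in X_test]


def search(X_train, y_train, X_val, y_val, counts=(3, 2, 1)):
    """Try every filter order and non-increasing count schedule; keep the lowest validation MAD."""
    best = None
    n = len(X_train[0])
    for r in range(1, len(FILTERS) + 1):
        for order in permutations(FILTERS, r):
            for ks in product(counts, repeat=r):
                if any(a < b for a, b in zip(ks, ks[1:])):
                    continue  # a later step cannot keep more features than an earlier one
                steps = list(zip(order, ks))
                feats = apply_sequence(X_train, y_train, range(n), steps)
                err = mad(y_val, fit_predict(X_train, y_train, X_val, feats))
                if best is None or err < best[0]:
                    best = (err, steps, feats)
    return best


def make_row(age, i):
    """Synthetic sample: linear signal, unrelated noise, a constant, a nonlinear signal."""
    a = age / 100
    return [a, float(i * 7 % 5), 0.5, a * a]


train_ages = [20, 30, 40, 50, 60, 70]
val_ages = [25, 45, 65]
X_train = [make_row(a, i) for i, a in enumerate(train_ages)]
X_val = [make_row(a, i + 1) for i, a in enumerate(val_ages)]

best_err, best_steps, best_feats = search(X_train, train_ages, X_val, val_ages)
print(best_err, best_steps, best_feats)
```

Exhaustive enumeration is only feasible here because the toy search space is tiny; with real methylation arrays and more candidate methods, some heuristic search would presumably be needed.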

Data availability

All datasets used in this work are free and publicly available.

Funding

The authors have not disclosed any funding.

Author information

Corresponding author

Correspondence to Mohammad Saniee Abadeh.

Ethics declarations

Conflict of interest

There is no conflict of interest regarding this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 20 kb)

About this article

Cite this article

Momeni, Z., Saniee Abadeh, M. Adaptive feature selection framework for DNA methylation-based age prediction. Soft Comput 26, 3777–3788 (2022). https://doi.org/10.1007/s00500-022-06844-z
