Predicting body fat percentage from anthropometric and laboratory measurements using artificial neural networks

doi:10.1016/j.asoc.2017.05.063

Applied Soft Computing

Volume 67, June 2018, Pages 834-839

https://doi.org/10.1016/j.asoc.2017.05.063

Highlights

• Body fat percentage is predicted from easily measurable data to quantify obesity risk.
• Linear regression, neural networks and support vector machines are used.
• Models built on empirical data from a representative US health survey (n = 862).
• Optimal parameters are chosen and bootstrap validation is used.
• Linear regression is slightly outperformed by support vector machines, but not neural networks.

Abstract

Purpose of the research

Obesity is a major public health problem with rapidly growing prevalence and serious associated health risks. Because obesity is characterized by excess body fat, measuring it accurately is a non-trivial problem. Widely used indicators, such as the body mass index, often predict actual risk poorly, but the direct measurement of body fat mass is complicated. The aim of the present research is to investigate how well body fat percentage can be predicted from easily measurable data: age, gender, weight, height, waist circumference and different laboratory results. To that end, linear regression, feedforward neural networks and support vector machines are applied to the data of adult males from a representative US health survey (n = 862). Optimal parameters are chosen and bootstrap validation is used to get realistic error estimates.

Results

None of the methods predicted body fat percentage well, but support vector machines slightly outperformed feedforward neural networks and linear regression (root mean square error 0.0988 ± 0.00288, 0.108 ± 0.00928 and 0.107 ± 0.012, respectively).

Conclusion

Even the best-performing soft computing method had an R² of only 44%, and its slight advantage over linear regression is balanced by the fact that regression models are clinically interpretable.

Introduction

Obesity [1] is widely considered to be one of the most important current public health problems due to its continuously increasing prevalence in the developed world (affecting both adults [2], [3] and children [4], [5]) on the one hand, and the seriousness of the health risks it gives rise to on the other hand. Increased risk of a number of diseases has been causally linked to obesity, including type 2 diabetes mellitus, hypertension, ischaemic heart disease, stroke, infertility, osteoarthritis, liver and gallbladder disease and certain tumors [6]. Not surprisingly, obesity also increases all-cause mortality [7] and poses a significant economic burden as well [8], [9].

Both screening for the disease and accurately tracking its severity in those already affected underline the importance of measuring obesity precisely. This is, however, not a trivial question: the definition of obesity (“condition of excess body fat” [1, p. 3]) does not directly give rise to any quantitative metric. Weight is a straightforward proxy for body fat and is easy to measure, but it is almost meaningless without information on the overall stature of the person. Usually height is used for that purpose, leading to indicators such as body mass index (BMI) [10], which is so widely used that even the definition of obesity is sometimes linked to it, and which is endorsed by the World Health Organization [11].

It is, however, well known that these indicators, even though they take stature into account, often perform poorly [12] in predicting health outcomes because, among other reasons, they do not measure body fat itself, much less its distribution (which is also known to be prognostic: visceral fat, i.e. abdominal obesity, is especially associated with negative outcomes [13]). Measures such as waist circumference or waist-to-hip ratio try to correct for this aspect [14].

A much better approach would be the direct measurement of body fat mass itself, or of body fat percentage (BFP), i.e. body fat mass divided by body weight, but this is hindered by the fact that such measurement is difficult and unfit for wider use. (Precise methods include dual energy X-ray absorptiometry (DXA), bioelectrical impedance analysis (BIA) and air displacement plethysmography [15].)
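The two quantities contrasted so far can be written down in a few lines. The following is a minimal sketch, not taken from the paper: BMI is computed with its standard definition (weight in kilograms divided by the square of height in metres), and BFP follows the definition given above. The function names and the two example people are hypothetical and serve only to show that identical BMI values can hide very different body fat percentages.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def body_fat_percentage(fat_mass_kg: float, weight_kg: float) -> float:
    """BFP as a fraction: body fat mass divided by body weight."""
    if weight_kg <= 0 or not 0 <= fat_mass_kg <= weight_kg:
        raise ValueError("need weight > 0 and 0 <= fat mass <= weight")
    return fat_mass_kg / weight_kg


# Two hypothetical people (made-up values) with the same weight and height,
# hence the same BMI, but with different amounts of body fat.
lower_fat = {"weight_kg": 85.0, "height_m": 1.80, "fat_mass_kg": 12.75}
higher_fat = {"weight_kg": 85.0, "height_m": 1.80, "fat_mass_kg": 25.5}

for person in (lower_fat, higher_fat):
    bmi = body_mass_index(person["weight_kg"], person["height_m"])
    bfp = body_fat_percentage(person["fat_mass_kg"], person["weight_kg"])
    print(f"BMI = {bmi:.1f} kg/m^2, BFP = {bfp:.0%}")
```

Both people get the same BMI, which is the limitation of stature-adjusted weight indicators described above.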

It would therefore be valuable if BFP could be predicted from easily measurable parameters such as basic sociodemographic data (age, gender), basic anthropometric data (weight, height, waist circumference) and basic laboratory parameters obtained from routine blood draws. The rationale for this last component is that obesity is associated with a state of systemic inflammation [16] and has been shown to be associated with changes in clinical chemistry parameters [17]. It is therefore intuitively appealing to include these parameters too.

The aim of the present research is to investigate how well BFP can be predicted from these parameters. That is, clinical prediction models [18] were built from these variables and validated using an empirical database (where BFP was measured with BIA as the gold standard).

The aim was not to analyze the relationships between the predictor variables and the response in order to provide new clinical knowledge, but rather to build a purely predictive model. This allowed us to use modern tools of soft computing (machine learning), which are black boxes in that sense but might provide better predictions. Ordinary multi-layer feedforward neural networks [19] and support vector machines [20] were chosen as examples. For comparison, linear regression was used, illustrating the more traditional biostatistical approach.
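To make the idea of bootstrap-validating a predictive model concrete, here is a minimal sketch. It rests on several assumptions that are not from the paper: the excerpt does not say which bootstrap variant was used, so the sketch uses out-of-bag evaluation (fit on a resample, test on the rows that were not drawn); it fits a one-predictor linear regression only; and the data are synthetic stand-ins with made-up coefficients, not NHANES data. All function names are hypothetical.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def line_rmse(model, xs, ys):
    """Root mean square error of a fitted line on the given points."""
    intercept, slope = model
    squared = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(statistics.fmean(squared))


def bootstrap_rmse(xs, ys, n_replicates=200, seed=0):
    """Mean and SD of the out-of-bag RMSE across bootstrap replicates."""
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(n_replicates):
        # Draw n row indices with replacement: the bootstrap training set.
        train = [rng.randrange(n) for _ in range(n)]
        chosen = set(train)
        # Rows that were never drawn are "out of bag" and form the test set.
        test = [i for i in range(n) if i not in chosen]
        if not test or len(chosen) < 2:
            continue  # degenerate resample, skip it
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        scores.append(line_rmse(model, [xs[i] for i in test], [ys[i] for i in test]))
    return statistics.fmean(scores), statistics.stdev(scores)


# Synthetic stand-in data (NOT the survey data): waist circumference in cm
# and a noisy linear body fat fraction. Every number here is made up.
data_rng = random.Random(42)
waist = [data_rng.uniform(70.0, 130.0) for _ in range(300)]
bfp = [0.05 + 0.002 * w + data_rng.gauss(0.0, 0.05) for w in waist]

mean_rmse, sd_rmse = bootstrap_rmse(waist, bfp)
print(f"out-of-bag RMSE: {mean_rmse:.3f} ± {sd_rmse:.3f}")
```

Reporting the mean and the spread of the error over replicates mirrors the "RMSE ± value" format used in the abstract; the same loop could wrap any other model in place of `fit_line`.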

Database

The National Health and Nutrition Examination Survey (NHANES) is now a continuous American public health program, with results published in two-year cycles [21]. It is a nation-wide survey designed to be representative of the whole civilian non-institutionalized US population, employing a complex, stratified multi-stage probability sampling plan. The amount of collected data is tremendous (although sometimes varying from cycle to cycle), including demographic data, physical examination, collection

Results

The results of the linear regression for the whole database are shown in Fig. 2.

This illustrates that this model is interpretable, i.e. a clinical meaning can be associated with its results. (But note that it is the model for the whole dataset, without validation.)

Results of the parameter search, and the fitted vs. actual plot of the best model can be seen in Fig. 3 for the ordinary feedforward neural network and in Fig. 4 for the support vector machine. As the figures show, the optimal

Discussion

Results obtained with modern soft computing techniques were not convincingly better than linear regression.

Only SVM was able to clearly outperform simple regression, but its advantage lay in the stability of the results across bootstrap replicates rather than in a substantially improved average value (RMSE: 0.0988 ± 0.00288 vs. 0.107 ± 0.012). Feedforward neural networks had an average performance nearly identical to regression with only minimally improved stability (RMSE: 0.108 ± 0.00928).
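The paper reports both RMSE and R² for its models. How R² was computed is not stated in this excerpt; the sketch below assumes the common definition, one minus the ratio of the residual sum of squares to the total sum of squares, under which R² equals 1 − RMSE² / Var(y) when both are evaluated on the same observations. The function names and the five toy values are hypothetical and are not results from the study.

```python
import math
import statistics


def root_mean_square_error(actual, predicted):
    """Square root of the mean squared difference between actual and predicted."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(statistics.fmean(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_actual = statistics.fmean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy body fat fractions (made-up numbers, not study data).
actual = [0.18, 0.22, 0.25, 0.30, 0.35]
predicted = [0.20, 0.21, 0.27, 0.28, 0.32]

rmse_value = root_mean_square_error(actual, predicted)
r2_value = r_squared(actual, predicted)

# Under the assumed definition the two metrics are tied together by the
# population variance of the actual values.
r2_from_rmse = 1.0 - rmse_value ** 2 / statistics.pvariance(actual)

print(f"RMSE = {rmse_value:.4f}, R^2 = {r2_value:.4f}, via RMSE = {r2_from_rmse:.4f}")
```

Under this definition, the same RMSE can correspond to a high or a low R² depending on how much the outcome varies in the sample, which is why the two numbers are usually read together.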

Conclusions

The soft computing methods investigated in the present study were not able to substantially outperform simple linear regression. While the support vector machine did exhibit some advantage (though even it had an R² of 44%), this is overall balanced by the fact that regression models are clinically interpretable, i.e. “white-box”. While exploring the fitted models is possible for some modern machine learning methods, we aimed to focus on very well established, classical methods. Inclusion of other

Funding sources

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author declaration

There are no conflicts of interest associated with this publication and there has been no financial support for this work that could have influenced its outcome.

Acknowledgements

Tamás Ferenci was supported by the UNKP-16-4/III New National Excellence Program of the Ministry of Human Capacities.

References (43)

M. Ng et al. Global, regional, and national prevalence of overweight and obesity in children and adults during 1980–2013: a systematic analysis for the Global Burden of Disease Study 2013. Lancet (2014)
Y.C. Wang et al. Health and economic burden of the projected obesity trends in the USA and the UK. Lancet (2011)
A. Keys et al. Indices of relative weight and obesity. J. Chron. Dis. (1972)
C.M.Y. Lee et al. Indices of abdominal obesity are better discriminators of cardiovascular risk factors than BMI: a meta-analysis. J. Clin. Epidemiol. (2008)
P. Deurenberg et al. The assessment of obesity: methods for measuring body fat and global prevalence of obesity. Best Pract. Res. Clin. Endocrinol. Metab. (1999)
D. Gallagher et al. Healthy percentage body fat ranges: an approach for developing guidelines based on body mass index. Am. J. Clin. Nutr. (2000)
S. Meeuwsen et al. The relationship between BMI and percent body fat, measured by bioelectrical impedance, in a large adult sample is curvilinear and influenced by age and sex. Clin. Nutr. (2010)
A. Kupusinac et al. Predicting body fat percentage based on gender, age and BMI by using artificial neural networks. Comput. Methods Programs Biomed. (2014)
D. Bagchi et al. Obesity: Epidemiology, Pathophysiology, and Prevention (2012)
K.M. Flegal et al. Prevalence of obesity and trends in the distribution of body mass index among US adults, 1999–2010. JAMA (2012)
E.S. Ford et al. Epidemiology of obesity in the western hemisphere. J. Clin. Endocrinol. Metab. (2008)
C.L. Ogden et al. Prevalence of obesity and trends in body mass index among US children and adolescents, 1999–2010. JAMA (2012)
P. Kopelman. Health risks associated with overweight and obesity. Obes. Rev. (2007)
K.M. Flegal et al. Association of all-cause mortality with overweight and obesity using standard body mass index categories: a systematic review and meta-analysis. JAMA (2013)
D. Withrow et al. The economic burden of obesity worldwide: a systematic review of the direct costs of obesity. Obes. Rev. (2011)
World Health Organization. Technical report series 894: Obesity: Preventing and Managing the Global Epidemic (2000)
R. Huxley et al. Body mass index, waist circumference and waist:hip ratio as predictors of cardiovascular risk – a review of the literature. Eur. J. Clin. Nutr. (2010)
J.-P. Després et al. Abdominal obesity and the metabolic syndrome: contribution to global cardiometabolic risk. Arterioscler. Thromb. Vasc. Biol. (2008)
A. Fernández-Sánchez et al. Inflammation, oxidative stress, and obesity. Int. J. Mol. Sci. (2011)
T. Ferenci. Two applications of biostatistics in the analysis of pathophysiological processes (2013)
E. Steyerberg. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating (2008)

