Abstract
Carbohydrate counting for consumed foods is recommended by scientific societies as a way to improve the quality of life of diabetes patients. Monitoring food intake can be facilitated by a mobile application that automatically recognizes the foods in a meal. Automatic food-image recognition is a challenging computer vision task due to the visual similarity between foods. The challenge grows when the goal is to classify foods from a specific region using a dataset that contains only foods from that region and is therefore small compared to public datasets from other countries. For this task, this work presents a model that uses a set of Fully Convolutional Networks (FCNs) to generate segmentations of the foods in a meal; these segmentations are then processed by an algorithm based on digital image processing techniques to identify the foods. The model has a low training cost because it is scalable: it can be trained to recognize a new food without retraining the entire model. In tests with foods consumed in Brazil, the model achieved an accuracy of 98% and a recall of 88%.
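The scalability claim above rests on the ensemble design: each food gets its own segmentation model, so adding a food means training only one new network. The following is a minimal sketch of that idea, not the authors' implementation; the threshold "segmenters" are hypothetical stand-ins for the trained binary FCNs, and `min_area` is an assumed heuristic for deciding whether a food is present in the mask.

```python
import numpy as np

def make_threshold_segmenter(lo, hi):
    """Stand-in for a trained binary FCN: marks pixels in an intensity band."""
    def segment(image):
        return ((image >= lo) & (image < hi)).astype(np.uint8)
    return segment

class ScalableFoodRecognizer:
    """One segmentation model per food; adding a food never retrains the rest."""
    def __init__(self):
        self.segmenters = {}

    def add_food(self, name, segmenter):
        # Only this new per-food model is trained/plugged in.
        self.segmenters[name] = segmenter

    def identify(self, image, min_area=10):
        """Run every per-food segmenter; keep foods whose mask is large enough."""
        found = {}
        for name, seg in self.segmenters.items():
            mask = seg(image)
            if mask.sum() >= min_area:
                found[name] = mask
        return found

# Usage: a toy grayscale "meal" with two intensity regions.
meal = np.zeros((32, 32), dtype=np.uint8)
meal[:16, :] = 50    # region the "rice" segmenter responds to
meal[16:, :] = 200   # region the "beans" segmenter responds to

model = ScalableFoodRecognizer()
model.add_food("rice", make_threshold_segmenter(40, 60))
model.add_food("beans", make_threshold_segmenter(180, 220))
print(sorted(model.identify(meal)))  # ['beans', 'rice']
```

In the paper's setting each `segment` call would be an FCN forward pass, and the presence test would be replaced by the digital image processing step that refines and labels the masks.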
Data availability
The datasets generated by the research are available in a data repository and can be accessed at https://doi.org/10.17632/7n36jtcpv3.1.
Acknowledgements
This work was supported by CAPES, CNPq, and FAPEMIG.
Cite this article
Carvalho, M.A., Pimenta, T.C., Silvério, A.C.P. et al. Computer vision model for food identification in meals from the segmentation obtained by a set of fully convolutional networks. J Ambient Intell Human Comput 14, 16879–16890 (2023). https://doi.org/10.1007/s12652-023-04703-9