
The multi-learning for food analyses in computer vision: a survey

Published in: Multimedia Tools and Applications

Abstract

With the rapid development of food production and health management, the analysis of food samples has become essential for preventing disease and understanding human culture. Food analysis is increasingly complex and is no longer limited to food categorization; it also covers many advanced tasks (e.g., nutrition estimation and recipe retrieval). Two points can be concluded from existing work. First, food features are far more comprehensive and sophisticated than those of general samples. Second, multiple learning strategies (MLSs) usually outperform general deep learning methods on food analysis. However, few survey papers report food analysis with MLSs, and the main factors that make them difficult to apply remain unclear. We therefore survey applications of MLSs to food analysis. Three common types of MLS, namely multi-task learning (MTL), multi-view learning (MVL) and multi-scale learning (MSL), are presented in terms of their guidance, typical works, algorithms and final aggregation methods. Additionally, we argue that food characteristics are closely related to the difficulty of food analysis, and we summarize these characteristics as nonrigid appearance, complex arrangement, and large intraclass (small interclass) variance. Experimental results for MLSs are also presented and analyzed, and based on these results we propose practical suggestions for implementing MLSs. Finally, promising directions for future MLS applications are discussed.
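The MTL idea surveyed above (a shared backbone feeding several task heads, trained with a weighted sum of per-task losses) can be sketched as follows. This is a toy illustration, not any surveyed paper's model: the linear "backbone", the food-category/calorie task pair, and the fixed loss weights `lam_c` and `lam_r` are all made-up assumptions (works such as Cipolla et al. learn such weights from per-task uncertainty instead).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a shared backbone: one linear layer with ReLU.
# In real food-analysis MTL this would be a CNN trunk shared by all tasks.
def shared_features(x, W):
    return np.maximum(0.0, x @ W)

# Two task heads: classification (food category) and regression (calories).
def category_logits(h, Wc):
    return h @ Wc

def calorie_estimate(h, Wr):
    return h @ Wr

def softmax_cross_entropy(logits, label):
    z = logits - logits.max()          # stabilize before exponentiating
    logp = z - np.log(np.exp(z).sum())
    return -logp[label]

# Multi-task loss: weighted sum of the per-task losses.
def mtl_loss(x, label, calories, W, Wc, Wr, lam_c=1.0, lam_r=0.1):
    h = shared_features(x, W)
    l_cls = softmax_cross_entropy(category_logits(h, Wc), label)
    l_reg = (calorie_estimate(h, Wr) - calories) ** 2
    return lam_c * l_cls + lam_r * float(l_reg)

# One toy sample: a 16-dim "image feature", category 2 of 5, 250 kcal.
x = rng.normal(size=16)
W = rng.normal(size=(16, 8))
Wc = rng.normal(size=(8, 5))
Wr = rng.normal(size=(8,))
print("combined MTL loss:", mtl_loss(x, label=2, calories=250.0, W=W, Wc=Wc, Wr=Wr))
```

Because both heads read the same `shared_features` output, gradients from the calorie task regularize the category task and vice versa, which is the mechanism the surveyed MTL works exploit.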


Data availability

The availability of the datasets analyzed in this survey is stated as follows.

1. Datasets that are openly available online

The dataset Food-101 is openly available at https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101 as cited in Bossard L. et al. (2014).

The dataset UEC-Food 256 is available at http://foodcam.mobi/ as cited in Kawano Y. et al. (2014).

The dataset VIREO Food-172 is available at http://vireo.cs.cityu.edu.hk/vireofood172/ as cited in Chen J. et al. (2016).

The dataset ChineseFoodNet is available at https://sites.google.com/view/chinesefoodnet/ as cited in Chen X. et al. (2017).

The dataset Recipe1M is available at http://im2recipe.csail.mit.edu as cited in Salvador A. et al. (2017).

The dataset ISIA Food-200 is available at http://123.57.42.89/Dataset_ict/WIKI Food/ISIA Food200_v2/ as cited in Min W. et al. (2019).

The dataset UNICT-FD889 is available at https://iplab.dmi.unict.it/UNICT-FD889/ as cited in Farinella G.M. et al. (2015).

The dataset MAFood-121 is available at http://www.ub.edu/cvub/mafood121/ as cited in Aguilar E. et al. (2019).

2. Datasets published or included in the following articles

The data Food-50 is published in Taichi J. et al. (2009).

The data Food-85 is included in Hoashi H. et al. (2010).

The dataset FOODD is available in Pouladzadeh P. et al. (2015).

The data presented as “A manual dataset” in this paper is published in Chen J.-j. et al. (2017).

The data TurkishFoods-15 is published in Güngör C. et al. (2017).

The data Drink101 is published in Park H. et al. (2019).

The data ETH Food-101 is included in Min W. et al. (2019) and Jiang S. et al. (2020).

References

  1. Aguilar E, Bolaños M, Radeva P (2019) Regularized uncertainty-based multi-task learning model for food analysis. J Vis Commun Image Represent 60:360–370. https://doi.org/10.1016/j.jvcir.2019.03.011


  2. AlZu’bi S, Hawashin B, Mujahed M, Jararweh Y, Gupta BB (2019) An efficient employment of internet of multimedia things in smart and future agriculture. Multimed Tools Appl 78(20):29581–29605. https://doi.org/10.1007/s11042-019-7367-0

  3. Anis S, Lai KW, Chuah JH, Ali SM, Mohafez H, Hadizadeh M, Yan D, Ong ZC (2020) An overview of deep learning approaches in chest radiograph. IEEE Access 8:182347–182354. https://doi.org/10.1109/ACCESS.2020.3028390


  4. Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. International conference on learning representations (ICLR). https://doi.org/10.48550/arXiv.1409.0473

  5. Bettadapura V, Thomaz E, Parnami A et al (2015) Leveraging context to support automated food recognition in restaurants. 2015 IEEE Winter Conference on Applications of Computer Vision 580–587. https://doi.org/10.1109/WACV.2015.83

  6. Bossard L, Guillaumin M, Van Gool L (2014) Food-101 – mining discriminative components with random forests. In: European Conference on Computer Vision (ECCV), Cham, pp 446–461

  7. Chen J, Ngo C-W (2016) Deep-based ingredient recognition for cooking recipe retrieval. Proceedings of the 24th ACM international conference on multimedia. 32-41. https://doi.org/10.1145/2964284.2964315

  8. Chen M, Dhingra K, Wu W et al (2009) PFID: Pittsburgh fast-food image dataset. 2009 16th IEEE international conference on image processing (ICIP). 289-292. https://doi.org/10.1109/ICIP.2009.5413511

  9. Chen J-J, Ngo C-W, Chua T-S (2017) Cross-modal recipe retrieval with rich food attributes. Proceedings of the 25th ACM international conference on multimedia. 1771-1779. https://doi.org/10.1145/3123266.3123428

  10. Chen X, Zhu Y, Zhou H et al (2017) ChineseFoodNet: a large-scale image dataset for Chinese food recognition. arXiv:1705.02743

  11. Chen L, Papandreou G, Kokkinos I et al (2018) DeepLab: semantic image segmentation with deep convolutional nets, Atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell 40(4):834–848. https://doi.org/10.1109/TPAMI.2017.2699184


  12. Chen Y, Bai Y, Zhang W et al (2019) Destruction and construction learning for fine-grained image recognition. 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR)

  13. Ciocca G, Napoletano P, Schettini R (2017) Food recognition: a new dataset, experiments, and results. IEEE J Biomed Health Inf 21(3):588–598. https://doi.org/10.1109/JBHI.2016.2636441


  14. Ciocca G, Micali G, Napoletano P (2020) State recognition of food images using deep features. IEEE Access 8:32003–32017. https://doi.org/10.1109/ACCESS.2020.2973704


  15. Cipolla R, Gal Y, Kendall A (2018) Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 7482–7491. https://doi.org/10.1109/CVPR.2018.00781

  16. Doersch C, Zisserman A (2017) Multi-task self-supervised visual learning. 2017 IEEE International Conference on Computer Vision (ICCV). 2070–2079. https://doi.org/10.1109/ICCV.2017.226

  17. Ege T, Yanai K (2017) Simultaneous estimation of food categories and calories with multi-task CNN. 2017 fifteenth IAPR international conference on machine vision applications (MVA). 198-201. https://doi.org/10.23919/MVA.2017.7986835

  18. Ege T, Yanai K (2017) Image-based food calorie estimation using knowledge on food categories, ingredients and cooking directions. 367-375

  19. Ege T, Yanai K (2018) Multi-task learning of dish detection and calorie estimation. In: CEA/MADiMa'18: Joint Workshop on Multimedia for Cooking and Eating Activities and Multimedia Assisted Dietary Management, in conjunction with IJCAI 2018

  20. Fakhrou A, Kunhoth J, Al MS (2021) Smartphone-based food recognition system using multiple deep CNN models. Multimed Tools Appl 80(21–23):33011–33032. https://doi.org/10.1007/s11042-021-11329-6


  21. Farinella GM, Moltisanti M, Battiato S (2015) Classifying food images represented as bag of Textons. IEEE international conference on image processing. 5212-5216. https://doi.org/10.1109/ICIP.2014.7026055

  22. Fu J, Zheng H, Mei T (2017) Look closer to see better: recurrent attention convolutional neural network for fine-grained image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, pp 4476–4484. https://doi.org/10.1109/CVPR.2017.476

  23. Fu H, Wu R, Liu C et al (2020) MCEN: bridging cross-modal gap between cooking recipes and dish images with latent variable model. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 14558–14568. https://doi.org/10.1109/CVPR42600.2020.01458

  24. Gong Y, Wang L, Guo R, Lazebnik S (2014) Multi-scale orderless pooling of deep convolutional activation features. In: Fleet D, Pajdla T, Schiele B, Tuytelaars T (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8695. Springer, Cham. https://doi.org/10.1007/978-3-319-10584-0_26

  25. Güngör C, Baltacı F, Erdem A et al (2017) Turkish cuisine: a benchmark dataset with Turkish meals for food recognition. In: 2017 25th Signal Processing and Communications Applications Conference (SIU), Antalya, pp 1–4. https://doi.org/10.1109/SIU.2017.7960494

  26. Guo S, Huang W, Zhang H et al (2018) CurriculumNet: weakly supervised learning from large-scale web images. In: Computer Vision – ECCV 2018, pp 139–154

  27. Hassannejad H, Matrella G, Ciampolini P et al (2016) Food image recognition using very deep convolutional networks. Proceedings of the 2nd international workshop on multimedia assisted dietary management. 41-49. https://doi.org/10.1145/2986035.2986042

  28. He H, Kong F, Tan J (2016) DietCam: Multiview food recognition using a multikernel SVM. IEEE J Biomed Health Inf 20(3):848–855. https://doi.org/10.1109/JBHI.2015.2419251


  29. He K, Zhang X, Ren S et al (2016) Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, pp 770–778, https://doi.org/10.1109/CVPR.2016.90

  30. He J, Shao Z, Wright J et al (2020) Multi-task image-based dietary assessment for food recognition and portion size estimation. 2020 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR) 49–54. https://doi.org/10.1109/MIPR49039.2020.00018

  31. Herranz L, Jiang S, Xu R (2017) Modeling restaurant context for food recognition. IEEE Trans Multimed 19(2):430–440. https://doi.org/10.1109/TMM.2016.2614861


  32. Hoashi H, Joutou T, Yanai K (2010) Image recognition of 85 food categories by feature fusion. IEEE Int Symp Multimed 2010:296–301. https://doi.org/10.1109/ISM.2010.51


  33. Horiguchi S, Amano S, Ogawa M, Aizawa K (2018) Personalized classifier for food image recognition. IEEE Trans Multimed 20(10):2836–2848. https://doi.org/10.1109/TMM.2018.2814339


  34. Hu J, Shen L, Albanie S, Sun G, Wu E (2020) Squeeze-and-excitation networks. IEEE Trans Pattern Anal Mach Intell 42(8):2011–2023. https://doi.org/10.1109/TPAMI.2019.2913372


  35. Jha R (2022) A novel hybrid intelligent technique to enhance customer relationship management in online food delivery system. Multimed Tools Appl 81:28583–28606. https://doi.org/10.1007/s11042-022-12877-1


  36. Jiang S, Min W, Liu L, Luo Z (2020) Multi-scale multi-view deep feature aggregation for food recognition. IEEE Trans Image Process 29:265–276. https://doi.org/10.1109/TIP.2019.2929447


  37. Jiang S, Min W, Lyu Y, Liu L (2020) Few-shot food recognition via multi-view representation learning. ACM Trans Multimed Comput Commun Appl 16(3):1–20. https://doi.org/10.1145/3391624


  38. Kagaya H, Aizawa K, Ogawa M (2014) Food detection and recognition using convolutional neural network. Proceedings of the ACM international conference on multimedia - MM '14. 1085-1088. https://doi.org/10.1145/2647868.2654970

  39. Kagaya H, Aizawa K, Ogawa M (2014) Food detection and recognition using convolutional neural network. ACM International Conference on Multimedia

  40. Kawano Y, Yanai K (2014) FoodCam-256: a large-scale real-time mobile food recognition system employing high-dimensional features and compression of classifier weights. In: Proceedings of the 22nd ACM International Conference on Multimedia

  41. Kawano Y, Yanai K (2015) Automatic expansion of a food image dataset leveraging existing categories with domain adaptation. European conference on computer vision (ECCV). Cham. 3-17. https://doi.org/10.1007/978-3-319-16199-0_1

  42. Kazi A, Panda SP (2022) Determining the freshness of fruits in the food industry by image classification using transfer learning. Multimed Tools Appl 81(6):7611–7624. https://doi.org/10.1007/s11042-022-12150-5


  43. Kong F, He H, Raynor HA, Tan J (2015) DietCam: multi-view regular shape food recognition with a camera phone. Pervasive Mob Comput 19:108–121. https://doi.org/10.1016/j.pmcj.2014.05.012


  44. Liang Y, Li J (2017) Computer vision-based food calorie estimation: dataset, method, and experiment. arXiv:1705.07632

  45. Liang H, Wen G, Hu Y et al (2021) MVANet: multi-tasks guided multi-view attention network for Chinese food recognition. IEEE Trans Multimed 23:3551–3561. https://doi.org/10.1109/TMM.2020.3028478

  46. Lin TY, Roychowdhury A, Maji S (2017) Bilinear convolutional neural networks for fine-grained visual recognition. IEEE Trans Pattern Anal Mach Intell, 1-1

  47. Liu X, Xia T, Wang J et al (2017) Fully convolutional attention networks for fine-grained recognition. 2017 IEEE/CVF Conference on Computer Vision and Pattern Recognition. arXiv:1603.06765v4

  48. Liu C, Cao Y, Luo Y et al (2016) DeepFood: deep learning-based food image recognition for computer-aided dietary assessment. In: Chang C, Chiari L, Cao Y, Jin H, Mokhtari M, Aloulou H (eds) Inclusive Smart Cities and Digital Health. ICOST 2016. Lecture Notes in Computer Science, vol 9677. Springer, Cham. https://doi.org/10.1007/978-3-319-39601-9_4

  49. Liu Q, Zhang Y, Liu Z, Yuan Y, Cheng L, Zimmermann R (2018). Multi-modal multi-task learning for automatic dietary assessment. Thirty-Second AAAI Conf Artif Intell (AAAI-18). 2347–2354

  50. Liu C, Liang Y, Xue Y et al (2020) Food and ingredient joint learning for fine-grained recognition. IEEE Trans Circuits Syst Video Technol. https://doi.org/10.1109/TCSVT.2020.3020079

  51. Liu Y, Chen J, Bao N, Gupta BB, Lv Z (2021) Survey on atrial fibrillation detection from a single-lead ECG wave for internet of medical things. Comput Commun 178:245–258. https://doi.org/10.1016/j.comcom.2021.08.002


  52. Lo FPW, Sun Y, Qiu J, Lo B (2020) Image-based food classification and volume estimation for dietary assessment: a review. IEEE J Biomed Health Inform 24(7):1926–1939. https://doi.org/10.1109/JBHI.2020.2987943


  53. Luvizon DC, Picard D, Tabia H (2018) 2D/3D pose estimation and action recognition using multitask deep learning. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition 5137–5146. https://doi.org/10.1109/CVPR.2018.00539

  54. Martinel N, Foresti GL, Micheloni C (2016) Wide-slice residual networks for food recognition. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, pp 567–576. https://doi.org/10.1109/WACV.2018.00068

  55. Matsuda Y, Hoashi H, Yanai K (2012) Recognition of multiple-food images by detecting candidate regions. 2012 IEEE International Conference on Multimedia and Expo, Melbourne, VIC, Australia, 2012, pp. 25-30. https://doi.org/10.1109/ICME.2012.157

  56. Min W, Bao BK, Mei S et al (2017) You are what you eat: exploring rich recipe information for cross-region food analysis. IEEE Trans Multimed, 1–1

  57. Min W, Jiang S, Wang S et al (2017) A delicious recipe analysis framework for exploring multi-modal recipes with various attributes. Proceedings of the 25th ACM international conference on multimedia. 402-410. https://doi.org/10.1145/3123266.3123272

  58. Min W, Jiang S, Sang J, Wang H, Liu X, Herranz L (2017) Being a Supercook: joint food attributes and multimodal content modeling for recipe retrieval and exploration. IEEE Trans Multimed 19(5):1100–1113


  59. Min W, Liu L, Luo Z et al (2019) Ingredient-guided cascaded multi-attention network for food recognition. The 27th ACM international conference on multimedia, pp 1331–1339. https://doi.org/10.1145/3343031.3350948

  60. Min W, Jiang S, Liu L, Rui Y, Jain R (2020) A survey on food computing. ACM Comput Surv 52(5):1–36. https://doi.org/10.1145/3329168


  61. Ming ZY, Chen J, Cao Y et al (2018) Food photo recognition for dietary tracking: system and experiment. International Conference on Multimedia Modeling (MMM). https://doi.org/10.1007/978-3-319-73600-6_12

  62. Mnih V, Heess N, Graves A et al (2014) Recurrent models of visual attention. In: NIPS'14: proceedings of the 27th international conference on neural information processing systems, pp 2204–2212. http://arxiv.org/abs/1406.6247

  63. Myers A, Johnston N, Rathod V et al (2015) Im2Calories: towards an automated Mobile vision food diary. 2015 IEEE Int Conf Comput Vis (ICCV). 1233–1241. https://doi.org/10.1109/ICCV.2015.146

  64. Nag N, Pandey V, Jain R (2017) Health multimedia. Proceedings of the 2017 ACM on international conference on multimedia retrieval. 99-106. https://doi.org/10.1145/3078971.3080545

  65. Nandhini P, Jaya J, George J (2013) Computer vision system for food quality evaluation — a review. 2013 International Conference on Current Trends in Engineering and Technology (ICCTET) 85–87. https://doi.org/10.1109/ICCTET.2013.6675916

  66. Ning Z, Donahue J, Girshick R et al (2014) Part-based R-CNNs for fine-grained category detection. European conference on computer vision (ECCV). https://doi.org/10.48550/arXiv.1407.3867

  67. Pandey P, Deepthi A, Mandal B, Puhan NB (2017) FoodNet: recognizing foods using Ensemble of Deep Networks. IEEE Signal Process Lett 24(12):1758–1762. https://doi.org/10.1109/LSP.2017.2758862


  68. Papyan V, Elad M (2015) Multi-scale patch-based image restoration. IEEE Trans Image Process, pp 249–261

  69. Park H, Bharadhwaj H, Lim BY (2019) Hierarchical multi-task learning for healthy drink classification. 2019 Int Joint Conf Neural Netw (IJCNN) 1–8. https://doi.org/10.1109/IJCNN.2019.8851796

  70. Pouladzadeh P, Yassine A, Shirmohammadi S (2015) FooDD: food detection dataset for calorie measurement using food images. In: New Trends in Image Analysis and Processing – ICIAP 2015 Workshops, pp 441–448

  71. Sajadmanesh S, Jafarzadeh S, Ossia SA et al (2016) Kissing cuisines: exploring worldwide culinary habits on the web. World Wide Web Conference, Web Science Companion

  72. Salvador A, Hynes N, Aytar Y et al (2017) Learning cross-modal Embeddings for cooking recipes and food images. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 3068–3076. https://doi.org/10.1109/CVPR.2017.327

  73. Sarker MMK, Rashwan HA, Akram F, Talavera E, Banu SF, Radeva P, Puig D (2019) Recognizing food places in egocentric photo-streams using multi-scale Atrous convolutional networks and self-attention mechanism. IEEE Access 7:39069–39082. https://doi.org/10.1109/ACCESS.2019.2902225

  74. Sarker MMK, Rashwan HA, Talavera E et al (2019) MACNet: multi-scale Atrous convolution networks for food places classification in egocentric photo-streams. 423-433

  75. Sasano S, Han X, Chen Y (2016) Food recognition by combined bags of color features and texture features. 2016 9th international congress on image and signal processing, BioMedical engineering and informatics (CISP-BMEI). 815-819. https://doi.org/10.1109/CISP-BMEI.2016.7852822

  76. Selvaraju RR, Cogswell M, Das A et al (2017) Grad-CAM: visual explanations from deep networks via gradient-based localization. 2017 IEEE Int Conf Comput Vis (ICCV) 618–626. https://doi.org/10.1109/ICCV.2017.74

  77. Shimoda W, Yanai K (2017) Learning food image similarity for food image retrieval. In: 2017 IEEE Third International Conference on Multimedia Big Data (BigMM), Laguna Hills, pp 165–168. https://doi.org/10.1109/BigMM.2017.73

  78. Situju SF, Takimoto H, Sato S, Yamauchi H, Kanagawa A, Lawi A (2019) Food constituent estimation for lifestyle disease prevention by multi-task CNN. Appl Artif Intell 33(8):732–746. https://doi.org/10.1080/08839514.2019.1602318


  79. Sood S, Singh H (2021) Computer vision and machine learning based approaches for food security: a review. Multimed Tools Appl 80:27973–27999.  https://doi.org/10.1007/s11042-021-11036-2

  80. Srivastava N, Salakhutdinov R (2012) Multimodal learning with deep Boltzmann machines. J Mach Learn Res 15(1):2949–2980


  81. Subhi MA, Ali SH, Mohammed MA (2019) Vision-based approaches for automatic food recognition and dietary assessment: a survey. IEEE Access 7:35370–35381. https://doi.org/10.1109/ACCESS.2019.2904519


  82. Sung F, Yang Y, Zhang L et al (2018) Learning to compare: relation network for few-shot learning. 2018 IEEE/CVF Conf Comput Vis Pattern Recognition 1199–1208. https://doi.org/10.1109/CVPR.2018.00131

  83. Taichi J, Keiji Y (2009) A food image recognition system with multiple kernel learning. 2009 16th IEEE international conference on image processing (ICIP). 285-288. https://doi.org/10.1109/ICIP.2009.5413400

  84. Tanno R, Okamoto K, Yanai K (2016) DeepFoodCam: A DCNN-based real-time mobile food recognition system. In: Proceedings of the 2nd international workshop on multimedia assisted dietary management - MADiMa '16, pp 89–89. https://doi.org/10.1145/2986035.2986044

  85. Wang H, Min W, Li X et al (2016) Where and what to eat: simultaneous restaurant and dish recognition from food image. Pacific Rim Conference on Multimedia

  86. Wang Z, Chen T, Li G et al (2017) Multi-label image recognition by recurrently discovering attentional regions. In: 2017 IEEE international conference on computer vision (ICCV), Venice, pp 464–472. https://doi.org/10.1109/ICCV.2017.58

  87. Woo S, Park J, Lee JY et al (2018) CBAM: convolutional block attention module. In: Ferrari V, Hebert M, Sminchisescu C, Weiss Y (eds) Computer Vision – ECCV 2018. ECCV 2018. Lecture Notes in Computer Science, vol 11211. Springer, Cham. https://doi.org/10.1007/978-3-030-01234-2_1

  88. Wu R, Wang B, Wang W et al (2015) Harvesting discriminative Meta objects with deep CNN features for scene classification. In: 2015 IEEE international conference on computer vision (ICCV). https://doi.org/10.1109/ICCV.2015.152

  89. Song X, Jiang S et al (2017) Multi-scale multi-feature context modeling for scene recognition in the semantic manifold. IEEE Trans Image Process 26(6):2721–2735

  90. Xu R, Herranz L, Jiang S, Wang S, Song X, Jain R (2015) Geolocalized modeling for dish recognition. IEEE Trans Multimed 17(8):1187–1199

    Article  Google Scholar 

  91. Xu D, Ouyang W, Wang X et al (2018) PAD-Net: multi-tasks guided prediction-and-distillation network for simultaneous depth estimation and scene parsing. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, pp 675–684. https://doi.org/10.1109/CVPR.2018.00077

  92. Yang S, Chen M, Pomerleau D et al (2010) Food recognition using statistics of pairwise local features. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, pp 2249–2256. https://doi.org/10.1109/CVPR.2010.5539907

  93. Yang J, Shen X, Tian X et al (2018) Local convolutional neural networks for person re-identification. In: Proceedings of the 26th ACM international conference on multimedia. October 2018, pp 1074–1082. https://doi.org/10.1145/3240508.3240645

  94. Yu Q, Anzawa M, Amano S et al (2018) Food image recognition by personalized classifier. In: 2018 25th IEEE international conference on image processing (ICIP), Athens, pp 171–175. https://doi.org/10.1109/ICIP.2018.8451422

  95. Zhang X-J, Lu Y-F, Zhang S-H (2016) Multi-task learning for food identification and analysis with deep convolutional neural networks. J Comput Sci Technol 31(3):489–500. https://doi.org/10.1007/s11390-016-1642-6


  96. Zhang H, Xu G, Liang X, Zhang W, Sun X, Huang T (2019) Multi-view multitask learning for knowledge base relation detection. Knowl-Based Syst 183:104870. https://doi.org/10.1016/j.knosys.2019.104870


  97. Zhang W, Wu J, Yang Y (2020) Wi-HSNN: a subnetwork-based encoding structure for dimension reduction and food classification via harnessing multi-CNN model high-level features. Neurocomputing 414:57–66. https://doi.org/10.1016/j.neucom.2020.07.018


  98. Zheng H, Fu J, Mei T et al (2017) Learning multi-attention convolutional neural network for fine-grained image recognition. IEEE Int Conf Comput Vis (ICCV) 2017:5219–5227. https://doi.org/10.1109/ICCV.2017.557


  99. Zhou F, Lin Y (2016) Fine-grained image classification by exploring bipartite-graph labels. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 1124–1133. https://doi.org/10.1109/CVPR.2016.127

  100. Zhu Y, Wang J, Xie L et al (2018) Attention-based pyramid aggregation network for visual place recognition. Proceedings of the 26th ACM international conference on multimedia. 99-107. https://doi.org/10.1145/3240508.3240525


Acknowledgments

We acknowledge the computational resources supported by High-Performance Computing Center of Collaborative Innovation Center of Advanced Microstructures, Nanjing University.

Author information


Corresponding author

Correspondence to Sidan Du.

Ethics declarations

Conflict of interest

We have no conflicts of interest to disclose with regard to this survey paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Dai, J., Hu, X., Li, M. et al. The multi-learning for food analyses in computer vision: a survey. Multimed Tools Appl 82, 25615–25650 (2023). https://doi.org/10.1007/s11042-023-14373-6

