
The multi-learning for food analyses in computer vision: a survey

Published in: Multimedia Tools and Applications

Abstract

With the rapid development of food production and health management, the analysis of food samples has become essential for preventing disease and understanding human culture. Food analysis is increasingly complex and is no longer limited to food categorization; it also covers many advanced tasks (e.g., nutrition estimation and recipe retrieval). Two points can be concluded from existing work. First, food features are far more comprehensive and sophisticated than those of general samples. Second, multiple learning strategies (MLSs) usually outperform general deep learning methods on food analysis. However, few survey papers report food analysis with MLSs, and the main factors that make them difficult to apply remain unclear. We therefore survey applications of MLSs to food analysis. Three common types of MLS, namely multi-task learning (MTL), multi-view learning (MVL) and multi-scale learning (MSL), are presented in terms of their guidance, typical works, algorithms and final aggregation methods. Additionally, we argue that food characteristics are closely related to the difficulty of food analysis, and we summarize these characteristics as nonrigid appearance, complex arrangement, and large intraclass (small interclass) variance. Experimental results for MLSs are also presented and analyzed, and based on these results we propose practical suggestions for implementing MLSs. Finally, promising directions for future MLS applications are discussed.
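The MTL idea surveyed above (a shared backbone feeding several task heads, trained with a weighted sum of per-task losses) can be sketched as follows. This is a toy illustration, not any surveyed paper's model: the linear "backbone", the food-category/calorie task pair, and the fixed loss weights `lam_c` and `lam_r` are all made-up assumptions (works such as Cipolla et al. learn such weights from per-task uncertainty instead).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a shared backbone: one linear layer with ReLU.
# In real food-analysis MTL this would be a CNN trunk shared by all tasks.
def shared_features(x, W):
    return np.maximum(0.0, x @ W)

# Two task heads: classification (food category) and regression (calories).
def category_logits(h, Wc):
    return h @ Wc

def calorie_estimate(h, Wr):
    return h @ Wr

def softmax_cross_entropy(logits, label):
    z = logits - logits.max()          # stabilize before exponentiating
    logp = z - np.log(np.exp(z).sum())
    return -logp[label]

# Multi-task loss: weighted sum of the per-task losses.
def mtl_loss(x, label, calories, W, Wc, Wr, lam_c=1.0, lam_r=0.1):
    h = shared_features(x, W)
    l_cls = softmax_cross_entropy(category_logits(h, Wc), label)
    l_reg = (calorie_estimate(h, Wr) - calories) ** 2
    return lam_c * l_cls + lam_r * float(l_reg)

# One toy sample: a 16-dim "image feature", category 2 of 5, 250 kcal.
x = rng.normal(size=16)
W = rng.normal(size=(16, 8))
Wc = rng.normal(size=(8, 5))
Wr = rng.normal(size=(8,))
print("combined MTL loss:", mtl_loss(x, label=2, calories=250.0, W=W, Wc=Wc, Wr=Wr))
```

Because both heads read the same `shared_features` output, gradients from the calorie task regularize the category task and vice versa, which is the mechanism the surveyed MTL works exploit.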


Data availability

The availability of the datasets analyzed in this survey is stated as follows.

1. Datasets that are openly available online

The dataset Food-101 is openly available at https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101 as cited in Bossard L. et al. (2014).

The dataset UEC-Food 256 is available at http://foodcam.mobi/ as cited in Kawano Y. et al. (2014).

The dataset VIREO Food-172 is available at http://vireo.cs.cityu.edu.hk/vireofood172/ as cited in Chen J. et al. (2016).

The dataset ChineseFoodNet is available at https://sites.google.com/view/chinesefoodnet/ as cited in Chen X. et al. (2017).

The dataset Recipe1M is available at http://im2recipe.csail.mit.edu as cited in Salvador A. et al. (2017).

The dataset ISIA Food-200 is available at http://123.57.42.89/Dataset_ict/WIKI Food/ISIA Food200_v2/ as cited in Min W. et al. (2019).

The dataset UNICT-FD889 is available at https://iplab.dmi.unict.it/UNICT-FD889/ as cited in Farinella G.M. et al. (2015).

The dataset MAFood-121 is available at http://www.ub.edu/cvub/mafood121/ as cited in Aguilar E. et al. (2019).

2. Datasets published or included in the following articles

The data Food-50 is published in Taichi J. et al. (2009).

The data Food-85 is included in Hoashi H. et al. (2010).

The dataset FOODD is available in Pouladzadeh P. et al. (2015).

The data presented as “A manual dataset” in this paper is published in Chen J.-j. et al. (2017).

The data TurkishFoods-15 is published in Güngör C. et al. (2017).

The data Drink101 is published in Park H. et al. (2019).

The data ETH Food-101 is included in Min W. et al. (2019) and Jiang S. et al. (2020).

References

  1. Aguilar E, Bolaños M, Radeva P (2019) Regularized uncertainty-based multi-task learning model for food analysis. J Vis Commun Image Represent 60:360–370. https://doi.org/10.1016/j.jvcir.2019.03.011


  2. AlZu’bi S, Hawashin B, Mujahed M, Jararweh Y, Gupta BB (2019) An efficient employment of internet of multimedia things in smart and future agriculture. Multimed Tools Appl 78(20):29581–29605. https://doi.org/10.1007/s11042-019-7367-0

  3. Anis S, Lai KW, Chuah JH, Ali SM, Mohafez H, Hadizadeh M, Yan D, Ong ZC (2020) An overview of deep learning approaches in chest radiograph. IEEE Access 8:182347–182354. https://doi.org/10.1109/ACCESS.2020.3028390


  4. Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. International conference on learning representations (ICLR). https://doi.org/10.48550/arXiv.1409.0473

  5. Bettadapura V, Thomaz E, Parnami A et al (2015) Leveraging context to support automated food recognition in restaurants. 2015 IEEE Winter Conference on Applications of Computer Vision 580–587. https://doi.org/10.1109/WACV.2015.83

  6. Bossard L, Guillaumin M, Van Gool L (2014) Food-101 – mining discriminative components with random forests. In: European Conference on Computer Vision (ECCV), Cham, pp 446–461

  7. Chen J, Ngo C-W (2016) Deep-based ingredient recognition for cooking recipe retrieval. Proceedings of the 24th ACM international conference on multimedia. 32-41. https://doi.org/10.1145/2964284.2964315

  8. Chen M, Dhingra K, Wu W et al (2009) PFID: Pittsburgh fast-food image dataset. 2009 16th IEEE international conference on image processing (ICIP). 289-292. https://doi.org/10.1109/ICIP.2009.5413511

  9. Chen J-J, Ngo C-W, Chua T-S (2017) Cross-modal recipe retrieval with rich food attributes. Proceedings of the 25th ACM international conference on multimedia. 1771-1779. https://doi.org/10.1145/3123266.3123428

  10. Chen X, Zhu Y, Zhou H et al (2017) ChineseFoodNet: a large-scale image dataset for Chinese food recognition. arXiv:1705.02743

  11. Chen L, Papandreou G, Kokkinos I et al (2018) DeepLab: semantic image segmentation with deep convolutional nets, Atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell 40(4):834–848. https://doi.org/10.1109/TPAMI.2017.2699184


  12. Chen Y, Bai Y, Zhang W et al (2019) Destruction and construction learning for fine-grained image recognition. 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR)

  13. Ciocca G, Napoletano P, Schettini R (2017) Food recognition: a new dataset, experiments, and results. IEEE J Biomed Health Inf 21(3):588–598. https://doi.org/10.1109/JBHI.2016.2636441


  14. Ciocca G, Micali G, Napoletano P (2020) State recognition of food images using deep features. IEEE Access 8:32003–32017. https://doi.org/10.1109/ACCESS.2020.2973704


  15. Cipolla R, Gal Y, Kendall A (2018) Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 7482–7491. https://doi.org/10.1109/CVPR.2018.00781

  16. Doersch C, Zisserman A (2017) Multi-task self-supervised visual learning. 2017 IEEE International Conference on Computer Vision (ICCV). 2070–2079. https://doi.org/10.1109/ICCV.2017.226

  17. Ege T, Yanai K (2017) Simultaneous estimation of food categories and calories with multi-task CNN. 2017 fifteenth IAPR international conference on machine vision applications (MVA). 198-201. https://doi.org/10.23919/MVA.2017.7986835

  18. Ege T, Yanai K (2017) Image-based food calorie estimation using knowledge on food categories, ingredients and cooking directions. 367-375

  19. Ege T, Yanai K (2018) Multi-task learning of dish detection and calorie estimation. In: CEA/MADiMa'18: Joint Workshop on Multimedia for Cooking and Eating Activities and Multimedia Assisted Dietary Management, in conjunction with IJCAI 2018

  20. Fakhrou A, Kunhoth J, Al MS (2021) Smartphone-based food recognition system using multiple deep CNN models. Multimed Tools Appl 80(21–23):33011–33032. https://doi.org/10.1007/s11042-021-11329-6


  21. Farinella GM, Moltisanti M, Battiato S (2015) Classifying food images represented as bag of Textons. IEEE international conference on image processing. 5212-5216. https://doi.org/10.1109/ICIP.2014.7026055

  22. Fu J, Zheng H, Mei T (2017) Look closer to see better: recurrent attention convolutional neural network for fine-grained image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, pp 4476–4484. https://doi.org/10.1109/CVPR.2017.476

  23. Fu H, Wu R, Liu C et al (2020) MCEN: bridging cross-modal gap between cooking recipes and dish images with latent variable model. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 14558–14568. https://doi.org/10.1109/CVPR42600.2020.01458

  24. Gong Y, Wang L, Guo R, Lazebnik S (2014) Multi-scale orderless pooling of deep convolutional activation features. In: Fleet D, Pajdla T, Schiele B, Tuytelaars T (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8695. Springer, Cham. https://doi.org/10.1007/978-3-319-10584-0_26

  25. Güngör C, Baltacı F, Erdem A et al (2017) Turkish cuisine: a benchmark dataset with Turkish meals for food recognition. In: 2017 25th Signal Processing and Communications Applications Conference (SIU), Antalya, pp 1–4. https://doi.org/10.1109/SIU.2017.7960494

  26. Guo S, Huang W, Zhang H et al (2018) CurriculumNet: weakly supervised learning from large-scale web images. In: Computer Vision – ECCV 2018, pp 139–154

  27. Hassannejad H, Matrella G, Ciampolini P et al (2016) Food image recognition using very deep convolutional networks. Proceedings of the 2nd international workshop on multimedia assisted dietary management. 41-49. https://doi.org/10.1145/2986035.2986042

  28. He H, Kong F, Tan J (2016) DietCam: Multiview food recognition using a multikernel SVM. IEEE J Biomed Health Inf 20(3):848–855. https://doi.org/10.1109/JBHI.2015.2419251


  29. He K, Zhang X, Ren S et al (2016) Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, pp 770–778, https://doi.org/10.1109/CVPR.2016.90

  30. He J, Shao Z, Wright J et al (2020) Multi-task image-based dietary assessment for food recognition and portion size estimation. 2020 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR) 49–54. https://doi.org/10.1109/MIPR49039.2020.00018

  31. Herranz L, Jiang S, Xu R (2017) Modeling restaurant context for food recognition. IEEE Trans Multimed 19(2):430–440. https://doi.org/10.1109/TMM.2016.2614861


  32. Hoashi H, Joutou T, Yanai K (2010) Image recognition of 85 food categories by feature fusion. IEEE Int Symp Multimed 2010:296–301. https://doi.org/10.1109/ISM.2010.51


  33. Horiguchi S, Amano S, Ogawa M, Aizawa K (2018) Personalized classifier for food image recognition. IEEE Trans Multimed 20(10):2836–2848. https://doi.org/10.1109/TMM.2018.2814339


  34. Hu J, Shen L, Albanie S, Sun G, Wu E (2020) Squeeze-and-excitation networks. IEEE Trans Pattern Anal Mach Intell 42(8):2011–2023. https://doi.org/10.1109/TPAMI.2019.2913372


  35. Jha R (2022) A novel hybrid intelligent technique to enhance customer relationship management in online food delivery system. Multimed Tools Appl 81:28583–28606. https://doi.org/10.1007/s11042-022-12877-1


  36. Jiang S, Min W, Liu L, Luo Z (2020) Multi-scale multi-view deep feature aggregation for food recognition. IEEE Trans Image Process 29:265–276. https://doi.org/10.1109/TIP.2019.2929447


  37. Jiang S, Min W, Lyu Y, Liu L (2020) Few-shot food recognition via multi-view representation learning. ACM Trans Multimed Comput Commun Appl 16(3):1–20. https://doi.org/10.1145/3391624


  38. Kagaya H, Aizawa K, Ogawa M (2014) Food detection and recognition using convolutional neural network. Proceedings of the ACM international conference on multimedia - MM '14. 1085-1088. https://doi.org/10.1145/2647868.2654970

  39. Kagaya H, Aizawa K, Ogawa M (2014) Food detection and recognition using convolutional neural network. ACM International Conference on Multimedia

  40. Kawano Y, Yanai K (2014) FoodCam-256: a large-scale real-time mobile food recognition system employing high-dimensional features and compression of classifier weights. In: Proceedings of the 22nd ACM International Conference on Multimedia

  41. Kawano Y, Yanai K (2015) Automatic expansion of a food image dataset leveraging existing categories with domain adaptation. European conference on computer vision (ECCV). Cham. 3-17. https://doi.org/10.1007/978-3-319-16199-0_1

  42. Kazi A, Panda SP (2022) Determining the freshness of fruits in the food industry by image classification using transfer learning. Multimed Tools Appl 81(6):7611–7624. https://doi.org/10.1007/s11042-022-12150-5


  43. Kong F, He H, Raynor HA, Tan J (2015) DietCam: multi-view regular shape food recognition with a camera phone. Pervasive Mob Comput 19:108–121. https://doi.org/10.1016/j.pmcj.2014.05.012


  44. Liang Y, Li J (2017) Computer vision-based food calorie estimation: dataset, method, and experiment. arXiv:1705.07632

  45. Liang H, Wen G, Hu Y et al (2021) MVANet: multi-tasks guided multi-view attention network for Chinese food recognition. IEEE Trans Multimed 23:3551–3561. https://doi.org/10.1109/TMM.2020.3028478

  46. Lin TY, Roychowdhury A, Maji S (2017) Bilinear convolutional neural networks for fine-grained visual recognition. IEEE Trans Pattern Anal Mach Intell, 1-1

  47. Liu X, Xia T, Wang J et al (2017) Fully convolutional attention networks for fine-grained recognition. 2017 IEEE/CVF Conference on Computer Vision and Pattern Recognition. arXiv:1603.06765v4

  48. Liu C, Cao Y, Luo Y et al (2016) DeepFood: deep learning-based food image recognition for computer-aided dietary assessment. In: Chang C, Chiari L, Cao Y, Jin H, Mokhtari M, Aloulou H (eds) Inclusive Smart Cities and Digital Health. ICOST 2016. Lecture Notes in Computer Science, vol 9677. Springer, Cham. https://doi.org/10.1007/978-3-319-39601-9_4

  49. Liu Q, Zhang Y, Liu Z, Yuan Y, Cheng L, Zimmermann R (2018). Multi-modal multi-task learning for automatic dietary assessment. Thirty-Second AAAI Conf Artif Intell (AAAI-18). 2347–2354

  50. Liu C, Liang Y, Xue Y et al (2020) Food and ingredient joint learning for fine-grained recognition. IEEE Trans Circuits Syst Video Technol. https://doi.org/10.1109/TCSVT.2020.3020079

  51. Liu Y, Chen J, Bao N, Gupta BB, Lv Z (2021) Survey on atrial fibrillation detection from a single-lead ECG wave for internet of medical things. Comput Commun 178:245–258. https://doi.org/10.1016/j.comcom.2021.08.002


  52. Lo FPW, Sun Y, Qiu J, Lo B (2020) Image-based food classification and volume estimation for dietary assessment: a review. IEEE J Biomed Health Inform 24(7):1926–1939. https://doi.org/10.1109/JBHI.2020.2987943


  53. Luvizon DC, Picard D, Tabia H (2018) 2D/3D pose estimation and action recognition using multitask deep learning. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition 5137–5146. https://doi.org/10.1109/CVPR.2018.00539

  54. Martinel N, Foresti GL, Micheloni C (2016) Wide-slice residual networks for food recognition. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, pp 567–576. https://doi.org/10.1109/WACV.2018.00068

  55. Matsuda Y, Hoashi H, Yanai K (2012) Recognition of multiple-food images by detecting candidate regions. 2012 IEEE International Conference on Multimedia and Expo, Melbourne, VIC, Australia, 2012, pp. 25-30. https://doi.org/10.1109/ICME.2012.157

  56. Min W, Bao BK, Mei S et al (2017) You are what you eat: exploring rich recipe information for cross-region food analysis. IEEE Trans Multimed, 1–1

  57. Min W, Jiang S, Wang S et al (2017) A delicious recipe analysis framework for exploring multi-modal recipes with various attributes. Proceedings of the 25th ACM international conference on multimedia. 402-410. https://doi.org/10.1145/3123266.3123272

  58. Min W, Jiang S, Sang J, Wang H, Liu X, Herranz L (2017) Being a Supercook: joint food attributes and multimodal content modeling for recipe retrieval and exploration. IEEE Trans Multimed 19(5):1100–1113


  59. Min W, Liu L, Luo Z et al (2019) Ingredient-guided cascaded multi-attention network for food recognition. The 27th ACM international conference on multimedia, pp 1331–1339. https://doi.org/10.1145/3343031.3350948

  60. Min W, Jiang S, Liu L, Rui Y, Jain R (2020) A survey on food computing. ACM Comput Surv 52(5):1–36. https://doi.org/10.1145/3329168


  61. Ming ZY, Chen J, Cao Y et al (2018) Food photo recognition for dietary tracking: system and experiment. International Conference on Multimedia Modeling (MMM). https://doi.org/10.1007/978-3-319-73600-6_12

  62. Mnih V, Heess N, Graves A et al (2014) Recurrent models of visual attention. In: NIPS'14: proceedings of the 27th international conference on neural information processing systems, pp 2204–2212. http://arxiv.org/abs/1406.6247

  63. Myers A, Johnston N, Rathod V et al (2015) Im2Calories: towards an automated Mobile vision food diary. 2015 IEEE Int Conf Comput Vis (ICCV). 1233–1241. https://doi.org/10.1109/ICCV.2015.146

  64. Nag N, Pandey V, Jain R (2017) Health multimedia. Proceedings of the 2017 ACM on international conference on multimedia retrieval. 99-106. https://doi.org/10.1145/3078971.3080545

  65. Nandhini P, Jaya J, George J (2013) Computer vision system for food quality evaluation — a review. 2013 International Conference on Current Trends in Engineering and Technology (ICCTET) 85–87. https://doi.org/10.1109/ICCTET.2013.6675916

  66. Ning Z, Donahue J, Girshick R et al (2014) Part-based R-CNNs for fine-grained category detection. European conference on computer vision (ECCV). https://doi.org/10.48550/arXiv.1407.3867

  67. Pandey P, Deepthi A, Mandal B, Puhan NB (2017) FoodNet: recognizing foods using Ensemble of Deep Networks. IEEE Signal Process Lett 24(12):1758–1762. https://doi.org/10.1109/LSP.2017.2758862


  68. Papyan V, Elad M (2015) Multi-scale patch-based image restoration. IEEE Trans Image Process, pp 249–261

  69. Park H, Bharadhwaj H, Lim BY (2019) Hierarchical multi-task learning for healthy drink classification. 2019 Int Joint Conf Neural Netw (IJCNN) 1–8. https://doi.org/10.1109/IJCNN.2019.8851796

  70. Pouladzadeh P, Yassine A, Shirmohammadi S (2015) FooDD: food detection dataset for calorie measurement using food images. In: New Trends in Image Analysis and Processing – ICIAP 2015 Workshops, pp 441–448

  71. Sajadmanesh S, Jafarzadeh S, Ossia SA et al (2016) Kissing cuisines: exploring worldwide culinary habits on the web. World Wide Web Conference, Web Science Companion

  72. Salvador A, Hynes N, Aytar Y et al (2017) Learning cross-modal Embeddings for cooking recipes and food images. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 3068–3076. https://doi.org/10.1109/CVPR.2017.327

  73. Sarker MMK, Rashwan HA, Akram F, Talavera E, Banu SF, Radeva P, Puig D (2019) Recognizing food places in egocentric photo-streams using multi-scale Atrous convolutional networks and self-attention mechanism. IEEE Access 7:39069–39082. https://doi.org/10.1109/ACCESS.2019.2902225

  74. Sarker MMK, Rashwan HA, Talavera E et al (2019) MACNet: multi-scale Atrous convolution networks for food places classification in egocentric photo-streams. 423-433

  75. Sasano S, Han X, Chen Y (2016) Food recognition by combined bags of color features and texture features. 2016 9th international congress on image and signal processing, BioMedical engineering and informatics (CISP-BMEI). 815-819. https://doi.org/10.1109/CISP-BMEI.2016.7852822

  76. Selvaraju RR, Cogswell M, Das A et al (2017) Grad-CAM: visual explanations from deep networks via gradient-based localization. 2017 IEEE Int Conf Comput Vis (ICCV) 618–626. https://doi.org/10.1109/ICCV.2017.74

  77. Shimoda W, Yanai K (2017) Learning food image similarity for food image retrieval. In: 2017 IEEE Third International Conference on Multimedia Big Data (BigMM), Laguna Hills, pp 165–168. https://doi.org/10.1109/BigMM.2017.73

  78. Situju SF, Takimoto H, Sato S, Yamauchi H, Kanagawa A, Lawi A (2019) Food constituent estimation for lifestyle disease prevention by multi-task CNN. Appl Artif Intell 33(8):732–746. https://doi.org/10.1080/08839514.2019.1602318


  79. Sood S, Singh H (2021) Computer vision and machine learning based approaches for food security: a review. Multimed Tools Appl 80:27973–27999.  https://doi.org/10.1007/s11042-021-11036-2

  80. Srivastava N, Salakhutdinov R (2012) Multimodal learning with deep Boltzmann machines. J Mach Learn Res 15(1):2949–2980


  81. Subhi MA, Ali SH, Mohammed MA (2019) Vision-based approaches for automatic food recognition and dietary assessment: a survey. IEEE Access 7:35370–35381. https://doi.org/10.1109/ACCESS.2019.2904519


  82. Sung F, Yang Y, Zhang L et al (2018) Learning to compare: relation network for few-shot learning. 2018 IEEE/CVF Conf Comput Vis Pattern Recognition 1199–1208. https://doi.org/10.1109/CVPR.2018.00131

  83. Taichi J, Keiji Y (2009) A food image recognition system with multiple kernel learning. 2009 16th IEEE international conference on image processing (ICIP). 285-288. https://doi.org/10.1109/ICIP.2009.5413400

  84. Tanno R, Okamoto K, Yanai K (2016) DeepFoodCam: A DCNN-based real-time mobile food recognition system. In: Proceedings of the 2nd international workshop on multimedia assisted dietary management - MADiMa '16, pp 89–89. https://doi.org/10.1145/2986035.2986044

  85. Wang H, Min W, Li X et al (2016) Where and what to eat: simultaneous restaurant and dish recognition from food image. Pacific Rim Conference on Multimedia

  86. Wang Z, Chen T, Li G et al (2017) Multi-label image recognition by recurrently discovering attentional regions. In: 2017 IEEE international conference on computer vision (ICCV), Venice, pp 464–472. https://doi.org/10.1109/ICCV.2017.58

  87. Woo S, Park J, Lee JY et al (2018) CBAM: convolutional block attention module. In: Ferrari V, Hebert M, Sminchisescu C, Weiss Y (eds) Computer Vision – ECCV 2018. ECCV 2018. Lecture Notes in Computer Science, vol 11211. Springer, Cham. https://doi.org/10.1007/978-3-030-01234-2_1

  88. Wu R, Wang B, Wang W et al (2015) Harvesting discriminative Meta objects with deep CNN features for scene classification. In: 2015 IEEE international conference on computer vision (ICCV). https://doi.org/10.1109/ICCV.2015.152

  89. Song X, Jiang S et al (2017) Multi-scale multi-feature context modeling for scene recognition in the semantic manifold. IEEE Trans Image Process 26(6):2721–2735

  90. Xu R, Herranz L, Jiang S, Wang S, Song X, Jain R (2015) Geolocalized modeling for dish recognition. IEEE Trans Multimed 17(8):1187–1199

    Article  Google Scholar 

  91. Xu D, Ouyang W, Wang X et al (2018) PAD-Net: multi-tasks guided prediction-and-distillation network for simultaneous depth estimation and scene parsing. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, pp 675–684. https://doi.org/10.1109/CVPR.2018.00077

  92. Yang S, Chen M, Pomerleau D et al (2010) Food recognition using statistics of pairwise local features. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, pp 2249–2256. https://doi.org/10.1109/CVPR.2010.5539907

  93. Yang J, Shen X, Tian X et al (2018) Local convolutional neural networks for person re-identification. In: Proceedings of the 26th ACM international conference on multimedia. October 2018, pp 1074–1082. https://doi.org/10.1145/3240508.3240645

  94. Yu Q, Anzawa M, Amano S et al (2018) Food image recognition by personalized classifier. In: 2018 25th IEEE international conference on image processing (ICIP), Athens, pp 171–175. https://doi.org/10.1109/ICIP.2018.8451422

  95. Zhang X-J, Lu Y-F, Zhang S-H (2016) Multi-task learning for food identification and analysis with deep convolutional neural networks. J Comput Sci Technol 31(3):489–500. https://doi.org/10.1007/s11390-016-1642-6


  96. Zhang H, Xu G, Liang X, Zhang W, Sun X, Huang T (2019) Multi-view multitask learning for knowledge base relation detection. Knowl-Based Syst 183:104870. https://doi.org/10.1016/j.knosys.2019.104870


  97. Zhang W, Wu J, Yang Y (2020) Wi-HSNN: a subnetwork-based encoding structure for dimension reduction and food classification via harnessing multi-CNN model high-level features. Neurocomputing 414:57–66. https://doi.org/10.1016/j.neucom.2020.07.018


  98. Zheng H, Fu J, Mei T et al (2017) Learning multi-attention convolutional neural network for fine-grained image recognition. IEEE Int Conf Comput Vis (ICCV) 2017:5219–5227. https://doi.org/10.1109/ICCV.2017.557


  99. Zhou F, Lin Y (2016) Fine-grained image classification by exploring bipartite-graph labels. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 1124–1133. https://doi.org/10.1109/CVPR.2016.127

  100. Zhu Y, Wang J, Xie L et al (2018) Attention-based pyramid aggregation network for visual place recognition. Proceedings of the 26th ACM international conference on multimedia. 99-107. https://doi.org/10.1145/3240508.3240525


Acknowledgments

We acknowledge the computational resources supported by High-Performance Computing Center of Collaborative Innovation Center of Advanced Microstructures, Nanjing University.

Author information


Corresponding author

Correspondence to Sidan Du.

Ethics declarations

Conflict of interest

We have no conflicts of interest to disclose with regard to this survey paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Dai, J., Hu, X., Li, M. et al. The multi-learning for food analyses in computer vision: a survey. Multimed Tools Appl 82, 25615–25650 (2023). https://doi.org/10.1007/s11042-023-14373-6

