DOI: 10.1145/3123266.3123272
research-article

A Delicious Recipe Analysis Framework for Exploring Multi-Modal Recipes with Various Attributes

Published: 19 October 2017

Abstract

Human beings have developed a diverse food culture. Many factors, such as ingredients, visual appearance, courses (e.g., breakfast and lunch), flavor, and geographical region, affect our food perception and choice. In this work, we focus on multi-dimensional food analysis based on these factors to benefit applications such as food summary and recommendation. To this end, we propose a delicious recipe analysis framework that incorporates various types of continuous and discrete attribute features together with multi-modal information from recipes. First, we develop a Multi-Attribute Theme Modeling (MATM) method, which can incorporate arbitrary types of attribute features and model them jointly with the textual content. We then use a multi-modal embedding method to build the correlation between the textual theme features learned by MATM and the visual features extracted by a deep learning network. By learning attribute-theme relations and the multi-modal correlation, we enable several applications: (1) flavor analysis and comparison, for better understanding flavor patterns along different dimensions such as region and course; (2) region-oriented multi-dimensional food summary with both multi-modal and multi-attribute information; and (3) multi-attribute oriented recipe recommendation. Furthermore, the proposed framework is flexible and enables easy incorporation of arbitrary types of attributes and modalities. Qualitative and quantitative evaluation results validate the effectiveness of the proposed method and framework on the collected Yummly dataset.

      Published In

      MM '17: Proceedings of the 25th ACM international conference on Multimedia
      October 2017
      2028 pages
      ISBN:9781450349062
      DOI:10.1145/3123266
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. flavor analysis
      2. food summary
      3. multi-attribute theme modeling
      4. multi-dimensional food analysis
      5. recipe recommendation

      Qualifiers

      • Research-article

      Funding Sources

      • National Natural Science Foundation of China
      • Beijing Natural Science Foundation
      • Lenovo Outstanding Young Scientists Program
      • China Postdoctoral Science Foundation
      • Beijing Municipal Commission of Science and Technology
      • National Program for Special Support of Eminent Professionals and National Program for Support of Top-notch Young Professionals

      Conference

      MM '17
      MM '17: ACM Multimedia Conference
      October 23 - 27, 2017
      Mountain View, California, USA

      Acceptance Rates

      MM '17 Paper Acceptance Rate 189 of 684 submissions, 28%;
      Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

      Cited By

      • (2025) Adaptafood: an intelligent system to adapt recipes to specialised diets and healthy lifestyles. Multimedia Systems 31:2. DOI: 10.1007/s00530-025-01667-y. Online publication date: 1-Feb-2025
      • (2024) Food Computing for Nutrition and Health. 2024 IEEE 40th International Conference on Data Engineering Workshops (ICDEW), 29-31. DOI: 10.1109/ICDEW61823.2024.00066. Online publication date: 13-May-2024
      • (2023) Transferring Knowledge From Text to Video: Zero-Shot Anticipation for Procedural Actions. IEEE Transactions on Pattern Analysis and Machine Intelligence 45:6, 7836-7852. DOI: 10.1109/TPAMI.2022.3218596. Online publication date: 1-Jun-2023
      • (2023) The multi-learning for food analyses in computer vision: a survey. Multimedia Tools and Applications 82:17, 25615-25650. DOI: 10.1007/s11042-023-14373-6. Online publication date: 26-Jan-2023
      • (2023) Link prediction in food heterogeneous graphs for personalised recipe recommendation based on user interactions and dietary restrictions. Computing 106:7, 2133-2155. DOI: 10.1007/s00607-023-01233-2. Online publication date: 15-Nov-2023
      • (2023) Food Recipe and Nutritional Information Generator. Machine Learning and Computational Intelligence Techniques for Data Engineering, 369-378. DOI: 10.1007/978-981-99-0047-3_32. Online publication date: 16-May-2023
      • (2022) Unbalanced Incomplete Multi-View Clustering Via the Scheme of View Evolution: Weak Views are Meat; Strong Views Do Eat. IEEE Transactions on Emerging Topics in Computational Intelligence 6:4, 913-927. DOI: 10.1109/TETCI.2021.3077909. Online publication date: Aug-2022
      • (2022) The Deep Features and Attention Mechanism-Based Method to Dish Healthcare Under Social IoT Systems: An Empirical Study With a Hand-Deep Local–Global Net. IEEE Transactions on Computational Social Systems 9:1, 336-347. DOI: 10.1109/TCSS.2021.3102591. Online publication date: Feb-2022
      • (2022) ANIMC: A Soft Approach for Autoweighted Noisy and Incomplete Multiview Clustering. IEEE Transactions on Artificial Intelligence 3:2, 192-206. DOI: 10.1109/TAI.2021.3116546. Online publication date: Apr-2022
      • (2022) Learning Program Representations for Food Images and Cooking Recipes. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 16538-16548. DOI: 10.1109/CVPR52688.2022.01606. Online publication date: Jun-2022
