A Delicious Recipe Analysis Framework for Exploring Multi-Modal Recipes with Various Attributes

Authors:

Shuqiang Jiang,

Shuhuan Mei

MM '17: Proceedings of the 25th ACM international conference on Multimedia

Pages 402 - 410

https://doi.org/10.1145/3123266.3123272

Published: 19 October 2017

Abstract

Human beings have developed a diverse food culture. Many factors, such as ingredients, visual appearance, courses (e.g., breakfast and lunch), flavor, and geographical region, affect our food perception and choice. In this work, we focus on multi-dimensional food analysis based on these factors to benefit applications such as summarization and recommendation. To this end, we propose a delicious recipe analysis framework that incorporates various types of continuous and discrete attribute features together with multi-modal information from recipes. First, we develop a Multi-Attribute Theme Modeling (MATM) method, which can incorporate arbitrary types of attribute features and model them jointly with the textual content. We then use a multi-modal embedding method to build the correlation between the textual theme features learned by MATM and the visual features from a deep learning network. By learning attribute-theme relations and multi-modal correlations, we can support several applications: (1) flavor analysis and comparison, for a better understanding of flavor patterns along different dimensions such as region and course; (2) region-oriented multi-dimensional food summary with both multi-modal and multi-attribute information; and (3) multi-attribute-oriented recipe recommendation. Furthermore, the proposed framework is flexible and allows easy incorporation of arbitrary types of attributes and modalities. Qualitative and quantitative evaluation results validate the effectiveness of the proposed method and framework on the collected Yummly dataset.

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
  2. Information systems applications
    1. Multimedia information systems

Published In

MM '17: Proceedings of the 25th ACM international conference on Multimedia

October 2017

2028 pages

ISBN:9781450349062

DOI:10.1145/3123266

General Chairs: Qiong Liu (FXPAL, USA), Rainer Lienhart (Universität Augsburg, Germany), Haohong Wang (TCL America, USA)

Program Chairs: Sheng-Wei "Kuan-Ta" Chen (Academia Sinica, Taiwan), Susanne Boll (University of Oldenburg, Germany), Phoebe Chen (La Trobe University, Australia), Gerald Friedland (Lawrence Livermore National Lab, USA), Jia Li (Google, USA), Shuicheng Yan (Qihoo 360, China)

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China
Beijing Natural Science Foundation
the Lenovo Outstanding Young Scientists Program
China Postdoctoral Science Foundation
Beijing Municipal Commission of Science and Technology
National Program for Special Support of Eminent Professionals and National Program for Support of Top-notch Young Professionals

Conference

MM '17: ACM Multimedia Conference

Sponsor: SIGMM

October 23 - 27, 2017

Mountain View, California, USA

Acceptance Rates

MM '17 Paper Acceptance Rate 189 of 684 submissions, 28%

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Cited By

Morales-Garzón A, Gutiérrez-Batista K, Martin-Bautista M (2025). Adaptafood: an intelligent system to adapt recipes to specialised diets and healthy lifestyles. Multimedia Systems 31:2. Online publication date: 1-Feb-2025.
https://doi.org/10.1007/s00530-025-01667-y
Jiang S (2024). Food Computing for Nutrition and Health. 2024 IEEE 40th International Conference on Data Engineering Workshops (ICDEW), 29-31. Online publication date: 13-May-2024.
https://doi.org/10.1109/ICDEW61823.2024.00066
Sener F, Saraf R, Yao A (2023). Transferring Knowledge From Text to Video: Zero-Shot Anticipation for Procedural Actions. IEEE Transactions on Pattern Analysis and Machine Intelligence 45:6, 7836-7852. Online publication date: 1-Jun-2023.
https://doi.org/10.1109/TPAMI.2022.3218596
Dai J, Hu X, Li M, Li Y, Du S (2023). The multi-learning for food analyses in computer vision: a survey. Multimedia Tools and Applications 82:17, 25615-25650. Online publication date: 26-Jan-2023.
https://doi.org/10.1007/s11042-023-14373-6
Morales-Garzón A, Gutiérrez-Batista K, Martin-Bautista M (2023). Link prediction in food heterogeneous graphs for personalised recipe recommendation based on user interactions and dietary restrictions. Computing 106:7, 2133-2155. Online publication date: 15-Nov-2023.
https://doi.org/10.1007/s00607-023-01233-2
Mishra A, Gupta A, Sahu A, Kumar A, Dwivedi P (2023). Food Recipe and Nutritional Information Generator. Machine Learning and Computational Intelligence Techniques for Data Engineering, 369-378. Online publication date: 16-May-2023.
https://doi.org/10.1007/978-981-99-0047-3_32
Fang X, Hu Y, Zhou P, Wu D (2022). Unbalanced Incomplete Multi-View Clustering Via the Scheme of View Evolution: Weak Views are Meat; Strong Views Do Eat. IEEE Transactions on Emerging Topics in Computational Intelligence 6:4, 913-927. Online publication date: Aug-2022.
https://doi.org/10.1109/TETCI.2021.3077909
Gao H, Xu K, Cao M, Xiao J, Xu Q, Yin Y (2022). The Deep Features and Attention Mechanism-Based Method to Dish Healthcare Under Social IoT Systems: An Empirical Study With a Hand-Deep Local–Global Net. IEEE Transactions on Computational Social Systems 9:1, 336-347. Online publication date: Feb-2022.
https://doi.org/10.1109/TCSS.2021.3102591
Fang X, Hu Y, Zhou P, Wu D (2022). ANIMC: A Soft Approach for Autoweighted Noisy and Incomplete Multiview Clustering. IEEE Transactions on Artificial Intelligence 3:2, 192-206. Online publication date: Apr-2022.
https://doi.org/10.1109/TAI.2021.3116546
Papadopoulos D, Mora E, Chepurko N, Huang K, Ofli F, Torralba A (2022). Learning Program Representations for Food Images and Cooking Recipes. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 16538-16548. Online publication date: Jun-2022.
https://doi.org/10.1109/CVPR52688.2022.01606
