Abstract
In this paper, we propose a multi-task system that identifies dish types, food ingredients, and cooking methods from food images using deep convolutional neural networks. We built a dataset of 360 food classes with at least 500 images per class. To reduce noise in the data, which were collected from the Internet, outlier images were detected and removed by a one-class SVM trained on deep convolutional features. We simultaneously trained a dish identifier, a cooking method recognizer, and a multi-label ingredient detector, which share a few low-level layers in the deep network architecture. The proposed framework achieves higher accuracy than traditional methods based on handcrafted features, and the cooking method recognizer and ingredient detector can be applied to dishes not included in the training dataset to provide reference information for users.
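The shared-layer, three-head arrangement described above can be illustrated with a minimal NumPy sketch. It is not the authors' network: a single dense ReLU layer stands in for the shared low-level convolutional layers, and the feature size, number of cooking methods, number of ingredients, and equal loss weights are all hypothetical values chosen for illustration. Only the 360 dish classes come from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

N_DISH = 360         # number of dish classes stated in the abstract
N_METHOD = 8         # hypothetical number of cooking methods
N_INGREDIENT = 50    # hypothetical number of ingredient labels
FEAT_DIM = 128       # hypothetical input feature size
SHARED_DIM = 64      # hypothetical width of the shared layer

# One shared layer (stand-in for shared low-level conv layers) and three heads.
W_shared = rng.normal(0.0, 0.1, (FEAT_DIM, SHARED_DIM))
W_dish = rng.normal(0.0, 0.1, (SHARED_DIM, N_DISH))
W_method = rng.normal(0.0, 0.1, (SHARED_DIM, N_METHOD))
W_ingr = rng.normal(0.0, 0.1, (SHARED_DIM, N_INGREDIENT))


def softmax(z):
    """Row-wise softmax with max subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def sigmoid(z):
    """Element-wise logistic function for independent multi-label outputs."""
    return 1.0 / (1.0 + np.exp(-z))


def forward(x):
    """Return (dish probs, cooking-method probs, per-ingredient probs)."""
    h = np.maximum(0.0, x @ W_shared)  # shared representation (ReLU)
    return softmax(h @ W_dish), softmax(h @ W_method), sigmoid(h @ W_ingr)


def multitask_loss(x, dish_y, method_y, ingr_y, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of two cross-entropy losses and one multi-label loss."""
    p_dish, p_method, p_ingr = forward(x)
    rows = np.arange(x.shape[0])
    eps = 1e-12  # guards against log(0)
    l_dish = -np.log(p_dish[rows, dish_y] + eps).mean()
    l_method = -np.log(p_method[rows, method_y] + eps).mean()
    l_ingr = -(ingr_y * np.log(p_ingr + eps)
               + (1.0 - ingr_y) * np.log(1.0 - p_ingr + eps)).mean()
    return weights[0] * l_dish + weights[1] * l_method + weights[2] * l_ingr


# Random stand-in data: 4 feature vectors with labels for all three tasks.
x = rng.normal(size=(4, FEAT_DIM))
dish_y = rng.integers(0, N_DISH, size=4)
method_y = rng.integers(0, N_METHOD, size=4)
ingr_y = rng.integers(0, 2, size=(4, N_INGREDIENT)).astype(float)

loss = multitask_loss(x, dish_y, method_y, ingr_y)
print(f"combined multi-task loss: {loss:.4f}")
```

The dish and cooking-method heads use softmax because each image has exactly one label per task, while the ingredient head uses independent sigmoids because several ingredients can be present at once. Because the ingredient and cooking-method heads read only the shared representation, they can in principle be evaluated on an image whose dish class was never seen in training.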
Additional information
Special Section of CVM 2016
This work was supported by the National High Technology Research and Development 863 Program of China under Grant No. 2013AA013903, the National Natural Science Foundation of China under Grant No. 61373069, the Research Grant of Beijing Higher Institution Engineering Research Center, and the Tsinghua University Initiative Scientific Research Program.
About this article
Cite this article
Zhang, XJ., Lu, YF. & Zhang, SH. Multi-Task Learning for Food Identification and Analysis with Deep Convolutional Neural Networks. J. Comput. Sci. Technol. 31, 489–500 (2016). https://doi.org/10.1007/s11390-016-1642-6
DOI: https://doi.org/10.1007/s11390-016-1642-6