Abstract
With the widespread use of smartphones, people are taking more and more images of their food. These images can be used to automatically recognize the foods present and can potentially provide an indication of eating habits. Traditional methods compute a number of user-derived features from an image and then use a classifier to assign food images to different food categories. Pretrained deep neural network architectures can instead be used to extract features from images automatically for different classification tasks. This work proposes the use of convolutional neural networks (CNNs) for feature extraction from food images. A linear support vector machine classifier was trained using a 3-fold cross-validation scheme on the publicly available Pittsburgh fast-food image dataset. Features from 3 different fully connected layers of the CNN were used for classification. Two classification tasks were defined: the first was to classify images into 61 categories, and the second was to classify images into 7 categories. The best results were obtained using 4096 features, with accuracies of 70.13% and 94.01% for the 61-class and 7-class tasks, respectively. This is an improvement over previously reported results on the same dataset.
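The classification stage described in the abstract (a linear support vector machine evaluated with 3-fold cross-validation on fixed-length CNN feature vectors) can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy are available and uses randomly generated 4096-dimensional vectors as a stand-in for the fully connected layer activations; the synthetic data, the helper names, the feature scaling step, the stratified folds, and the value of `C` are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC


def make_placeholder_features(n_classes=7, per_class=30, dim=4096, seed=0):
    """Generate synthetic vectors standing in for CNN activations.

    Hypothetical data: in practice each row would be the activation of a
    fully connected layer of a pretrained CNN for one food image.
    """
    rng = np.random.default_rng(seed)
    centers = rng.normal(0.0, 1.0, size=(n_classes, dim))
    X = np.vstack(
        [c + rng.normal(0.0, 2.0, size=(per_class, dim)) for c in centers]
    )
    y = np.repeat(np.arange(n_classes), per_class)
    return X, y


def cross_validated_accuracy(X, y, n_folds=3, seed=0):
    """Return the per-fold accuracy of a linear SVM on pre-extracted features."""
    # Scaling and C=1.0 are assumptions; the abstract does not specify them.
    clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0, max_iter=5000))
    # Stratified folds keep the class proportions similar in every fold.
    cv = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    return cross_val_score(clf, X, y, cv=cv, scoring="accuracy")


X, y = make_placeholder_features()
scores = cross_validated_accuracy(X, y)
print("per-fold accuracy:", scores)
print("mean accuracy:", scores.mean())
```

To use real data, `make_placeholder_features` would be replaced by code that runs each image through a pretrained CNN and stores the activations of the chosen fully connected layer; the cross-validation function itself would not need to change.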
Acknowledgement
Research reported in this publication was supported by the National Institute of Diabetes and Digestive and Kidney Diseases (grant number: R01DK100796). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Farooq, M., Sazonov, E. (2017). Feature Extraction Using Deep Learning for Food Type Recognition. In: Rojas, I., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2017. Lecture Notes in Computer Science, vol 10208. Springer, Cham. https://doi.org/10.1007/978-3-319-56148-6_41
DOI: https://doi.org/10.1007/978-3-319-56148-6_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56147-9
Online ISBN: 978-3-319-56148-6
eBook Packages: Computer Science, Computer Science (R0)