Abstract
Motivated by applications in nutritional epidemiology and food journaling, computing researchers have proposed numerous techniques for automating dietary monitoring over the years. Although progress has been made, a truly practical system that can automatically recognize what people eat in real-world settings remains elusive. Eating detection is a foundational element of automated dietary monitoring (ADM): automatically recognizing when a person is eating is a prerequisite to identifying what and how much is being consumed. Eating detection can also serve as the basis for new dietary self-monitoring practices such as semi-automated food journaling. This chapter discusses the problem of automated eating detection and presents a variety of practical techniques for detecting eating activities in real-world settings. These techniques center on three sensing modalities: first-person images taken with wearable cameras, ambient sounds, and on-body inertial sensors [34–37]. The chapter begins with an analysis of how first-person images reflecting everyday experiences can be used to identify eating moments with two approaches: human computation and convolutional neural networks. Next, we present an analysis showing how certain sounds associated with eating can be recognized and used to infer eating activities. Finally, we introduce a method for detecting eating moments with on-body inertial sensors placed on the wrist.
References
Amft, O. and Tröster, G., “On-Body Sensing Solutions for Automatic Dietary Monitoring,” IEEE Pervasive Computing, vol. 8, Apr. 2009.
Bäckström, T. and Magi, C., “Properties of line spectrum pair polynomials—A review,” Signal Processing, vol. 86, pp. 3286–3298, Nov. 2006.
Boushey, C. J., Coulston, A. M., Rock, C. L., and Monsen, E., Nutrition in the Prevention and Treatment of Disease. Academic Press, 2001.
Castro, D., Hickson, S., Bettadapura, V., Thomaz, E., Abowd, G. D., Christensen, H., and Essa, I., “Predicting daily activities from egocentric images using deep learning,” in Proceedings of the 2015 ACM International Symposium on Wearable Computers, pp. 75–82, 2015.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L., “Imagenet: A large-scale hierarchical image database,” in CVPR, pp. 248–255, IEEE, 2009.
Ester, M., Kriegel, H.-P., Sander, J., and Xu, X., “A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise,” KDD, pp. 226–231, 1996.
Farb, P. and Armelagos, G., Consuming passions, the anthropology of eating. Houghton Mifflin, 1980.
Fouse, A., Weibel, N., Hutchins, E., and Hollan, J. D., “ChronoViz: a system for supporting navigation of time-coded data,” CHI Extended Abstracts, pp. 299–304, 2011.
Gillet, O. and Richard, G., “Automatic transcription of drum loops,” in 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. iv–269–iv–272, IEEE, 2004.
Go, V. L. W., Nguyen, C. T. H., Harris, D. M., and Lee, W.-N. P., “Nutrient-gene interaction: metabolic genotype-phenotype relationship,” The Journal of Nutrition, vol. 135, pp. 3016S–3020S, Dec. 2005.
Gowdy, J., Limited wants, unlimited means: A reader on hunter-gatherer economics and the environment. Island Press, 1997.
Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. R., “Improving neural networks by preventing co-adaptation of feature detectors,” CoRR, 2012.
Hoyle, R., Templeman, R., Armes, S., Anthony, D., Crandall, D., and Kapadia, A., “Privacy behaviors of lifeloggers using wearable cameras,” in Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing, (New York, New York, USA), pp. 571–582, ACM Press, 2014.
Jacobs, D. R., “Challenges in research in nutritional epidemiology,” Nutritional Health, pp. 29–42, 2012.
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., and Darrell, T., “Caffe: Convolutional architecture for fast feature embedding,” in ACM Multimedia, pp. 675–678, 2014.
Kahneman, D., Krueger, A. B., Schkade, D. A., and Schwarz, N., “A Survey Method for Characterizing Daily Life Experience: The Day Reconstruction Method,” Science, 2004.
Kelly, P., Marshall, S. J., Badland, H., Kerr, J., Oliver, M., Doherty, A. R., and Foster, C., “An ethical framework for automated, wearable cameras in health behavior research,” American Journal of Preventive Medicine, vol. 44, pp. 314–319, Mar. 2013.
Kleitman, N., Sleep and wakefulness. Chicago: The University of Chicago Press, July 1963.
Krizhevsky, A., Sutskever, I., and Hinton, G. E., “Imagenet classification with deep convolutional neural networks,” in NIPS, pp. 1097–1105, 2012.
LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P., “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
Liu, J., Johns, E., Atallah, L., Pettitt, C., Lo, B., Frost, G., and Yang, G.-Z., “An Intelligent Food-Intake Monitoring System Using Wearable Sensors,” in Wearable and Implantable Body Sensor Networks (BSN), 2012 Ninth International Conference on, pp. 154–160, IEEE Computer Society, 2012.
Lu, H., Pan, W., Lane, N., Choudhury, T., and Campbell, A., “SoundSense: scalable sound sensing for people-centric applications on mobile phones,” Proceedings of the 7th international conference on Mobile systems, applications, and services, pp. 165–178, 2009.
Makhoul, J., “Linear prediction: A tutorial review,” Proceedings of the IEEE, vol. 63, pp. 561–580, Apr. 1975.
Mathieu, B., Essid, S., Fillon, T., Prado, J., and Richard, G., “YAAFE, an Easy to Use and Efficient Audio Feature Extraction Software,” in Proceedings of the 11th ISMIR Conference, Sept. 2010.
Michels, K. B., “A renaissance for measurement error,” International Journal of Epidemiology, vol. 30, pp. 421–422, June 2001.
Mintz, S. W. and Du Bois, C. M., “The anthropology of food and eating,” Annual review of anthropology, pp. 99–119, 2002.
Moore, B. C. J., Glasberg, B. R., and Baer, T., “A Model for the Prediction of Thresholds, Loudness, and Partial Loudness,” Journal of the Audio Engineering Society, vol. 45, no. 4, pp. 224–240, 1997.
Nguyen, D. H., Marcu, G., Hayes, G. R., Truong, K. N., Scott, J., Langheinrich, M., and Roduner, C., “Encountering SenseCam: personal recording technologies in everyday life,” pp. 165–174, 2009.
Rossi, M., Feese, S., Amft, O., Braune, N., Martis, S., and Tröster, G., “AmbientSense: A real-time ambient sound recognition system for smartphones,” in Pervasive Computing and Communications Workshops (PERCOM Workshops), 2013 IEEE International Conference on, pp. 230–235, 2013.
Russell, B. C., Torralba, A., Murphy, K. P., and Freeman, W. T., “LabelMe: A Database and Web-Based Tool for Image Annotation,” International Journal of Computer Vision, vol. 77, May 2008.
Scheirer, E. and Slaney, M., “Construction and evaluation of a robust multifeature speech/music discriminator,” IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 2, pp. 1331–1334, 1997.
Schussler, H., “A stability theorem for discrete systems,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 24, pp. 87–89, Feb. 1976.
Sorokin, A. and Forsyth, D., “Utility data annotation with Amazon Mechanical Turk,” in IEEE CVPR Workshops, pp. 1–8, June 2008.
Thomaz, E., Abowd, G., and Essa, I., “A Practical Approach for Recognizing Eating Moments with Wrist-Mounted Inertial Sensing,” in UbiComp ’15: Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing, pp. 1–12, July 2015.
Thomaz, E., Parnami, A., Bidwell, J., Essa, I. A., and Abowd, G. D., “Technological approaches for addressing privacy concerns when recognizing eating behaviors with wearable cameras,” UbiComp, pp. 739–748, 2013.
Thomaz, E., Parnami, A., Essa, I. A., and Abowd, G. D., “Feasibility of identifying eating moments from first-person images leveraging human computation,” SenseCam, pp. 26–33, 2013.
Thomaz, E., Zhang, C., Essa, I., and Abowd, G. D., “Inferring Meal Eating Activities in Real World Settings from Ambient Sounds,” in Proceedings of the 20th International Conference on Intelligent User Interfaces (IUI), (New York, New York, USA), pp. 427–431, ACM Press, 2015.
von Ahn, L. and Dabbish, L., “Labeling images with a computer game,” in CHI ’04: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, Apr. 2004.
von Ahn, L., Liu, R., and Blum, M., “Peekaboom: a game for locating objects in images,” in CHI ’06: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, Apr. 2006.
Willett, W., Nutritional Epidemiology. Oxford University Press, Oct. 2012.
Wyatt, D., Choudhury, T., and Bilmes, J., “Conversation detection and speaker segmentation in privacy-sensitive situated speech data,” Proceedings of Interspeech, pp. 586–589, 2007.
Yatani, K. and Truong, K. N., “BodyScope: a wearable acoustic sensor for activity recognition,” UbiComp ’12: Proceedings of the 2012 ACM Conference on Ubiquitous Computing, pp. 341–350, 2012.
Zeiler, M. D. and Fergus, R., “Visualizing and understanding convolutional networks,” in ECCV, pp. 818–833, Springer, 2014.
© 2017 Springer International Publishing AG
Cite this chapter
Thomaz, E., Essa, I.A., Abowd, G.D. (2017). Challenges and Opportunities in Automated Detection of Eating Activity. In: Rehg, J., Murphy, S., Kumar, S. (eds) Mobile Health. Springer, Cham. https://doi.org/10.1007/978-3-319-51394-2_9
DOI: https://doi.org/10.1007/978-3-319-51394-2_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51393-5
Online ISBN: 978-3-319-51394-2