DOI: 10.1145/3686215.3686219
Short paper · Open access

Automatic Recognition of Commensal Activities in Co-located and Online settings

Published: 04 November 2024

Abstract

Technological advancement has profoundly changed how people share meals, fostering research interest in new forms of commensality such as tele-dining and eating with artificial companions. Consequently, there is a need for computational methods that recognize commensal activities, that is, actions related to food consumption and the social signals displayed during mealtime. This paper introduces the first dataset consisting of synchronized video recordings of co-located dining dyads, annotated with key social signals such as speaking activity and smiling, as well as food-related activities such as chewing and food intake. Unlike previous studies conducted in remote settings, this work emphasizes the differences between online and co-located setups. A set of machine learning experiments conducted on our dataset and on existing ones reaches a best F-score of 0.82. A cross-dataset analysis between co-located and online datasets further reveals a significant disparity between the two settings: while mixing co-located and online recordings may increase a model's generalizability, the strong differences between the settings highlight the importance of in-person data recordings for accurate recognition.
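
The paper's code is not reproduced here, but the cross-dataset protocol the abstract describes (train a recognizer on one setting, test it on the other, and compare F-scores) can be sketched as follows. This is a minimal illustration only: the `load_dataset` helper, the feature representation, and the classifier are hypothetical placeholders, not the authors' pipeline.

```python
# Hypothetical sketch of the cross-dataset evaluation described in the
# abstract: train on one setting (co-located or online), test on the other.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

def load_dataset(name: str):
    """Placeholder loader: per-frame feature vectors X (e.g., facial action
    units) and binary activity labels y (e.g., 1 = chewing, 0 = other)."""
    rng = np.random.default_rng(0 if name == "co-located" else 1)
    X = rng.normal(size=(1000, 32))
    y = rng.integers(0, 2, size=1000)
    return X, y

settings = ["co-located", "online"]
data = {s: load_dataset(s) for s in settings}

# Train/test over all setting pairs; the mismatched pairs expose the domain
# gap between in-person and video-mediated recordings.
for train_setting in settings:
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(*data[train_setting])
    for test_setting in settings:
        X_te, y_te = data[test_setting]
        score = f1_score(y_te, clf.predict(X_te))
        print(f"train={train_setting:10} test={test_setting:10} F1={score:.2f}")
```

Under this reading, the matched train/test pairs correspond to within-dataset performance (the paper reports a best F-score of 0.82), while the mismatched pairs quantify the co-located/online disparity the abstract highlights.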



      Published In

      ICMI Companion '24: Companion Proceedings of the 26th International Conference on Multimodal Interaction
      November 2024
      252 pages
      ISBN:9798400704635
      DOI:10.1145/3686215
      This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. Activity recognition
      2. co-located
      3. commensality
      4. datasets
      5. in-person
      6. social interactions

      Qualifiers

      • Short-paper
      • Research
      • Refereed limited

      Funding Sources

      • Next Generation EU (NGEU) Programme, National Recovery and Resilience Plan (PNRR), and the Italian Ministry of University and Research

      Conference

      ICMI '24: International Conference on Multimodal Interaction
      November 4–8, 2024
      San Jose, Costa Rica

      Acceptance Rates

      Overall acceptance rate: 453 of 1,080 submissions (42%)

