DOI: 10.1145/3654777.3676350
research-article
Open access

PrISM-Observer: Intervention Agent to Help Users Perform Everyday Procedures Sensed using a Smartwatch

Published: 11 October 2024

Abstract

We routinely perform procedures (such as cooking) that consist of a set of atomic steps. Inadvertently omitting or misordering a single step can have serious consequences, especially for people experiencing cognitive challenges such as dementia. This paper introduces PrISM-Observer, a smartwatch-based, context-aware, real-time intervention system designed to support daily tasks by preventing errors. Unlike traditional systems that require users to seek out information, the agent observes user actions and intervenes proactively. This capability rests on the agent’s ability to continuously update its belief about the user’s behavior in real time through multimodal sensing and to forecast optimal intervention moments and methods. We first validated the step-tracking performance of our framework through evaluations across three datasets of differing complexity. We then implemented a real-time agent system using a smartwatch and conducted a user study in a cooking task scenario. The system generated helpful interventions, and participants gave positive feedback. The general applicability of PrISM-Observer to daily tasks promises broad applications, including support for users who require more involved interventions, such as people with dementia or post-surgical patients.


Cited By

  • (2024) PrISM-Q&A: Step-Aware Voice Assistant on a Smartwatch Enabled by Multimodal Procedure Tracking and Large Language Models. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8:4 (1-26). https://doi.org/10.1145/3699759. Online publication date: 21-Nov-2024
  • (2024) Unified Framework for Procedural Task Assistants powered by Human Activity Recognition. Companion of the 2024 on ACM International Joint Conference on Pervasive and Ubiquitous Computing (513-518). https://doi.org/10.1145/3675094.3678448. Online publication date: 5-Oct-2024


      Published In

      UIST '24: Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology
      October 2024
      2334 pages
      ISBN: 9798400706288
      DOI: 10.1145/3654777
      This work is licensed under a Creative Commons Attribution International 4.0 License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. context-aware intervention
      2. procedure tracking
      3. task assistant

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      UIST '24

      Acceptance Rates

      Overall Acceptance Rate 561 of 2,567 submissions, 22%


      Article Metrics

      • Downloads (last 12 months): 619
      • Downloads (last 6 weeks): 193
      Reflects downloads up to 13 Feb 2025
