Elsevier

Neurocomputing

Volume 100, 16 January 2013, Pages 163-169

Video analysis for identifying human operation difficulties and faucet usability assessment

https://doi.org/10.1016/j.neucom.2011.10.041

Abstract

As the world struggles to cope with a growing elderly population, concerns of how to preserve independence are becoming increasingly acute. A major hurdle to independent living is the inability to use everyday household objects. This work aims to automate the assessment of product usability for the elderly population using the tools of computer vision and machine learning. A novel video analysis technique is presented that performs temporal segmentation of video containing human–product interaction and automatically identifies time segments in which the human has difficulties in operating the product. The method has applications in the automatic assessment of the usability of various product designs via measuring the frequency of operation difficulties. The approach is applied to a case study of water faucet design for the older adult population with dementia. Experiments in the automatic analysis of a large database of real-world recorded videos confirm the effectiveness of the approach in providing valid temporal segmentation (accuracy 88.1%) and in the correct estimation of the relative advantage (or disadvantage) of one design over another in terms of operation difficulties in performing various actions.

Introduction

As the size and proportion of the elderly population is expanding globally [1], health issues associated with old age are becoming more prevalent. The number of people with dementia, for instance, is expected to double in 20 years and the worldwide cost of dementia is currently estimated to be about 1% of the global GDP [2]. Facilitating aging in place is thus of paramount importance as it alleviates healthcare costs while keeping the elderly happy, independent, and socially connected [3]. A critical factor in preserving independence is the ability of older adults to use everyday household objects and perform activities of daily living. Comparative usability analysis can enable the selection of the most appropriate product designs for this target population. Unfortunately, the number of everyday objects is vast and usability studies are expensive, laborious, and often subjective.

The goal of this work is to automate the process of analyzing the usability of a product through application of state-of-the-art computer vision techniques. Automatic assessment will replace (or ease) laborious manual analysis. Water faucets are chosen as a proof-of-concept product, as they are used multiple times daily and the ability of a person to use them is crucial in several self-care activities. Recorded videos of real human subjects, older adults at various stages of Alzheimer's disease, are analyzed as they wash their hands using different faucet types (Fig. 1). The method presented here provides quantitative measures of usability based on automatically estimating factors such as the rate of operation difficulties and the time it takes to complete actions such as turning the water on for each faucet type. While the objective of this work is to develop a fully automatic usability analysis tool, the intermediate steps also provide useful information to designers and caregivers. For instance, automatic temporal segmentation, even without automatic detection of operation difficulties, can enable designers to browse through hundreds of recorded handwashing videos and observe only specific actions of interest, e.g. adjusting the temperature, without having to watch extraneous segments. Such a scheme reduces the analysis and evaluation time by an order of magnitude (see Table 1 in Section 4.2).

The term usability often connotes a combination of different quality attributes including learnability, efficiency, satisfaction, frequency of operation difficulties, and the need for caregiver assistance [4], [5]. This work focuses on quantifying one major factor, namely the frequency of difficulties in completing various actions. The algorithms are therefore designed to compute statistics that reveal the relative advantage (or disadvantage) of one faucet design over another in terms of this factor. Suggestions are also provided for future work to enhance the method in order to enable accurate estimation of action completion times.
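To make the comparison concrete, a minimal sketch of how per-design difficulty rates could be compared is shown below. The counts are invented for illustration only; the paper's actual statistics and trial numbers differ.

```python
# Hypothetical sketch: comparing two faucet designs by the fraction of
# recorded trials containing at least one operation difficulty.
# All counts below are invented for illustration.

def difficulty_rate(trials_with_difficulty, total_trials):
    """Fraction of trials in which an operation difficulty was detected."""
    return trials_with_difficulty / total_trials

# Invented example counts for two designs:
rate_a = difficulty_rate(42, 270)   # design A
rate_b = difficulty_rate(81, 270)   # design B

# A positive difference indicates a relative advantage of design A
# (fewer difficulties) over design B for the same set of actions.
relative_advantage = rate_b - rate_a
print(f"A: {rate_a:.3f}  B: {rate_b:.3f}  diff: {relative_advantage:.3f}")
```

In practice such rates would be computed per action type (e.g. turning the water on vs. adjusting temperature), since a design may help with one action and hinder another.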

Typical difficulties in operating faucets include reaching for the wrong location, performing the wrong action (e.g. pulling instead of rotating), arriving at a wrong outcome (e.g. turning the water hotter instead of colder), or simply not doing anything when an action is required (e.g. continuous and prolonged rinsing and forgetting to turn off the tap). The focus of this work is on detecting physical difficulties, such as reaching for the wrong location or performing the wrong action, which could be identified through video analysis. Higher level cognitive difficulties, e.g. not following verbal requests from the caregiver or making the water hotter instead of colder, remain the subject of future work as they require estimation of the subjects' intent.

The major contribution of this work is the development and application of state-of-the-art computer vision methods in order to potentially make large-scale empirical product usability assessment feasible and promote aging in place. Although the empirical analysis focuses on a specific product, special care was taken to ensure the generality of the approach and algorithms so that they can be applied to a large array of everyday products. Two auxiliary contributions address major challenges that arise in a wide range of human activity monitoring problems. The first is dealing with the inherent ambiguity in ground-truth action labels of a video sequence near transition boundaries. This is addressed by modifying the loss function of an HM-SVM for temporal segmentation (Section 3.3). The second relates to identifying human operation difficulties, which is addressed by separating that task from temporal segmentation. This separation reduces the problem of detecting operation errors to a classification problem applied to individual video segments (Section 3.4). The benefits and trade-offs of this separation are discussed in Section 3.5. The specific choice of features is described at a level of detail sufficient for re-implementation (Section 3.2). The choice of audio and video features used here is based on state-of-the-art techniques [6], [7], [8] and the authors' previous work in this area [9].
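The exact loss modification is given in Section 3.3; one plausible way to discount label ambiguity near transitions, sketched here purely as an illustrative assumption, is a Hamming-style loss that ignores mismatches within a small tolerance window around each ground-truth boundary:

```python
# Hypothetical sketch of a boundary-tolerant Hamming loss for temporal
# segmentation. This is NOT the paper's HM-SVM loss; it only illustrates
# the idea of discounting frame-label errors near action transitions.

def boundary_tolerant_loss(y_true, y_pred, tolerance=2):
    """Count per-frame label mismatches, ignoring frames that lie within
    `tolerance` frames of a ground-truth action transition."""
    n = len(y_true)
    near_boundary = [False] * n
    # Mark frames close to any transition in the ground-truth labels.
    for t in range(1, n):
        if y_true[t] != y_true[t - 1]:
            for k in range(max(0, t - tolerance), min(n, t + tolerance)):
                near_boundary[k] = True
    return sum(
        1 for t in range(n)
        if y_pred[t] != y_true[t] and not near_boundary[t]
    )

# A prediction whose transition is one frame late incurs no loss here:
truth = [0, 0, 0, 1, 1, 1]
late  = [0, 0, 0, 0, 1, 1]
print(boundary_tolerant_loss(truth, late))  # 0
```

The effect is that a segmenter is not penalized for placing an action boundary a frame or two away from the (inherently ambiguous) annotated boundary, while mismatches far from any transition still count in full.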

Section snippets

Previous work

To the authors' best knowledge, prior to this work, computer vision has not been applied to product usability assessment. A sample of the work applied in assistive technologies, particularly as related to handwashing or monitoring older adults, is briefly reviewed here. Mihailidis et al. [10] use computer vision and machine learning techniques to monitor the handwashing process in real-time in order to provide reminder prompts to people with dementia when necessary. Decision theoretic models,

Overview

A key element of the approach is its two-phase structure: temporal segmentation precedes the identification of operation difficulties. After partitioning an input video in time, segments containing the actions of interest go through further processing in order to identify those with interaction difficulties.
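The two-phase structure can be sketched as follows. The segmenter and the per-segment classifier below are trivial stand-ins (the paper uses an HM-SVM and a learned segment classifier); they only show how the phases compose.

```python
# Schematic of the two-phase approach: temporal segmentation first,
# then difficulty detection on each segment of interest. Both components
# here are placeholder stand-ins, not the paper's actual models.

def segment(frame_labels):
    """Group a sequence of per-frame action labels into
    (label, start, end) segments with end exclusive."""
    segments, start = [], 0
    for t in range(1, len(frame_labels) + 1):
        if t == len(frame_labels) or frame_labels[t] != frame_labels[start]:
            segments.append((frame_labels[start], start, t))
            start = t
    return segments

def has_difficulty(segment_frames):
    """Stand-in classifier: flag unusually long segments as difficult
    (e.g. a prolonged attempt to turn the water on)."""
    return len(segment_frames) > 4

frames = ["idle", "turn_on", "turn_on", "turn_on", "turn_on", "turn_on", "rinse"]
flagged = [
    (label, start, end)
    for (label, start, end) in segment(frames)
    if label == "turn_on" and has_difficulty(frames[start:end])
]
print(flagged)  # [('turn_on', 1, 6)]
```

Because difficulty detection operates on whole segments rather than raw frames, it reduces to an ordinary classification problem per segment, which is the trade-off discussed in Section 3.5.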

Feature extraction

Automatic video analysis typically involves extracting a set of features from a video sequence and applying a classifier, or multiple classifiers, to categorize the information
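The extract-then-classify pattern described above can be illustrated generically. The features here (mean and variance of a one-dimensional per-frame motion signal over a sliding window) and the threshold classifier are placeholders; the paper's actual audio and video descriptors are specified in Section 3.2.

```python
# Generic illustration of extracting windowed features from a per-frame
# signal and applying a classifier to each feature vector. The signal,
# features, and classifier are all placeholders for illustration.

def window_features(signal, width=3):
    """Mean/variance features over a sliding window of frames."""
    feats = []
    for t in range(len(signal) - width + 1):
        w = signal[t:t + width]
        mean = sum(w) / width
        var = sum((x - mean) ** 2 for x in w) / width
        feats.append((mean, var))
    return feats

def classify(feat, threshold=0.5):
    """Placeholder classifier: label a window 'active' when its mean
    motion exceeds the threshold, otherwise 'still'."""
    return "active" if feat[0] > threshold else "still"

motion = [0.0, 0.1, 0.0, 0.9, 1.0, 0.8]   # invented per-frame motion values
labels = [classify(f) for f in window_features(motion)]
print(labels)  # ['still', 'still', 'active', 'active']
```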

Setup

To evaluate the proposed algorithms, experiments were performed on a large set of handwashing videos from the Handwashing Dataset [4]. The entire data set consists of 27 older adults with Alzheimer's disease washing their hands using five different faucet types, with each human subject using each faucet 8–10 times over the course of a 4 month period, amounting to a total of over 1350 handwashing videos. The videos are from the view point of a camera installed on a bathroom ceiling overlooking

Conclusions and future work

A novel activity monitoring approach was presented which parses videos of human subjects operating water faucets and identifies time segments in which the subject has difficulties in operating the faucet. Experiments with real-world handwashing videos of older adults with dementia confirmed the effectiveness of the approach in estimating the rate of operation difficulties for comparative analysis. Future work involves enhancing the temporal segmentation via an iterative refinement process in

Acknowledgments

This work was partially supported through a grant from the National Institute on Disability and Rehabilitation Research (NIDRR), through the RERC on Universal Design and the Built Environment.


References (25)

  • Population Aging and Development, Report by the Population Division, Department of Economic and Social Affairs, United...
  • World Alzheimer Report, Report by the Alzheimer's Disease International,...
  • A. Mihailidis et al., The use of computer vision in an intelligent environment to support aging-in-place, safety, and independence in the home, IEEE Trans. Inf. Technol. Biomed. (2004)
  • A. Mihailidis, J. Boger, K. Fenton, T. Craig, Z. Li, The impact of familiarity on the usability of everyday products...
  • J. Nielsen, Usability Engineering (1993)
  • N. Dalal, B. Triggs, Histograms of oriented gradients for human detection, in: Computer Vision and Pattern Recognition,...
  • N. Dalal, B. Triggs, C. Schmid, Human detection using oriented histograms of flow and appearance, in: European...
  • I. Laptev, On space–time interest points, Int. J. Comput. Vision (2005)
  • J. Snoek, B. Taati, Y. Eskin, A. Mihailidis, Automatic segmentation of video to aid the study of faucet usability for...
  • A. Mihailidis et al., The COACH prompting system to assist older adults with dementia through handwashing: an efficacy study, BMC Geriatr. (2008)
  • C. Peters, S. Wachsmuth, J. Hoey, Learning to recognise behaviours of persons with dementia using multiple cues in an...
  • F. Nater, H. Grabner, L.V. Gool, Exploiting simple hierarchies for unsupervised human behavior analysis, in: Computer...

    Babak Taati is a post-doctoral research associate in the Intelligent Assistive Technology and Systems Lab at the University of Toronto and the Toronto Rehabilitation Institute. Previously, he worked as lead computer vision scientist at Feeling Software. He completed his Ph.D. in computer vision systems at Queen's University (2009) and in collaboration with MDA Space Missions. His research interests include the application of computer vision in assistive and rehabilitation technologies and in health and safety monitoring.

    Jasper Snoek is a Ph.D. candidate in the Department of Computer Science at the University of Toronto. His research interests are primarily in machine learning, computer vision and their applications to assistive technology.

    Alex Mihailidis, Ph.D., P.Eng., is the Barbara G. Stymiest Research Chair in Rehabilitation Technology at the University of Toronto and Toronto Rehab Institute. He is an Associate Professor in the Department of Occupational Science and Occupational Therapy and in the Institute of Biomaterials and Biomedical Engineering, with a cross appointment in the Department of Computer Science. He has been conducting research in the field of pervasive computing and intelligent systems in health for the past 13 years, having published over 100 journal papers, conference papers, and abstracts in this field.
