Recognizing complex instrumental activities of daily living using scene information and fuzzy logic

https://doi.org/10.1016/j.cviu.2015.04.005

Highlights

  • Provides a unique and robust solution to the extremely challenging task of ADL modeling.

  • Incorporates scene information to build ADL models.

  • In the absence of manually labeled surfaces, can still generate high-level activity state summaries.

  • Provides a dataset, described in this manuscript, for the computer vision community.

Abstract

We describe a novel technique to combine motion data with scene information to capture activity characteristics of older adults using a single Microsoft Kinect depth sensor. Specifically, we describe a method to learn activities of daily living (ADLs) and instrumental ADLs (IADLs) in order to study the behavior patterns of older adults to detect health changes. To learn the ADLs, we incorporate scene information to provide contextual information to build our activity model. The strength of our algorithm lies in its generalizability to model different ADLs while adding more information to the model as we instantiate ADLs from learned activity states. We validate our results in a controlled environment and compare them with another widely accepted classifier, the hidden Markov model (HMM), and its variations. We also test our system on depth data collected in a dynamic, unstructured environment at TigerPlace, an independent living facility for older adults. An in-home activity monitoring system would benefit from our algorithm to alert healthcare providers of significant temporal changes in ADL behavior patterns of frail older adults for fall risk, cognitive impairment, and other health changes.

Introduction

Activities of daily living (ADLs) are a set of activities that are required for self-care, such as walking, eating, dressing, and bathing. They are used to assess the functional capacity of older adults [11]. Instrumental ADLs (IADLs) are a subset of the functional tasks that older adults perform to support their independent lifestyles [9]. Examples of IADLs are housekeeping, cleaning, and cooking. These activities, when measured over an extended period of time, can show deviations in health for older adults. Zisberg et al. [5] developed a new instrument called SOAR to evaluate routine patterns in the lives of older adults. Subjects from four retirement communities reported detailed information regarding ADLs like eating, meal preparation, watching television, bathing, etc. The study indicated that any deviation in the routine of frail older adults could correlate with a change in health; this finding provides the motivation for the work described in this paper. We describe the premise behind our study using the following case study revolving around the IADL cleaning the table. Suppose a healthy older adult living independently performs the IADL cleaning the table once every day at a certain time. However, due to some health-related reason, she is unable to do so several days in a row. Once detected, this deviation from her normal routine could be a strong indicator of a health change, which could help enable early interventions. The goal of this study is to build a model that learns these ADL and IADL patterns for detection; changes in daily (or weekly, or monthly) behavior patterns can then be used to detect early health changes.

The contributions of this paper are the following. We present a unique, vision-based method for recognizing components of ADLs and IADLs by combining their interaction with object surfaces with a set of linguistic fuzzy rules with heuristic parameters to model their activities. Specifically, in this paper we use the activities walk, sit, clean object, clutter object, move near object, rearrange object, and move object to describe our approach. We use the IADLs make bed and eat to describe the importance of combining scene information with moving object features to detect complex activities that are difficult to detect using only the foreground information or only the scene features. These activities further reinforce the importance of ontologies to provide context for each ADL or IADL that can provide the baseline for activity detection and help eliminate false alarms using contextual information. The results using our proposed algorithm are discussed and compared with another popular activity modeling algorithm, the hidden Markov model (HMM) and its variation, the details of which are provided in Section 8. We further test our method on data collected in an apartment at TigerPlace, an independent living facility for older adults. The data comprise depth information from an older resident (age 88, without any ambulatory needs such as a walker) as he goes through his daily routine in the apartment. We conclude with the discussion of the future steps for the ADL activity modeling framework. The next section reviews some of the related work in this field using vision and non-vision based sensors.

Section snippets

Background

Studies described in [4], [5] indicate the importance of longitudinal analysis of the daily routine of older adults to study anomalies or deviations in their regular patterns in an automated, non-intrusive manner. In order to detect these deviations, the activities need to be recognized and ordered in a methodical way for day-to-day behavior comparison. One approach is based on ontological activity modeling. This is described in more detail in the next section.

In related activity modeling work

Ontological framework

The idea for representing ADLs using an ontology is not new. In [6], Chen et al. proposed an ontological method to recognize ADLs such as housework, managing money, taking medicine, and using the phone. Theoretical foundations were set up to fuse information from different sensors (contact sensors, motion sensors, tilt sensors and pressure sensors), and then build an ontology of ADLs. Data from all the sensors were aggregated to describe the ADL occurring at a certain time point. Experiments

Depth video segmentation

Foreground is extracted from the raw depth images of a single Microsoft Kinect sensor using a standard background subtraction algorithm. The background is learned using the mixture-of-Gaussians approach; any depth value outside the learned background range is recognized as a foreground pixel [2]. We utilize a dynamic background update algorithm to account for the constant changes in the environment in real-world settings. The ground plane in the Kinect's field of view is extracted as in [2]. Ground points are
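The segmentation step above can be sketched as follows. This is a minimal per-pixel Gaussian background model for depth frames, in the spirit of the mixture-of-Gaussians approach cited; using a single Gaussian per pixel, and the learning rate, threshold, and variance floor below, are simplifying assumptions rather than the paper's exact implementation.

```python
import numpy as np

class DepthBackgroundModel:
    def __init__(self, shape, learning_rate=0.05, k=2.5):
        self.mean = np.zeros(shape)
        self.var = np.full(shape, 100.0)  # initial variance (squared depth units)
        self.alpha = learning_rate
        self.k = k                        # std-dev multiple defining the background range
        self.initialized = False

    def apply(self, depth):
        depth = np.asarray(depth, dtype=np.float64)
        if not self.initialized:
            self.mean = depth.copy()      # bootstrap the model from the first frame
            self.initialized = True
            return np.zeros(depth.shape, dtype=bool)
        # A pixel whose depth falls outside the learned background range
        # is recognized as foreground.
        fg = np.abs(depth - self.mean) > self.k * np.sqrt(self.var)
        # Update the model only at background pixels, so a person who
        # pauses briefly is not absorbed into the background at once;
        # this is the "dynamic background update" idea in miniature.
        bg = ~fg
        d = depth - self.mean
        self.mean[bg] += self.alpha * d[bg]
        self.var[bg] += self.alpha * (d[bg] ** 2 - self.var[bg])
        self.var = np.maximum(self.var, 25.0)  # noise floor keeps the threshold stable
        return fg
```

Feeding frames in order yields a boolean foreground mask per frame; Kinect dropout pixels (zero depth) would be masked out before thresholding in practice.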

Scene understanding

This section describes our method to obtain surface information from the scene. Prior to feature extraction, we employ a region-filling operation, described in Section 5.1, to remove noisy depth pixels from the image.
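A region-filling step of this kind can be sketched as below: zero values mark Kinect dropout pixels, which are replaced with the median of their valid neighbors. The window size and pass count are illustrative assumptions, not the paper's actual procedure.

```python
import numpy as np

def fill_depth_holes(depth: np.ndarray, passes: int = 3) -> np.ndarray:
    """Replace zero-depth (dropout) pixels with the median of valid neighbors."""
    filled = depth.astype(np.float64).copy()
    for _ in range(passes):
        holes = np.argwhere(filled == 0)
        if holes.size == 0:
            break  # every dropout pixel has been filled
        for r, c in holes:
            # 3x3 neighborhood; numpy slicing clips safely at image borders.
            window = filled[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
            valid = window[window > 0]
            if valid.size:
                filled[r, c] = np.median(valid)
    return filled
```

Multiple passes let fills propagate inward from the rim of larger holes; single isolated dropouts are repaired in the first pass.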

Features for the fuzzy rule based system

In this section, we describe the features we input to our hierarchical system of fuzzy inference for activity reasoning. Once the areas of interest are extracted, features are computed for activity recognition. For a given sequence, these features are calculated only if a moving object, hereafter labeled as the Assumed Person (AP) to distinguish it from scene objects, is detected. This helps to speed up processing so that only sequences with noticeable movement are further considered for
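As an illustration of per-frame features one might compute for the Assumed Person (AP), the sketch below derives the AP's centroid, apparent height, mean distance from the sensor, and its overlap with a labeled surface mask. These are hypothetical stand-ins; the paper's actual feature set is not reproduced here.

```python
import numpy as np

def ap_features(fg_mask: np.ndarray, depth: np.ndarray, surface_mask: np.ndarray):
    """Compute illustrative features for the Assumed Person (AP) silhouette."""
    ys, xs = np.nonzero(fg_mask)
    if ys.size == 0:
        return None  # no moving object detected: skip this frame entirely
    centroid = (ys.mean(), xs.mean())
    height_px = ys.max() - ys.min() + 1      # apparent height in pixels
    mean_depth = depth[fg_mask].mean()       # average distance from the sensor
    # Fraction of the AP silhouette lying over the labeled surface,
    # e.g. a bed or table region from the scene model.
    overlap = np.logical_and(fg_mask, surface_mask).sum() / ys.size
    return {"centroid": centroid, "height_px": height_px,
            "mean_depth": mean_depth, "overlap": overlap}
```

Returning `None` when no foreground is present mirrors the speed-up described above: frames without noticeable movement are dropped before any further reasoning.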

The fuzzy inference system

The features described above are input to the three-layered FIS for automated reasoning. Our approach to monitoring human activity is based on fuzzy set theory [28], which is an extension of classical set theory. One of the best-known branches of fuzzy set theory is fuzzy logic [29]. Fuzzy logic is a powerful automated reasoning framework comprising an inference system that operates on a set of rules structured in an IF-THEN format (for example, "IF X is A, THEN Y is B"). The IF part of
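A single rule of this kind can be sketched in pure Python. The membership functions, linguistic terms, and the one rule below are illustrative assumptions; the paper's three-layer FIS uses its own heuristic parameters and rule base.

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def rule_sitting(height_ratio, overlap_with_chair):
    """IF height IS low AND overlap IS high THEN activity IS sit."""
    height_is_low = tri(height_ratio, 0.0, 0.4, 0.7)
    overlap_is_high = tri(overlap_with_chair, 0.3, 1.0, 1.7)
    # The minimum t-norm implements the fuzzy AND; the result is a
    # graded confidence in [0, 1] rather than a hard classification.
    return min(height_is_low, overlap_is_high)
```

A lowered silhouette largely over the chair surface fires the rule strongly, while a full-height silhouette anywhere yields zero confidence; stacking many such rules and aggregating their confidences gives the layered inference described above.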

Results

To test the proposed recognition methods, we use two data sets and compare results with the HMM. The first dataset was recorded in a controlled environment with subjects performing specified activities. The second is a sample of data collected in an apartment at TigerPlace with an older resident performing activities as a part of his normal routine. There are several available datasets with RGB-D data of different activities. Specifically for ADLs, the Cornell Activity Datasets: CAD 60 and CAD

Conclusions

We demonstrate a flexible framework for detecting ADLs in an in-home environment using depth data from the Kinect sensor. Depth data provide the added advantage of unobtrusive monitoring, since they perform equally well under different lighting conditions. Silhouette features from the depth data as well as scene features are extracted and input to a fuzzy inference system, and activity states of the individuals are determined using fuzzy confidence measures. The resulting fuzzy rule

Future work

We highlight the importance of our algorithm's generalized framework for activity modeling. One question that needs answering is how this is useful in a dynamic, unstructured environment where some unlabeled surfaces are present. Since our algorithm updates the surfaces in the field of view on a daily basis, as well as when there is a detected change in any object surface, it can identify new surfaces that are yet to be labeled. Our algorithm then identifies the occurring activity as: There

Acknowledgments

This work was supported in part by the Agency for Healthcare Research and Quality under grant R01-HS018477 and NSF grant CNS-0931607. The authors would like to thank members of the Eldercare and Rehabilitation Technology team for their support.

References (45)

  • F. Latfi et al.

    Ontology-based management of the TeleHealth smart home, dedicated to elderly in loss of cognitive autonomy

  • M.D. Rodríguez et al.

    CARe: an ontology for representing context of activity–aware healthcare environments

  • C. Graf

    The Lawton instrumental Activities of Daily Living (IADL) scale.

    Medsurg. Nurs.

    (2009)
  • M. Shelkey et al.

    Katz index of independence in activities of daily living (ADL)

    Gerontologist

    (1998)
  • T. Banerjee et al.

    Resident identification using Kinect depth image data and fuzzy clustering techniques

  • S. Tang et al.

    Histogram of oriented normal vectors for object recognition with a depth sensor

  • C. Liu et al.

    SIFT flow: dense correspondence across different scenes and its applications

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2011)
  • B. Oehler et al.

    Efficient multi-resolution plane segmentation of 3D point clouds

  • T. Banerjee, J.M. Keller, M. Skubic, Detecting foreground disambiguation of depth images using fuzzy logic, in: 2013...
  • C. Sutton et al.

    An introduction to conditional random fields for relational learning

    Introduction to Statistical Relational Learning

    (2006)
  • K. Murphy

    The Bayes net toolbox for MATLAB

    Comput. Sci. Stat.

    (2001)
  • L. Rabiner

    A tutorial on hidden Markov models and selected applications in speech recognition

    Proc. IEEE

    (1989)
Cited by (29)

    • Human-centered approaches that integrate sensor technology across the lifespan: Opportunities and challenges

      2020, Nursing Outlook
      Citation Excerpt :

      The current sensors (see Figure 4) have been effective in capturing health changes which include a depth sensor for capturing walking gait and detecting falls (Stone & Skubic, 2013; Stone & Skubic, 2015), a bed sensor for capturing heart rate, respiration rate, restlessness in bed, and general sleep patterns (Rosales, Bo-Yu, Skubic, & Ho, 2017); and passive infrared motion sensors for capturing room specific movement (e.g., bathroom activity) and overall activity patterns (e.g., daily sedentary vs. active patterns) (Wang, Skubic, & Zhu, 2012). The system includes numerous pattern recognition and machine learning algorithms (Banerjee et al., 2017; Banerjee, Keller, Popescu, & Skubic M, 2015; Banerjee, Keller, Skubic & Stone, 2014; Banerjee, Skubic, Keller, & Abbott, 2014; Jiao et al., 2018; Popescu and Mahnot, 2012; Stone, Skubic, Rantz, Abbott & Miller, et al., 2015; Su et al., 2019; Wallace, Abbott, Gibson-Horn, Rantz, & Skubic, 2017). Here, the focus will include the overall development and deployment using a HCA, including the challenges faced as the system was adapted and tested in new settings.

    • PRAXIS: Towards automatic cognitive assessment using gesture recognition

      2018, Expert Systems with Applications
      Citation Excerpt :

      Therefore, automatic solutions to address these problems by providing a standardized test can be considered as a significant contribution in the field. To capture changes in elderlies' behavioral pattern and to classify their cognitive status (Alzheimer's disease - AD, mild cognitive impairment - MCI, healthy control - HC), there has been a lot of studies on patient monitoring and surveillance (Banerjee, Keller, Popescu, & Skubic, 2015; Brulin, Benezeth, & Courtial, 2012; Negin, Cogar, Bremond, & Koperski, 2015; Pirsiavash & Ramanan, 2012) with a main focus on recognition of activities of daily living (ADLs) (Avgerinakis, Briassouli, & Kompatsiaris, 2013; König et al., 2016). The main goal of such frameworks is mostly to provide cost-efficient solutions for in-home or nursing homes monitoring.

    • An overview of methods for linguistic summarization with fuzzy sets

      2016, Expert Systems with Applications
      Citation Excerpt :

      To easily understand this data is beyond the capabilities of humans and brings another problem, that of how to extract potentially useful knowledge in an efficient way. The recent works proposed by Banerjee, Keller, Popescu, and Skubic (2015); Jain and Keller (2015a,b); Jain, Keller, and Bezdek (2016) have emphasized the use of linguistic summarization to extract textual statement from the sensor data collecting from smart homes of elderly and daily living activities. The concept of bipolar linguistic summaries, an extension of linguistic summarization, is able to be better representation of the preferences and intentions of humans.

    • Exploiting IoT technologies for enhancing Health Smart Homes through patient identification and emotion recognition

      2016, Computer Communications
      Citation Excerpt :

      Computational vision has been used in many systems since it allows the acquisition of huge amounts of multidimensional data related to the monitored environment. For example, it can be used to learn the activities of daily living (ADL) of elderly people to analyze health problems or cognitive disruption [4]. In the current proposal, computational vision provides a pervasive layer with the resident, and allows non-intrusive monitoring.

    • Fuzzy inference-based fall detection using kinect and body-worn accelerometer

      2016, Applied Soft Computing Journal
      Citation Excerpt :

      Moreover, in many typical ADLs the Kinect has difficulties in tracking all joints [31]. Thus, recent systems for reliable fall detection [14,2,32] do not take into account the Kinect RGB images and only rely on depth maps to delineate the person(s). The remaining part of this paper is organized as follows.


    This paper has been recommended for acceptance by Isabelle Bloch.
