Elsevier

Neurocomputing

Volume 74, Issue 8, 15 March 2011, Pages 1283-1289

A block-based model for monitoring of human activity

https://doi.org/10.1016/j.neucom.2010.05.023

Abstract

The study of human activity is applicable to many fields of science and technology, such as surveillance, biomechanics and sports applications. This article presents BB6-HM, a block-based human model for real-time monitoring of a large number of visual events and states related to human activity analysis. These events and states can be used as components of a library for describing more complex activities in important surveillance settings, for example luggage at airports, clients’ behaviour in banks and patients in hospitals. BB6-HM is inspired by the proportionality rules commonly used in Visual Arts and divides the human silhouette into six rectangles of the same height. The major advantage of this proposal is that the analysis of the human can easily be broken down into regions, from which information about activities can be obtained. The computational load is very low, so a very fast implementation is possible. Finally, the model has been applied to build classifiers for the detection of primitive events and visual attributes using heuristic rules and machine learning techniques.

Introduction

It is now common for public and private companies to have sophisticated surveillance systems that monitor the state of their “business” in order to detect anomalous situations and avoid adverse outcomes. The final aim of an artificial vision system is to provide a task-focused scene description. Concentrating on the description of human activity, and in accordance with [25], [23], [14], the recognition of activities is treated as a problem of classifying spatial–temporal characteristics or, alternatively, as “queries” on higher-level constructions. In other words, activities are considered complex events, defined by the spatial–temporal composition of simpler events. These simpler events are in turn defined by even simpler ones, and so on, forming a hierarchy that ends in primitive events, which are determined from changes in the state of visual attributes.

This article presents BB6-HM, a model that can monitor, in real time and with minimal computational load, a large number of these primitive events related to humans. These events can be used as components of a library for describing more complex activities in many surveillance tasks (luggage at airports, clients’ behaviour in banks, patients in hospitals, etc.). The model has also been applied to build classifiers for the detection of primitive events and visual attributes using heuristic rules and machine learning techniques.

This article is organised as follows. The second section reviews other work related to the proposal. The third section describes the block-based human model. The fourth section describes some events and visual attributes detected using relations between the parameters of the human model, and evaluates them on test sequences. Finally, the fifth section presents the conclusions and outlines future work.

Section snippets

Related work

This section reviews work on human models used to analyse the type of movement performed. Some of these models, like the one in this work, focus on surveillance. The works are grouped according to the model used.

The first group comprises works based on “stick models” or skeleton models. In these models the human body is represented as a set of segments (bars or volumes) joined by articulations. This representation is based on the

Model description

The model presented in this work consists of vertically dividing the human blob into six regions with the same height (see blocks B1, …, B6 in Fig. 1). Each of these regions can be delimited by a bounding box which we will call “block”. This model is inspired by the proportionality rules commonly used in Visual Arts. In this case, the blocks of this division correspond to areas related to the physical position of certain parts of the body. For example, standing and in a position of repose, the
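The equal-height division described above can be illustrated with a short sketch. This is not the authors' implementation. It assumes the segmented human is available as a binary mask (a list of rows of 0/1 values), and it takes each block's top and bottom from the band boundaries and its left and right edges from the foreground pixels inside the band. The function name `split_into_blocks` and the `(top, left, bottom, right)` box layout are hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical box layout: (top, left, bottom, right), inclusive pixel indices.
Box = Tuple[int, int, int, int]


def split_into_blocks(mask: List[List[int]], n_blocks: int = 6) -> List[Optional[Box]]:
    """Divide a binary silhouette into n_blocks bands of equal height.

    Returns one bounding box per band (B1 first), or None for a band
    that contains no foreground pixels.
    """
    # Rows that contain at least one foreground pixel.
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return [None] * n_blocks

    top, bottom = rows[0], rows[-1]
    height = bottom - top + 1

    blocks: List[Optional[Box]] = []
    for k in range(n_blocks):
        # Band k covers rows [r0, r1); integer arithmetic keeps bands contiguous.
        r0 = top + (k * height) // n_blocks
        r1 = top + ((k + 1) * height) // n_blocks
        cols = [c for r in range(r0, r1) for c, v in enumerate(mask[r]) if v]
        if not cols:
            blocks.append(None)
        else:
            blocks.append((r0, min(cols), r1 - 1, max(cols)))
    return blocks


# Toy silhouette: a narrow "head" of two rows above a wider "body" of ten rows.
mask = [[0, 0, 1, 0, 0]] * 2 + [[0, 1, 1, 1, 0]] * 10
blocks = split_into_blocks(mask)
```

Because the division needs only one pass over the mask rows, the cost per frame is linear in the number of mask pixels, which is consistent with the low computational load claimed for the model.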

Monitoring of human activity

The BB6-HM model is used within the system shown in Fig. 3. With this system it is possible to detect primitive states and events from a video sequence in which the human has been segmented. In an initial stage, a description of the human is obtained (human description) containing constant and variable parameters. The variable parameters form the case model and characterise the human at a specific instant, t. These parameters are obtained from the characteristics of the human's blocks. The
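The idea that primitive events correspond to changes in the state of visual attributes between instants can be sketched as follows. This is only an illustration under stated assumptions: the state at each instant t is taken to be a dictionary of symbolic attribute values, and the attribute name `posture` and its values are hypothetical, not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: attribute name -> symbolic value at one instant t.
State = Dict[str, str]
Event = Tuple[int, str, str, str]  # (t, attribute, old_value, new_value)


def primitive_events(states: List[State]) -> List[Event]:
    """Report every attribute whose value changes between instants t-1 and t."""
    events: List[Event] = []
    for t in range(1, len(states)):
        prev, curr = states[t - 1], states[t]
        for name in sorted(curr):
            # Only attributes present at both instants can change state.
            if name in prev and prev[name] != curr[name]:
                events.append((t, name, prev[name], curr[name]))
    return events


# Hypothetical sequence of per-frame states.
states = [
    {"posture": "standing"},
    {"posture": "standing"},
    {"posture": "bending"},
]
events = primitive_events(states)
```

In this sketch a complex event would be expressed as a composition of such tuples over time, in line with the hierarchy of events described in the Introduction.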

Conclusions and future works

This work has presented a new human model (BB6-HM) oriented to surveillance. It handles both frontal and lateral movements at a low computational cost, and a large amount of task-focused information can be extracted from it. Furthermore, because the system was oriented towards the surveillance task from the outset, it can indicate situations that are particularly useful for this task.

The experimental results have demonstrated the usefulness of the human model for the detection of primitive events and

Acknowledgments

This research was supported by CICYT through project TIN2007-67586-C02-01 and by the UNED 2006 project call titled “Development of a knowledge based tracking system”, in the context of which this study was carried out. Portions of the research in this paper use the CASIA Gait Database collected by the Institute of Automation, Chinese Academy of Sciences.

Encarnación Folgado received the B.S. degree in Physics in 2003 from Universidad Nacional de Educación a Distancia (UNED), Spain, and the M.S. degree in 2006 from the Department of Artificial Intelligence at UNED. She is currently working towards her Ph.D. at the Department of Artificial Intelligence at UNED. Her research interest lies in activity recognition for video surveillance.

References (33)

  • P. Chunhong, M. Songde, 3D motion estimation of human by genetic algorithm, in: 15th International Conference on...
  • I. Cohen, H. Li, Inference of human postures by classification of 3D human body shape, in: IEEE International Workshop...
  • T. Darrell, P. Maes, B. Blumberg, A.P. Pentland, A novel environment for situated vision and behavior, in: Workshop for...
  • J. Deutscher, R. Blake, Articulated body motion capture by annealed particle filtering, in: IEE Conference on Computer...
  • H. Fujiyoshi, A.J. Lipton, Real-time human motion analysis by image skeletonization, in: Workshop on Applications of...
  • D. Gavrila, L. Davis, 3D model-based tracking of human in action: a multiview approach, in: Proceedings of the...

Mariano Rincón received his Ph.D. in Physics from the National University for Distance Education (UNED) in Madrid (Spain) in 2003. He currently holds the position of Associate Professor in the Department of Artificial Intelligence at UNED. His research lies primarily within the fields of computer vision and knowledge modelling for image understanding.

Enrique J. Carmona received his degree in Electronic Engineering from the University of Granada, Spain, in 1996, and his Ph.D. in Physics from the National University for Distance Education (UNED) in Madrid (Spain) in 2003. Since 2009, he has been an Associate Professor with the Department of Artificial Intelligence at UNED, Spain. His research interests are related to Machine Learning, Computer Vision and Evolutionary Computation.

Margarita Bachiller received the Ph.D. degree in 1999 from the E.T.S.I. Industrial at the UNED (National University of Distance Education of Spain). She currently holds the position of Associate Professor. Her research lies primarily within the field of computer vision and image understanding.
