1 Introduction

With advances in medical technology, human life expectancy has increased significantly, and population aging has become a serious problem in China. Data from the Ministry of Civil Affairs of China show that China’s population over the age of 60 exceeds two hundred million and will reach 243 million, accounting for 17% of the total population, by 2020. Owing to disease, age, and other factors, many elderly people cannot take good care of themselves. Falls are among the most dangerous events they face and are one of the main threats to the health of the elderly [1]. According to the WHO Global Report on Falls Prevention in Older Age [2], about 28%–35% of 65-year-olds fall every year, and for 70-year-olds this proportion rises to 32%–42%. The frequency of falls increases with age and physical frailty [3]. Vellas et al. [4] conducted a 24-month follow-up study of 482 elderly people living independently in the community. The study showed that accidental falls were the leading cause of accidental death in people over 85 years of age, and that 61% of participants (53.7% of men and 65.7% of women) experienced at least one fall during the two-year study. Detecting falls can effectively reduce the risk of injury for the elderly, which makes it an important condition for healthy living at home.

A fall has a definition widely accepted by scholars: an unintentional movement toward the ground or a lower level, excluding the consequences of sustained blows, loss of consciousness, paralysis, and epileptic seizures [5]. Many researchers have attempted to separate falls from normal daily behaviors. This is a difficult task because many normal daily behaviors, such as sitting down and lying down, closely resemble falling [3]. In recent years, with advances in sensor technology, more and more scholars have explored the fall detection problem. Initially, researchers used wearable acceleration sensors to distinguish falls from other behaviors based on changes in acceleration [6,7,8]. These methods can effectively identify some kinds of falls. Later, many scholars detected falls from features of the ending posture by using image-processing technology [9,10,11]. Because of the limitations of two-dimensional image data, recognition accuracy is affected by issues such as how to separate foreground from background and how to track human motion accurately.

In biomechanical research, a fall can be seen as an uncontrolled, imbalanced movement of the human body, so imbalance is an important feature of fall behavior. Kinect, a three-dimensional somatosensory vision sensor, provides a new possibility for the study of fall behavior. It can dynamically capture the motion of the human body in three-dimensional space so that researchers can analyze human posture more accurately, which greatly improves the ability to analyze human behavior. In addition, unlike traditional vision sensors, Kinect provides dynamic position information of skeletal joints in three dimensions. Analyzing behavior from skeleton data rather than images also helps protect users’ privacy.

We study fall detection based on changes in human balance derived from skeleton data. The research includes three aspects, shown in Fig. 1:

Fig. 1. Three research aspects

  1. Building a bionic dynamic mass model of the human body from Kinect skeleton joint data and the human mass distribution, and computing the dynamic position of the Center of Mass (COM);

  2. Determining balance by calculating the Base of Support (BOS) region and the Line of Gravity (LOG);

  3. Detecting falls from imbalanced posture features with an algorithm based on recurrent neural networks (RNN).

An uncontrollable imbalance results in a fall, whereas a controllable one, such as sitting down or lying down, does not. Posture and body movement speed differ significantly between controllable and uncontrollable imbalance. We therefore use the relative positions and speeds of the skeleton joints to describe the body’s posture as input features, and a Long Short-Term Memory (LSTM) network detects falls based on these features.
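As a minimal sketch of how such posture features might be assembled, the following code computes joint positions relative to a reference joint and joint velocities by finite differences. The function name `posture_features`, the choice of reference joint, the frame rate, and the array layout are illustrative assumptions and are not specified in this paper.

```python
import numpy as np

def posture_features(joints, fps=30.0, ref_joint=0):
    """Build per-frame features from a skeleton sequence.

    joints: array of shape (T, J, 3) with T frames, J joints, (x, y, z).
    fps and ref_joint are assumed values, not taken from the paper.
    Returns an array of shape (T, 2 * J * 3).
    """
    joints = np.asarray(joints, dtype=float)
    n_frames = joints.shape[0]
    # Position of every joint relative to the reference joint of the same frame.
    rel = joints - joints[:, ref_joint:ref_joint + 1, :]
    # Finite-difference velocity; the first frame is given zero velocity.
    vel = np.zeros_like(joints)
    vel[1:] = (joints[1:] - joints[:-1]) * fps
    return np.concatenate(
        [rel.reshape(n_frames, -1), vel.reshape(n_frames, -1)], axis=1
    )
```

Each row of the result could then serve as the input vector for one time step of a sequence classifier.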

We have evaluated our fall detection method on an existing database [12]. The results show that our algorithm, which combines the study of human biomechanical equilibrium with posture recognition, detects falls with an accuracy of 95%.

2 Human Body Balance Study

To address the shortcomings of the current fall detection methods discussed above, this paper proposes a fall detection method based on skeleton data that studies the state of human body balance. We first investigate human body balance in this section.

2.1 Balance Definition

Body balance refers to the ability of an individual to keep the Line of Gravity (LOG) of the Center of Mass (COM) within the body’s Base of Support (BOS) region, as shown in Fig. 1. A fall starts from an imbalanced state of the human body, so imbalance is a key feature of an imminent fall: if imbalance can be detected, a fall may be detected as well. The basic idea of our fall detection is first to determine whether the body is balanced from the relative position of the LOG and the BOS. Most actions in daily life, such as standing and walking, are states of equilibrium. A fall is an imbalanced action, but sitting down and lying down are not states of equilibrium either. Therefore, when imbalance is detected, we use an LSTM to determine whether the person is falling (Fig. 2).

Fig. 2. The features of body balance

2.2 Balance Feature Extraction

Determining COM.

The balance state depends on the positional relationship between the COM and the BOS. The COM is the unique point where the weighted relative position of the distributed mass sums to zero [13]. The COM of the human body depends on gender and on the position of the limbs [14]. In a standing posture, it is typically about 10 cm below the navel, near the top of the hip bones. However, the COM does not stay at a fixed point while the body moves, and it cannot be measured directly. The COM of the human body is obtained in four steps [15]:

  1. Calculating the center of mass location of each segment of the body.

  2. Calculating the torque about the reference point due to each segment based on the segment’s mass and position.

  3. Summing the torques about the reference point for all the segments.

  4. Dividing the sum of the torques by the total body mass to determine the center of mass location with respect to the reference point.

In practice, we use skeleton data from Kinect and Dempster’s body segment parameters [16] to estimate the COM position. Kinect is a line of motion-sensing input devices by Microsoft that can detect up to six users at the same time and compute their 3D skeletons, each with 25 joints representing body junctions such as the feet, knees, hips, shoulders, elbows, wrists, and head. The COM of a segment can be determined from the coordinates of its proximal and distal points and the segment length percentages. All of these parameters can be obtained from [16] (Table 1).

Table 1. The mass and coefficient of human segment [16]

We extend the method of [17] for computing the COM of the human body from two dimensions to three dimensions. In short, the method can be summarized in two steps:

  1. Computing the COM of each body segment with the following formulas:

$$ x_{sCOM} = x_{p} l_{p} + x_{d} l_{d} $$
(1)
$$ y_{sCOM} = y_{p} l_{p} + y_{d} l_{d} $$
(2)
$$ z_{sCOM} = z_{p} l_{p} + z_{d} l_{d} $$
(3)

where \( x_{sCOM} \), \( y_{sCOM} \), \( z_{sCOM} \) are the coordinates of the COM of a body segment; \( x_{p} \), \( y_{p} \), \( z_{p} \) are the coordinates of its proximal end; \( x_{d} \), \( y_{d} \), \( z_{d} \) are the coordinates of its distal end; and \( l_{p} \), \( l_{d} \) are the percentages of segment length from the proximal and distal ends.

  2. Calculating the COM of the whole body with the following formulas:

$$ x_{COM} = \frac{\sum_{i} m_{i} x_{s_{i}COM}}{M} $$
(4)
$$ y_{COM} = \frac{\sum_{i} m_{i} y_{s_{i}COM}}{M} $$
(5)
$$ z_{COM} = \frac{\sum_{i} m_{i} z_{s_{i}COM}}{M} $$
(6)

where \( x_{COM} \), \( y_{COM} \), \( z_{COM} \) are the coordinates of the COM of the whole body; \( m_{i} \) is the mass of the \( i \)th segment; and \( M \) is the total body mass.
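The two steps can be sketched in a few lines of code. This is only an illustration of Eqs. (1) to (6): the two-segment table `SEGMENTS`, its joint names, and all numeric values are made-up placeholders, not Dempster’s parameters from Table 1, and the function names are hypothetical. Mass fractions are used in place of absolute masses, which leaves the result unchanged because the sum is divided by the total.

```python
import numpy as np

# PLACEHOLDER values for illustration only (not Dempster's table):
# name -> (proximal joint, distal joint, mass fraction, l_p, l_d)
SEGMENTS = {
    "upper": ("neck", "hip", 0.6, 0.5, 0.5),
    "lower": ("hip", "ankle", 0.4, 0.6, 0.4),
}

def segment_com(proximal, distal, l_p, l_d):
    """Eqs. (1)-(3): weighted combination of the two segment end points."""
    return l_p * np.asarray(proximal, float) + l_d * np.asarray(distal, float)

def body_com(joints, segments=SEGMENTS):
    """Eqs. (4)-(6): mass-weighted mean of the segment COMs.

    joints: dict mapping joint name -> (x, y, z).
    """
    total_mass = sum(seg[2] for seg in segments.values())
    weighted = sum(
        mass * segment_com(joints[p], joints[d], l_p, l_d)
        for p, d, mass, l_p, l_d in segments.values()
    )
    return weighted / total_mass

joints = {"neck": (0.0, 1.5, 2.0), "hip": (0.0, 1.0, 2.0), "ankle": (0.0, 0.0, 2.0)}
com = body_com(joints)  # 3D position of the whole-body COM
```

In a real pipeline the table would hold one row per body segment with the parameters from [16], and `joints` would be filled from each Kinect frame.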

Calculating LOG and BOS.

The LOG is important to understand and visualize when determining a person’s ability to maintain balance. When the LOG is within the BOS, the person is considered balanced. When the LOG falls outside the BOS, the person is considered imbalanced [18]. Since the force of gravity acts downward through the COM, toward the earth, the LOG can be computed as the vertical projection of the COM.

The BOS refers to the area beneath a person that includes every point of contact that the person makes with the supporting surface [19]. We estimate the BOS by the ellipse covering the feet joints of the skeleton, as shown in Fig. 3.

Fig. 3. Base of support estimation

3 Fall Detection Based on LSTM

Fall detection can be converted into a sequential data classification problem. Recurrent Neural Networks (RNNs) are a very powerful method for classifying sequence data, but training them has proved to be problematic because the back-propagated gradients either grow or shrink at each time step, so over many time steps they typically explode or vanish [20].

LSTM, introduced in [21], has become a crucial ingredient in recent advances with recurrent networks because it is good at learning long-range dependencies [20] and is much less affected by the vanishing and exploding gradient problems. The structure of LSTM is shown in Fig. 4. It introduces a new structure called a memory cell, which is composed of four main elements: an input gate, a forget gate, an output gate, and cell activation vectors. The gates serve to modulate the interactions between the memory cell itself and its environment. The input gate allows an incoming signal to alter the state of the memory cell or blocks it. The output gate allows the state of the memory cell to have an effect on other neurons or prevents it. The forget gate modulates the memory cell’s self-recurrent connection, allowing the cell to remember or forget its previous state [22].

Fig. 4. The structure of LSTM in sequence [23]

Formulas (7) to (12) describe how an LSTM model works:

$$ f_{t} = \sigma (W_{f} \cdot [h_{t - 1} ,x_{t} ] + b_{f} ) $$
(7)
$$ i_{t} = \sigma (W_{i} \cdot [h_{t - 1} ,x_{t} ] + b_{i} ) $$
(8)
$$ \tilde{C}_{t} = \tanh (W_{C} \cdot [h_{t - 1} ,x_{t} ] + b_{C} ) $$
(9)
$$ C_{t} = f_{t} * C_{t - 1} + i_{t} * \tilde{C}_{t} $$
(10)
$$ o_{t} = \sigma (W_{O} \cdot [h_{t - 1} ,x_{t} ] + b_{O} ) $$
(11)
$$ h_{t} = o_{t} * \tanh (C_{t} ) $$
(12)

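where \( i \), \( f \), \( o \), and \( C \) denote the input gate, forget gate, output gate, and cell activation vectors, respectively; \( \tilde{C}_{t} \) is the candidate cell state; \( h_{t} \) is the hidden state; \( x_{t} \) is the input at time step \( t \); \( \sigma \) denotes the logistic sigmoid function; and \( * \) denotes element-wise multiplication.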
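To make the data flow concrete, one LSTM time step following Eqs. (7) to (12) can be written directly with NumPy. This is a didactic sketch rather than the implementation used in the experiments; the function name `lstm_step` and the dictionary layout of the weights are assumptions chosen for readability.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step.

    x_t: input vector of size D; h_prev, c_prev: vectors of size H.
    W: dict with keys 'f', 'i', 'C', 'o', each of shape (H, H + D).
    b: dict with the same keys, each of shape (H,).
    """
    z = np.concatenate([h_prev, x_t])       # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])      # Eq. (7): forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])      # Eq. (8): input gate
    c_tilde = np.tanh(W["C"] @ z + b["C"])  # Eq. (9): candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde      # Eq. (10): new cell state
    o_t = sigmoid(W["o"] @ z + b["o"])      # Eq. (11): output gate
    h_t = o_t * np.tanh(c_t)                # Eq. (12): new hidden state
    return h_t, c_t
```

Applying `lstm_step` frame by frame, carrying `h_t` and `c_t` forward, yields a final hidden state that a classification layer can map to an activity label.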

In our fall detection method, the skeleton data of each frame serve as the input of the LSTM. When imbalance occurs, the LSTM is used to classify the person’s activity based on the skeleton data.
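The two-stage decision can be outlined as follows. This sketch only illustrates the control flow: the balance test and the sequence classifier are passed in as callables, and the function name `detect_fall`, the `window` length, and the `"fall"` label are hypothetical choices, since these details are not specified here.

```python
def detect_fall(frames, balanced, classify, window=30):
    """Return the index of the first frame at which a fall is reported,
    or None if no fall is found.

    frames:   sequence of per-frame skeleton data.
    balanced: callable(frame) -> bool, the LOG/BOS balance test.
    classify: callable(list of frames) -> label, e.g. a trained LSTM.
    window:   number of frames handed to the classifier (assumed value).
    """
    for t, frame in enumerate(frames):
        if balanced(frame):
            continue  # equilibrium (e.g. standing, walking): nothing to classify
        # Imbalance detected: classify the frames starting at this point.
        clip = frames[t:t + window]
        if classify(clip) == "fall":
            return t
    return None
```

Keeping the classifier behind the balance test means that balanced frames, which dominate daily activity, never reach the sequence model.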

4 Experiments and Evaluation

We use the database of [12] to evaluate our fall detection method. In this database, Kinect is used to record human actions, which are grouped into two main categories: Activity of Daily Living (ADL) and fall. ADL has four types of actions: sit, grasp, walk, and lay. Fall has four types of actions depending on direction: front, back, side, and end-up-sit. The database contains 11 healthy volunteers aged 22 to 39 with different heights and weights, and each volunteer performed every action three times. We randomly choose 224 actions as the training set, and the remaining actions form the test set.

Gasparrini et al. proposed three algorithms to detect falls based on this database. The first used the variation in skeleton joint positions from Kinect and the acceleration from a wrist-worn accelerometer. The second used the same parameters as the first, but obtained the acceleration from an accelerometer placed on the waist. The third added one parameter: the distance of the spine base joint from the floor. Compared with theirs, our method uses data from only one sensor, Kinect, and does not require the person to wear any other sensor on the body. Our accuracy is much better than that of algorithms one and two and slightly lower than that of algorithm three, as shown in Table 2.

Table 2. Results comparison

5 Conclusion and Future Work

In this paper, we propose a fall detection method based on skeleton data that analyzes human biomechanical equilibrium. An LSTM is adopted to distinguish falls from other activities. Our method uses only skeleton data from Kinect, does not require the elderly to wear any other sensors, and achieves good performance. It provides a feasible solution for fall detection.