1 Introduction

Japan’s Ministry of Health, Labour and Welfare reported in 2016 that pneumonia was the third leading cause of death according to demographic statistics [1]. Among elderly patients with pneumonia, 70% suffer from aspiration pneumonia. Dysphagia is a condition that makes it difficult to swallow food properly and, in severe cases, causes malnutrition or aspiration of food. Therefore, preventing dysphagia is important for maintaining quality of life.

Dysphagia is typically traced to three kinds of causes: structural, functional, and psychological. Structural causes stem from a structural problem that obstructs the passage of food, functional causes stem from a weakening of the muscles or nerves, and psychological causes are linked to psychogenic disease, such as anorexia caused by depression. In dysphagia stemming from functional causes, the tongue and hyoid muscles strongly affect swallowing ability [2]. The hyoid muscles play an important role both in generating tongue pressure, by pushing the tongue up from below, and in raising the larynx, by lifting it from above. However, the hyoid muscles, especially the geniohyoid muscle, start to atrophy as people age [3]. This weakens the hyoid muscles and reduces the upward motion of the larynx. Raising the larynx closes the entrance of the trachea, which prevents aspiration, and it also affects the passage of food through the pharynx. For these reasons, age-related changes have a detrimental effect on the swallowing function. To counter this, many rehabilitation methods have been developed to improve swallowing ability. Takehara et al. reported a training method for swallowing [4]. One way of strengthening the hyoid muscles shown in that report is an exercise in which the mouth is opened while the muscles are contracted. To improve the larynx-raising movement, there is an exercise called Shaker, in which the patient lies with his or her back and shoulders on the floor and raises the head. Another exercise is the Mendelsohn maneuver, in which the throat is kept in a high position while swallowing. Uranagase proposed a similar throat exercise in which the throat is consciously kept in a high position [5]. As these exercises show, moving one’s own throat consciously is important for training, but it can be difficult because the motion of one’s own throat is hard to observe. As a result, the training tends to be monotonous, and people are likely to become bored and give up.

In this paper, we propose a system that helps users to move their own throat consciously. Specifically, we developed a wearable device to visualize the larynx position and prototyped a game to enhance user motivation.

2 Related Work

2.1 Measurement of Swallowing Motion

There have been several attempts to detect swallowing motion. Zhang et al. developed a shirt to detect swallowing motion [6] that comprises bio-impedance sensors around the neck and a pressure sensor behind the first shirt button. They found that detection accuracy improved when both kinds of sensors were used instead of just one. Iiduka et al. developed a sensor sheet to detect larynx movement in swallowing [7]. They arranged five piezoelectric pressure sensors on the sheet and were able to obtain quantities such as the maximum rising velocity and the time the throat stays in its raised position from the voltage signal. GOKURI [8] is a system that recognizes swallowing motion using a microphone. This device analyzes the sound caused by swallowing and estimates whether the swallowing motion is normal or not. However, this system does not lead to any improvement of the action.

Some research on swallowing motion has focused specifically on larynx movement. Ultrasonic diagnostic equipment [9] can evaluate the morphology of the muscles involved in swallowing, or the swallowing movement itself, by placing an ultrasonic probe around the throat. Shimizu et al. [10] stated that such inspection with ultrasonic diagnostic equipment is reliable; however, it is quite expensive and thus impractical for use in our daily lives. Taketani et al. [11] developed a system to evaluate swallowing motion by detecting the rising movement of the larynx with a depth camera called Kinect. Kinect emits infrared light toward the larynx and then measures the distance between Kinect and the larynx. As another approach, Takahashi et al. developed a system called CODE that detects larynx movement from photographs taken from the side [12]. CODE detects the throat outline by comparing the difference between the background and the jaw part. After removing noise and performing correction, CODE identifies the peak position from the differential values of the outline and obtains a larynx movement curve by plotting the positions on a graph. However, these systems take time to set up and require a specialized environment for use. Sato et al. [13] developed a system that detects the larynx position with a photo-reflective sensor and then visualizes the position. This system can evaluate throat motion easily, but only for males with a protruding Adam’s apple. Our proposed system is different in that it estimates the position of the larynx by machine learning, even for users whose Adam’s apple does not protrude.

2.2 Measurement Around the Larynx with Photo-Reflective Sensor

The photo-reflective sensor (PRS) measures the distance between itself and an object from the amount of reflected light. This type of sensor is small and inexpensive, and it has already been used in a few studies focusing on changes around the throat, including the research introduced in Sect. 2.1 [13].

Yasu et al. developed a wearable module equipped with six PRSs to measure pharynx position [16]. Their system estimates the pharynx position by measuring the peak position from the sensor values. Sakashita et al. developed an immersive telepresence system to transmit the body and facial movements of a performer into a puppet [17]. The performer wears a head mounted display (HMD) equipped with PRSs in the mask part that measure the condition of the mouth. The device proposed in this paper was developed with reference to these studies.

2.3 Gamification for Exercise

There has already been some research on introducing games for exercises. Masaki et al. developed a training game called Squachu for training oral function [14]. In this game, users train the muscles around the tongue and mouth by moving them around. Playing this game was shown to have a positive effect on training the throat muscles, and participants enjoyed playing the game.

Inoue et al. developed a voice input game to support the elderly in speaking [15]. It helps them build the strength needed to expel food when aspiration occurs. This game was positively reviewed by elderly participants, who made comments like “I want to play it by myself if the operation is easy” and “I want to play it with my family”.

On the basis of these previous studies, we apply similar gamification techniques to our system. Specifically, we developed a game to provide training on swallowing.

3 Proposed System

In our research, we visualize the motion of swallowing and use it for swallow training through the use of a game system. Our proposed system estimates the throat position between a natural throat state and a throat position raised to the maximum and visualizes it in real time.

3.1 Device

The wearable device we developed consists of 18 PRSs built into the module in a 6 × 3 formation of columns and rows, as shown in Fig. 1. This device was inspired by the devices proposed by Sato et al. [13] and Yasu et al. [16] for detecting pharynx position. Each PRS measures the distance between the skin surface and itself. Because the sensors are also arranged in the horizontal direction, the system can detect changes to the skin surface on the throat resulting from not only the throat position but also throat surface contraction. This enables the device to estimate throat position even if the position of the Adam’s apple is not clear.

Fig. 1. Overview of proposed wearable device.

The outline from the center to the edge of the 3D printed module is fitted to the curvature of the neck. Users put on the wearable device so that the sensor surface faces the throat and the upper part of the module touches the underside of the chin. The device includes a reflector band in the belt part wrapped around the neck so that it can be put on without any specific tool.

There are individual differences in the bumps and dents of the throat, so the distance between the throat and the sensors must be set appropriately for the physique of each user. In our device, this distance can be changed by swapping in parts of different heights.

3.2 Larynx Recognition

The proposed system estimates the larynx position in real time, as shown in Fig. 2. In the learning phase, two states of sensor values are learned: throat position in a natural state and throat position raised to the maximum.

Fig. 2. Recognition overview.

Previous work [16] reported that the throat position rises when we utter a high-frequency voice. Hirai et al. [18] also reported that the position of the larynx moves up and down as the frequency goes up and down. We therefore assume that the throat is raised to the maximum when the user produces as high a voice as possible, and we take this position as the throat position raised to the maximum. We obtain 18 PRS values in each state and remove the noise using a low-pass filter. The laryngeal position is estimated with a support vector machine (SVM) that calculates the probability for the two learned states. We treat the estimated probability as a ratio within the measurement range and use it to determine the position of the larynx.
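The recognition steps described above can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation used in the paper: the filter type, the SVM kernel, and the probability calibration are not specified in the text, so an exponential moving average, a linear SVM trained by subgradient descent, and a fixed logistic mapping are assumed here. All function names and parameter values (`alpha`, `lr`, `lam`, `epochs`) are hypothetical.

```python
import math


def low_pass(frames, alpha=0.2):
    """Smooth a sequence of sensor frames with an exponential moving average.

    alpha is an assumed smoothing factor (0 < alpha <= 1); smaller values
    suppress more high-frequency noise.
    """
    smoothed, state = [], None
    for frame in frames:
        if state is None:
            state = list(frame)
        else:
            state = [alpha * x + (1.0 - alpha) * s for x, s in zip(frame, state)]
        smoothed.append(list(state))
    return smoothed


def train_linear_svm(X, y, lr=0.01, lam=0.01, epochs=2000):
    """Fit a linear SVM by subgradient descent on the regularized hinge loss.

    X: list of 18-value frames; y: labels, -1 (natural) or +1 (raised to max).
    """
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:
                # Sample violates the margin: hinge and regularization gradients.
                w = [wj + lr * (yi * xj - lam * wj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Sample is outside the margin: only regularization shrinks w.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def raised_probability(frame, w, b):
    """Map the SVM decision value to (0, 1) with a logistic function.

    Assumption: a fixed logistic stands in for a fitted calibration step
    such as Platt scaling.
    """
    score = sum(wj * xj for wj, xj in zip(w, frame)) + b
    return 1.0 / (1.0 + math.exp(-score))
```

In use, frames recorded in the natural state would be labeled -1 and frames recorded while producing the highest possible voice would be labeled +1. The value returned by `raised_probability` for a new filtered frame then plays the role of the ratio within the measurement range.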

3.3 Visualizing and Game Design

A previous study by Sato et al. [13] reported that visualizing the position of the larynx is valid for training. In this research, we verify whether visualization remains valid for training when it is combined with gamification.

The estimation result is displayed as the position of an avatar on the game screen. When the avatar appears at Lv.0, the larynx is in a natural state, and at Lv.4, it has risen to the maximum position. The level indication (Fig. 3) is the estimated larynx position divided into four stages according to the estimation probability of the maximum state provided by the SVM. Users can refer to this indication as a guide to their rough throat position during play.

Fig. 3. Game interface.
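The mapping from estimation probability to level indication could look like the following sketch. The cut-off values are not given in the text, so the evenly spaced thresholds below are purely hypothetical, as is the function name.

```python
from bisect import bisect_right

# Hypothetical, evenly spaced cut-offs; the thresholds actually used are not stated.
THRESHOLDS = (0.2, 0.4, 0.6, 0.8)


def level_from_probability(p_raised, thresholds=THRESHOLDS):
    """Return a level from 0 (natural) to 4 (raised to the maximum).

    p_raised is the estimated probability of the maximum state; the level is
    the number of thresholds that the probability has reached.
    """
    if not 0.0 <= p_raised <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return bisect_right(thresholds, p_raised)
```

Keeping the thresholds in a single tuple would make it easy to tune the stage boundaries per user if evenly spaced stages turn out to be too coarse.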

In the prototype game, users move an avatar up toward the sky by raising their own throat. The avatar moves vertically according to the estimation result; for example, when a user raises his or her throat, the avatar rises as well.

The number of balloons the avatar holds increases as the larynx position gets higher, and the flying speed gets faster. When the total rising time reaches a designated limit, the game ends. The final arrival height is calculated as the sum-of-products of the level indication and the stay time at that level. Background music is always playing during the game, and every time a player levels up or finishes the game, a sound effect is played. The system records which level the avatar attained and how many seconds it stayed there. We performed two experiments to determine the effectiveness of the system, as described in the following sections.
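The final arrival height described above, a sum of products of each level and the time spent at that level, can be written as a short sketch. The function name and the dictionary layout of the recorded stay times are assumptions made for illustration, and the time unit (seconds) follows the description of the recorded data.

```python
def final_height(stay_seconds):
    """Sum of (level x seconds spent at that level).

    stay_seconds maps each level (0 to 4) to the time in seconds that the
    avatar stayed there, as recorded by the system.
    """
    return sum(level * seconds for level, seconds in stay_seconds.items())


# Example with made-up stay times: 0*10 + 1*5 + 2*5 + 3*20 + 4*20 = 155
example = final_height({0: 10.0, 1: 5.0, 2: 5.0, 3: 20.0, 4: 20.0})
```

Under this scoring, time spent at Lv.0 contributes nothing, so a player is rewarded only for holding the throat above the natural state.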

4 Experiment 1: Estimation Performance

4.1 Overview

In the first experiment, we examined whether our proposed system can estimate throat motion. Four individuals (two males, two females) participated in the experiment. First, they were taught the two throat states: the natural state and the maximum raised state. We instructed them to produce as high a voice as possible when learning the maximum raised state, and we recorded the sound pitch with a tuner. In the estimation phase, participants kept their throat in a natural state and then vocalized on our instruction. We instructed them to produce the same voice as in the learning phase, and we recorded the probability estimated by the SVM. With one of the participants (Participant 1), we recorded the estimation probability not only when she produced the same pitch but also when she produced a lower one than in the learning phase. We did this to check whether the estimated position of the larynx changes according to frequency.

4.2 Results

Figure 4 lists the estimation probability for each participant and each action.

Fig. 4. Estimation probability in each state (%).

Estimation probability was 84.1% when participants did not raise their throat and 89.6% when they produced the same pitch as in the learning phase. This demonstrates that the proposed system can distinguish the two states.

To go into further detail, Fig. 5 shows the continuous changes in estimation probability for Participant 1. The higher the estimation probability is, the closer the estimated throat position is to the state raised to the maximum. The first increase corresponds to the transition from the natural state to the state in which she vocalized at the same pitch as in the learning phase, and the second increase corresponds to the transition from the natural state to the state in which she vocalized at a lower pitch than in the learning phase. This change corresponds to the continuous change of the larynx position. This result supports previous work showing that the higher the frequency we utter, the higher the throat position rises [16]. It also demonstrates that our proposed system can estimate throat motion.

Fig. 5. Estimation probability changes of the estimated throat position.

5 Experiment 2: User Performance

5.1 Overview

In this experiment, we examined whether users could raise their own throats consciously by repeatedly playing the proposed training game. Five individuals (three males, two females) participated in the experiment. First, participants learned the two states, the natural state and the maximum state, as in the first experiment. They then raised their own throats and checked the avatar movements to see whether their own throat motion was reflected. We told them that the higher they raised their own throats, the higher their avatar would have flown when the game finished. The game started whenever the participant was ready. This procedure was taken as one set, and we carried out one set every hour, for a total of six sets. The game ends when the total rising time of the larynx reaches a designated limit, which we set to 60 s in this experiment.

5.2 Results

Figure 6 shows the total time that the larynx position was above level 3 during the game, and Fig. 7 shows the total time it was below level 2. However, in the third game of participant C, the estimation of the larynx position appears to have failed (all values were below level 2) because the belt slipped during the game. Hence, this result is excluded from the figures.

Fig. 6. Total time own throat was raised above level 3.

Fig. 7. Total time own throat was raised below level 2.
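The totals plotted in Figs. 6 and 7 could be derived from a log of levels sampled at a fixed interval, as in the sketch below. The log format, the sampling interval `dt`, and the function name are hypothetical. The sketch also assumes that "above level 3" includes level 3 and that "below level 2" includes level 2, which the text does not state explicitly.

```python
def time_in_bands(levels, dt):
    """Return (seconds at level 3 or higher, seconds at level 2 or lower).

    levels: sequence of integer levels (0 to 4), one per sample.
    dt: assumed sampling interval in seconds.
    """
    high_samples = sum(1 for lv in levels if lv >= 3)
    low_samples = sum(1 for lv in levels if lv <= 2)
    return high_samples * dt, low_samples * dt
```

With these inclusive boundaries the two totals always add up to the length of the log, so a longer time in the high band necessarily means a shorter time in the low band.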

As Fig. 6 shows, participants A, B, and E increased the total time that the larynx position was above level 3 each time they played the game. Specifically, participant A improved the throat raising time to 18 s, and participant B to 5 s. Participant E worsened from the first game to the second game, with the rising time decreasing from 60 s to 22 s. However, his throat raising time subsequently improved, and he became able to keep his throat stably in a high position. There was no major change for participant D. Participant C tended to decrease the time he kept his own throat in a high position, although he did improve from the first game to the second.

As Fig. 7 shows, participants A, B, D, and E decreased the time that the larynx position was below level 2 each time they played the game. Participant C also decreased the time from the first game to the fourth game, although his time increased in the fifth and sixth games.

In other words, participants A, B, and E became able to keep their own throats in a high position for a long time, and the time in the low position got shorter. In regard to participant D, the time his throat was in a low position also got shorter, and the time that his throat was kept in a high position increased throughout the game. In conclusion, these results show that four out of five participants improved their ability to keep their throat in a high position for a long time. This demonstrates that users were able to learn how to raise their throat more effectively by playing our proposed training game.

The performance in the sixth game decreased among all participants. We assume this was caused by an accumulation of throat fatigue from exceeding the appropriate amount of training in a day. Therefore, we feel that the training will be most effective if it is done for an appropriate amount of time without overdoing it.

In addition, during the training game, some of the participants tried to beat their personal best and compete with other users. This suggests that motivation can be increased by displaying training scores (both one’s own and the other players’).

6 Limitations and Future Work

The proposed system has two limitations. First, the photo-reflective sensor is affected by skin color. This sensor measures distance from the amount of infrared light reflected from the target. If the target color is dark, the amount of received light decreases, and the sensor cannot measure the distance well. Second, the proposed system sometimes misrecognizes neck motion as throat motion, as changes to the surface of the skin on the throat are typically small. We will redesign the device so that it is less easily affected by these factors.

In future work, we will explore a possible correlation between throat rising and muscle strengthening by measuring tongue pressure and other elements. In addition, our proposed game is too simple, so we need to improve the game system. We will investigate which elements a training game requires and consider a system that can improve not only the muscle strength of the throat but also the comprehensive swallowing function. We will also verify whether training achievement differs between training without visualization and training with gamified visualization.

7 Conclusion

In this paper, we proposed a system that visualizes larynx position from throat motion for use in a training game. Each PRS measures the distance between the skin surface and itself, and the laryngeal position is estimated by an SVM that calculates the probability for two learned states: a natural state and a state in which the throat is raised to the maximum. Experiments showed that the proposed system could estimate throat motion. We also found that when users played the proposed training game several times, they became better able to keep their own throats in a high position. This demonstrates that participants learned how to raise their throats by playing our game.

In the future, we will explore a possible correlation between throat rising and muscle strengthening. We will also develop a training game to strengthen our comprehensive swallowing ability.