
1 Introduction

This paper proposes an atmosphere sharing method for a chatting agent used during TV watching, in order to improve the user's motivation for conversation with physical or virtual agents. As the number of people living alone increases with the aging of society and changes in lifestyle, they tend to lose opportunities for conversation with other people. If people lack opportunities for daily conversation, their brain function or health may deteriorate. It is therefore important, for the healthcare of elderly people, to increase opportunities for conversation in daily life.

To address this problem, some studies have attempted to give users opportunities for daily conversation through communication agents, such as interactive robots with a physical body or virtual agents on a smartphone or tablet. Kanda et al. proposed that it is important to build a trusting relationship between people and interactive agents so that the agents are accepted as social partners in daily life [1].

Minami et al. [2] developed a TV chat robot that replies quickly enough to seem spontaneous by making comments derived from social media. This robot was developed with the aim of promoting continuous dialog with a user. It chats while watching TV with the user and can respond to what the user says as quickly as a human. Its dialog function composes comments on what is being broadcast on the TV from texts taken from social media. This system embodies “sociality” and “favorite information” and so, it was proposed, would be able to motivate the user to communicate with a chat robot continuously. However, a mismatch arises between the topic of the TV program and the content of the robot's utterances, because of the time lag between the robot's acquisition of social media texts related to the TV show and the verbal output of the chatting robot. Also, the user may find it difficult to hear the TV and the robot when they are both “speaking” at the same time.

This paper develops atmosphere sharing methods for TV chat agents using SNS comments submitted by users who are watching TV. It also reports evaluation experiments that investigate the effectiveness of agent behavior in exciting and laughing scenes. The experiments compare the effects on users across conditions with and without atmosphere sharing, using physical robots or virtual agents.

2 Related Work

For a chat robot to increase the speaking opportunities of elderly and young people, it needs to provide suitable opportunities for dialog. Miyazawa et al. [3] identified the factors that promote a user's motivation to engage in daily communication with robots. Two factors are required for effective interaction. One is “sociality,” which means the social nature or tendencies created by a user interacting with SNS communities via a chatting robot. To promote sociality, it is important to give the user a sense that the robot is able to listen to him or her. The other is “favorite information,” which means providing unexpected or new information. Miyazawa et al. established that both factors should be incorporated into a robot whose function is to communicate. From this point of view, we classify and summarize ongoing research on communication robots under these two headings.

Minami et al. [2] aim to improve the sociality and favorite information of robots to encourage daily dialog in the long term. To realize this, they developed a TV chat robot by combining the method of Kobayashi et al. [6], which produces a smooth dialog system, with the method of Takahashi et al. [7], which generates an interesting one. This robot has four dialog functions: backchannel [4], repetition [6], machine answering [5] and social media comments [7]. It was found that the motivation of users to keep using the robot is enhanced more when all the functions are combined than when the robot embodies each function alone or a combination of some functions. The problems with this system are indicated in Fig. 2. One of the dialog functions, the “social media comments” function, takes time to execute. The system needs to process comments written on social media to arrive at the corresponding utterance of the chatting robot. Due to this time delay, there is a mismatch between the topic of the TV program (which, by the time the utterance is ready, may have moved on to another topic) and what the robot says. Also, the television and chat robot may end up speaking simultaneously, making it difficult for the user to attend to both. These problems may reduce the motivation of the user to interact with the robot.

This paper develops atmosphere sharing methods for TV chat agents using SNS comments submitted by users who are watching TV. Section 2 reviews related work on interaction technology with agents for continuous conversation. An overview of the proposed method is given in Sect. 3. The experimental results for interactive agents with atmosphere sharing are presented in Sect. 4, and the paper is concluded in Sect. 5 (Fig. 1).

Fig. 1. TV chat agent system.

3 Development of Atmosphere Sharing with Agents

3.1 Atmosphere Sharing with TV Chat Agents

This research develops a method for sharing atmosphere between the user and agents during TV watching, where the agents are interactive robots (physical agents) and CG agents (virtual agents). Atmospheres generally include anger, sadness, fear, excitement, and so on. This study focuses on exciting and laughing scenes because we assume that these situations can affect the user's motivation for conversation. We use sports and comedy show programs for atmosphere sharing with TV chat agents. To determine the agent's behavior for atmosphere sharing, the system has to estimate the timing and level of the exciting or laughing situation in the TV program. In general, there are two approaches to estimating them. The first is to detect the atmosphere of the TV program by analyzing SNS comments submitted by TV viewers. The general atmosphere of the TV program can be estimated from the reactions of TV viewers on SNS. The second is to determine them by directly sensing the reaction of each TV viewer. This approach can estimate the atmosphere individually because it is analyzed from the reaction of each TV viewer. In other words, the approach can personalize atmosphere estimation. However, it is difficult to precisely estimate a user's reaction from the appearance of the TV viewer. For this reason, this study employs the first approach, which determines the atmosphere of the TV program. In this paper, the effects on users are evaluated when they watch TV with agents whose behaviors are decided from the estimated atmosphere.

3.2 Atmosphere Estimation of TV Program

This study handles exciting and laughing scenes as the atmosphere of a TV program for TV chat agents. This section describes the methods for estimating the timing of changes in atmosphere status and the level of the atmosphere status.

Timing Estimation of Atmosphere Status Changing. We assume that the atmosphere of a TV program is decided by the emotion of TV viewers. Therefore, to determine the atmosphere, the emotion of TV viewers is estimated by analyzing the SNS comments they submit in real time. An exciting scene can be estimated by detecting periods in which SNS comments suddenly increase, because the magnitude of an exciting scene in a TV program generally depends on the number of SNS comments. Related work has proposed detecting important scenes from the number of comments on Twitter. In this study, exciting scenes are detected by Eqs. (1) and (2).

$$\begin{aligned} Threshold_{C}&= \mu + 2\rho \end{aligned}$$
(1)
$$\begin{aligned} C_i > Threshold_{C}&: exciting \ scene \end{aligned}$$
(2)

where \(C_i\) is the number of comments in each 5 s interval, and \(\mu \) and \(\rho \) are the average and standard deviation, respectively, of that number over the periods of the past 15 min that are not laughing scenes.
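As a minimal sketch of Eqs. (1) and (2), the following Python code computes the threshold from a 15 min history of 5 s comment counts and flags an exciting scene when the current count exceeds it, which is the reading consistent with the "sudden increase" described in the text and with level 0 in Eq. (5). The function names, the `excluded` flag list, the use of the population standard deviation, and the handling of short histories are assumptions made for illustration, not details given in the paper.

```python
from statistics import mean, pstdev

BIN_SECONDS = 5                        # counts are taken every 5 s (from the text)
WINDOW_BINS = 15 * 60 // BIN_SECONDS   # past 15 min corresponds to 180 bins


def scene_threshold(counts, excluded, k):
    """Return mu + k * rho over the most recent WINDOW_BINS bins.

    counts   -- per-bin comment counts, oldest first
    excluded -- parallel list of bools; True marks a bin left out of the
                baseline (e.g. a bin already labelled as a laughing scene)
    k        -- multiplier on the standard deviation (2 in Eq. (1))
    """
    recent = list(zip(counts, excluded))[-WINDOW_BINS:]
    baseline = [c for c, skip in recent if not skip]
    if len(baseline) < 2:
        return None  # assumption: too little history to estimate a baseline
    # Assumption: population standard deviation; the paper does not say which.
    return mean(baseline) + k * pstdev(baseline)


def is_exciting(current_count, counts, laughing_flags):
    """Eq. (2): the current bin is exciting if its count exceeds the threshold."""
    threshold = scene_threshold(counts, laughing_flags, k=2)
    return threshold is not None and current_count > threshold
```

The same helper could be reused with `k=1` and the exciting-scene bins excluded to obtain the laughing-scene threshold, since only the multiplier and the excluded periods differ.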

The laughing scenes are detected by Eqs. (3) and (4).

$$\begin{aligned} Threshold_{L1}&= \mu + \rho \end{aligned}$$
(3)
$$\begin{aligned} L_i > Threshold_{L1}&: laughing \ scene \end{aligned}$$
(4)

where \(L_i\) is the number of comments expressing laughter in each 5 s interval, and \(\mu \) and \(\rho \) are the average and standard deviation, respectively, of that number over the periods of the past 15 min that are not exciting scenes.
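The count \(L_i\) depends on deciding which comments express laughter, which the paper does not specify. The sketch below bins timestamped comments into 5 s intervals and counts those that contain a laugh marker. The marker list, the `(timestamp, text)` input format, and the function names are hypothetical choices for illustration only.

```python
from collections import Counter

BIN_SECONDS = 5  # counts are taken every 5 s (from the text)

# Hypothetical laugh markers; the paper does not list the expressions it uses.
LAUGH_MARKERS = ("lol", "haha", "笑", "www")


def expresses_laugh(text):
    """Return True if the comment contains any of the assumed laugh markers."""
    lowered = text.lower()
    return any(marker in lowered for marker in LAUGH_MARKERS)


def laugh_counts(comments, start, n_bins):
    """Count laugh comments per 5 s bin.

    comments -- iterable of (timestamp_seconds, text) pairs
    start    -- timestamp of the beginning of bin 0
    n_bins   -- number of bins to return
    """
    per_bin = Counter()
    for timestamp, text in comments:
        index = int((timestamp - start) // BIN_SECONDS)
        if 0 <= index < n_bins and expresses_laugh(text):
            per_bin[index] += 1
    return [per_bin[i] for i in range(n_bins)]
```

Each resulting count can then be compared with the threshold of Eq. (3). Simple substring matching is only a stand-in here; a real system would likely need language-specific handling of laugh expressions.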

Level Estimation of Atmosphere Status. We also assume that exciting and laughing scenes have levels. This section defines the levels of the exciting and laughing scenes and describes how they are estimated. In this study, we define four levels for each. The levels of exciting scenes are estimated by Eqs. (5) to (8), and those of laughing scenes by Eqs. (9) to (12).

$$\begin{aligned} C_i < Threshold_{C}&: level0 \end{aligned}$$
(5)
$$\begin{aligned} Threshold_{C}< C_i < 1.3\times Threshold_{C}&: level1 \end{aligned}$$
(6)
$$\begin{aligned} 1.3\times Threshold_{C}< C_i < 1.6\times Threshold_{C}&: level2 \end{aligned}$$
(7)
$$\begin{aligned} 1.6\times Threshold_{C} < C_i&: level3 \end{aligned}$$
(8)

where \(C_i\) is the number of comments in each 5 s interval and \(Threshold_{C}\) is the threshold given by Eq. (1).

$$\begin{aligned} L_i < Threshold_{L}&: level0 \end{aligned}$$
(9)
$$\begin{aligned} Threshold_{L}< L_i < 1.3\times Threshold_{L}&: level1 \end{aligned}$$
(10)
$$\begin{aligned} 1.3\times Threshold_{L}< L_i < 1.6\times Threshold_{L}&: level2 \end{aligned}$$
(11)
$$\begin{aligned} 1.6\times Threshold_{L} < L_i&: level3 \end{aligned}$$
(12)

where \(L_i\) is the number of comments expressing laughter in each 5 s interval and \(Threshold_{L}\) is the laughing-scene threshold estimated above (written \(Threshold_{L1}\) in Eq. (3)).
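The level rules in Eqs. (5) to (12) share one form, so they can be sketched as a single Python function applied to either \(C_i\) or \(L_i\). The equations use strict inequalities and leave counts exactly on a boundary unassigned; this sketch assumes such counts fall into the higher level. The function name is hypothetical.

```python
def atmosphere_level(count, threshold):
    """Map a 5 s comment count to level 0-3 following Eqs. (5)-(12).

    Assumption: a count exactly equal to a boundary value is assigned to
    the higher level, since the equations leave that case unspecified.
    """
    if count < threshold:
        return 0
    if count < 1.3 * threshold:
        return 1
    if count < 1.6 * threshold:
        return 2
    return 3
```

For example, with a threshold of 10, counts of 9, 12, 14, and 17 give levels 0, 1, 2, and 3, respectively.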

3.3 Behavior Determination

This section explains how agent behavior is controlled in exciting and laughing scenes for atmosphere sharing. There are two types of interactive agents: physical and virtual. A virtual agent is drawn by computer graphics on a smartphone or tablet. A physical agent is a real interactive robot with a physical body. Both agents are human-like in appearance. The smartphone application “Davelive” (amirbo tech Inc.) is used for the virtual agents, as shown in Fig. 2. The outer body of the interactive robot “Kabo-chan” (PIP Inc.) is used for the physical robots. Two servo motors are installed at the shoulder joints of each physical robot to wave its arms.

The TV chat engine provided by [2], which gathers SNS comments in real time, is used to decide the utterance contents. A server of the engine sends the utterance contents and the atmosphere information estimated as described in the previous section every seven seconds. Clients, such as a smartphone that presents the virtual agents or a small computer that controls the physical robots, speak based on the information received from the chat engine server.
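The paper does not describe the format of the information sent from the server to the clients. Purely as an illustration, the following sketch shows one way such a message, carrying an utterance together with the estimated atmosphere, could be encoded and decoded as JSON. The class name, field names, and scene labels are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass

SEND_INTERVAL_SECONDS = 7  # the server sends information every seven seconds


@dataclass
class AtmosphereMessage:
    """Hypothetical payload; the field names are not taken from the paper."""
    utterance: str   # comment text the agent should speak
    scene: str       # assumed labels: "exciting", "laughing", or "none"
    level: int       # atmosphere level 0-3


def encode(message):
    """Serialize a message to a JSON string, keeping non-ASCII text readable."""
    return json.dumps(asdict(message), ensure_ascii=False)


def decode(payload):
    """Rebuild a message from its JSON string."""
    return AtmosphereMessage(**json.loads(payload))
```

A client would decode each received payload and pass the scene and level to its behavior control, while the transport itself (sockets, HTTP polling, and so on) is left open here.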

Fig. 2. Agent behavior in exciting scene.

In an exciting scene, the atmosphere is expressed by having multiple agents speak at the same time. The behavior of the agents is controlled according to the level as follows.

  • level 0: one agent utters one comment.

  • level 1: two agents utter one comment.

  • level 2: three agents utter one comment.

  • level 3: five agents utter one comment.

In the case of virtual agents, the number of virtual robots that appear depends on the level, as shown in Fig. 2. In the case of physical agents, five physical robots are prepared.
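The level-to-behavior rule for exciting scenes amounts to a small lookup table. The sketch below selects which agents speak at the same time for a given level. The agent identifiers and the function name are hypothetical, and the choice of the first agents in the list is an arbitrary assumption.

```python
# Number of agents that speak simultaneously at each exciting level
# (taken from the list in the text).
EXCITING_AGENT_COUNT = {0: 1, 1: 2, 2: 3, 3: 5}


def exciting_speakers(level, agents):
    """Return the agents that utter a comment at the given level.

    agents -- list of available agent identifiers (five in the experiments)
    """
    needed = EXCITING_AGENT_COUNT[level]
    if needed > len(agents):
        raise ValueError("not enough agents for this level")
    # Assumption: the first `needed` agents are used; the paper does not
    # say how the speaking agents are chosen.
    return agents[:needed]
```

With five agents named `a1` to `a5`, level 2 would select `a1`, `a2`, and `a3`.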

In a laughing scene, the atmosphere is likewise expressed by having multiple agents speak at the same time. In addition, three levels of laughing voice (small, medium, and loud) are set up in advance. The behavior of the agents is controlled according to the level as follows.

  • level 0: one agent utters one comment.

  • level 1: one agent utters one comment and one other agent gives a small laugh.

  • level 2: one agent utters one comment and two other agents give a medium laugh.

  • level 3: one agent utters one comment and four other agents give a loud laugh.

In the case of virtual agents, the number of virtual robots that appear depends on the level, as shown in Fig. 3.
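For laughing scenes, one agent speaks while a level-dependent number of other agents laugh at a preset volume. A minimal sketch of that rule is given below. The dictionary layout, the returned structure, and the convention that the first agent is the speaker are assumptions for illustration.

```python
# (number of laughing agents, laugh volume) for each laughing level,
# taken from the list in the text; level 0 has no laughter.
LAUGH_PLAN = {0: (0, None), 1: (1, "small"), 2: (2, "medium"), 3: (4, "loud")}


def laughing_plan(level, agents):
    """Return who speaks, who laughs, and how loudly, for the given level.

    agents -- list of available agent identifiers; the first one is
              assumed to be the speaker.
    """
    n_laughers, volume = LAUGH_PLAN[level]
    if 1 + n_laughers > len(agents):
        raise ValueError("not enough agents for this level")
    return {
        "speaker": agents[0],
        "laughers": agents[1:1 + n_laughers],
        "laugh": volume,
    }
```

At level 3 with five agents, the first agent utters the comment and the remaining four give a loud laugh.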

Fig. 3. Agent behavior in laughing scene.

4 Evaluation Experiments of Atmosphere Sharing

This section describes experiments on atmosphere sharing during TV watching with virtual and physical chat agents, and the effects of the agents on users.

4.1 Experimental Environment

For the exciting scene, a football game is selected as the TV program because the timing of exciting periods can be detected easily and there are only small differences among individuals. In these experiments, the broadcast of the football game between Japan and Uruguay on October 16th, 2018 is used for the exciting scene. For the laughing scene, a Japanese comedy show is selected as the TV program content, as shown in Fig. 4. The content used in the experiments is prepared by collecting SNS comments about these TV programs. As physical agents, customized physical robots are used, and each robot wears clothes of a different color so that the robots can be told apart easily. The voice of each robot is synthesized using the VoiceText Web API. The behaviors of the agents are decided in advance for repeatability.

As shown in Fig. 5, we asked 11 subjects to use the chat agents while watching the TV programs under each of the following four conditions.

  • without atmosphere sharing, virtual agents

  • with atmosphere sharing, virtual agents

  • without atmosphere sharing, physical agents

  • with atmosphere sharing, physical agents

Fig. 4. Movie contents in laughing scene with physical and virtual agents.

Fig. 5. Experimental environment with physical and virtual agents.

Subjects answered a questionnaire after each TV watching session. The questionnaire items, each rated on a 7-point Likert scale, were as follows.

In exciting scene:

  • Do you feel excitement among the robots?

  • Do you feel friendliness?

  • Do you feel no strangeness?

  • Do you feel motivated to use the robots?

  • Do you feel empathy?

In laughing scene:

  • Do you feel fun among the robots?

  • Do you feel fun with the robots?

  • Do you feel friendliness?

  • Do you feel empathy?

  • Do you feel no strangeness?

  • Do you feel motivated to use the robots?

Fig. 6. Results of subjective evaluation in exciting scene.

Fig. 7. Results of subjective evaluation in laughing scene.

4.2 Results and Discussion

Figure 6 shows the results of the subjective evaluation of atmosphere sharing in the exciting scene. With virtual agents, the scores with atmosphere sharing were compared with those without it, and a significant difference (p < 0.05) was found. With atmosphere sharing provided, the scores for virtual agents were compared with those for physical agents, and a significant difference (p < 0.05) was also found. These results indicate that, in the exciting scene, atmosphere sharing is rated better than no sharing regardless of the appearance of the agents.

Figure 7 shows the results of the subjective evaluation of atmosphere sharing in the laughing scene. With both virtual and physical agents, the scores with atmosphere sharing were compared with those without it, and a significant difference (p < 0.05) was found. With atmosphere sharing provided, the scores for virtual agents were compared with those for physical agents, and a significant difference (p < 0.05) was also found. According to these results, the condition with virtual agents scores higher than the one with physical agents. We think the reason is that the virtual agent system has some functions that the physical one lacks. For example, only the virtual agents have facial expressions, and their utterances are also displayed as text on the smartphone.

5 Conclusion

This paper has proposed an atmosphere sharing method for a chatting agent used during TV watching, in order to improve the user's motivation for conversation with the agent. We have developed a TV chat agent that holds light conversation with the user during TV watching. To increase the user's motivation for conversation with the agent, the agent gives interesting and funny talk by using SNS comments provided by users who are watching the TV program. This study attempts to share atmosphere, such as scenes of enthusiasm or of bursting into laughter, through the TV chat agent. The behavior of the chat agents is decided by estimating the atmosphere of the TV audience from SNS comments about the TV program. This makes it possible to dynamically change the amount of the robot's utterances according to the excitement and laughing periods and their levels, and we consider that more natural excitement can be realized in this way. By comparing the excitement of people with that of SNS during TV watching, their features are extracted and characterized. In addition, by analyzing the features that promote an impression of excitement, appropriate SNS comments for TV chat agents can be selected. A subjective evaluation by users of robot utterances reflecting each feature confirmed, with a significant difference, that human utterances containing many sensuous verbs give the user an impression of excitement.