1 Introduction

The greeting is one of the most familiar communicative behaviors in everyday life. Identifying the rules and trends seen in greetings can offer important basic resources for understanding face-to-face interactions.

Yamamoto et al. (2004) looked at the timing between “bows” and “utterances” when Japanese people meet and greet each other and showed that bows preceded utterances. Through experiments, Kobayashi et al. (2013) showed the most comfortable utterance timing when responding to a greeting. In our daily greeting behavior, the people involved in greeting interactions seem to share the timing patterns considered appropriate for them.

In the past, studies on greeting behavior were conducted using robots, CG characters, etc. by making them bow (Yamamoto et al. 2004, Shibata et al. 2014) or by using an audio-response system (Kobayashi et al. 2013), often rigorously controlling the greeting condition. Yamamoto et al. (2006) actually measured bows by Japanese people and quantified their behavioral characteristics. However, they examined “extrinsic bows” by giving advance directions to the subjects. These earlier studies, therefore, eliminated errors but cannot be said to represent the “spontaneous bowing” behavior that Japanese people do in their daily life.

Our study, on the other hand, examined spontaneous greetings by the people responding to an initial greeting. Generally speaking, Japanese people’s greetings involve both vocal utterances and bowing (Tanaka 1989). In our study, we divided the greeting behaviors into “bows” and “utterances” and analyzed the timing of their occurrence. In particular, we focused on quantifying the changes in the respondent’s behavior according to the greeting patterns of the initiator, i.e., greeting generation patterns such as a bow followed by an utterance, an utterance followed by a bow, and a simultaneous occurrence of a bow and utterance. The purpose of our study is to examine the characteristic aspects of greeting behavior.

2 Experiment Overview

2.1 Experiment Procedure

This study sheds light on the spontaneous greetings by responding subjects, looking at how they are generated in different greeting patterns where the bow and utterance timings vary. We conducted a simulated interview experiment (Fig. 1). The pre-assigned interviewer (a woman) first greeted the subject by saying, “Yoroshiku onegai shimasu. (Thank you for participating in our experiment.)” After the interview, she greeted the subject by saying, “Arigato gozaimashita. (Thank you for your cooperation.)” The interviewer used the following three timing patterns in her bows and utterances:

Fig. 1. Interview experiment

  • Bow first: The interviewer greets the subject first with a bow, followed by an utterance.

  • Utterance first: The interviewer utters the greeting first, followed by a bow.

  • Simultaneous bow and utterance: The interviewer greets the subject with a simultaneous bow and utterance.

The interviewer wore the same clothing throughout the experiment. She was instructed to “Start the utterance after completely finishing the bow” in the Bow First pattern, and to “Start bowing after completely finishing the utterance” in the Utterance First pattern. The interviewer practiced her patterns about 20 times in advance so that her behavior would be uniform throughout the experiment.

We measured and quantitatively analyzed the spontaneous greetings by the responders (subjects) as well as the timing of their behaviors.

2.2 The Subjects of Our Experiment

The subjects of our experiment were thirty university students (15 male and 15 female students, mean age = 20.0, SD = 1.50) meeting each other for the first time. We assigned five male and five female students to each of the three experimental conditions.

2.3 Procedure

After the subjects were briefed on the experiment procedure in an anteroom, they were asked to sign an agreement for participation in our experiment. Each subject was then led into the experiment room and to a chair. When the subject entered the experiment room, the interviewer was waiting, standing at the back of the room. The interviewer and the subject then stood facing each other at a distance of 115 cm.

After the interviewer’s greeting “Yoroshiku onegai shimasu”, they both sat down and conducted a simulated interview for five minutes. After the interview, both stood up and, following the interviewer’s parting greeting, “Arigato gozaimashita,” the subject left the experiment room. The interviewer used one of the greeting patterns shown in Sect. 2.1. Each subject’s response to the interviewer’s greeting was recorded by motion capture and wireless microphone.

2.4 Data Extraction

We obtained the characteristic bowing behavior data by using an optical motion capture system. We placed reflective markers on three body parts (Fig. 2): the top of the head, the neck (seventh cervical vertebra), and the lower back (sacral bone). We obtained time-series 3D coordinates of these markers at 120 fps.

Fig. 2. Position of reflective markers

2.5 Analytical Indicator of Bows

From the 3D coordinates we obtained by motion capture, we extracted different types of data: Bow Length, Bow Bending Angle, Delayed Bow Time and Delayed Utterance Time for every greeting condition.

  • Bow Length

    The “Bow Length” is defined as the time lapse between the responding subject starting to lower his/her head in a bow and returning the head to the original position, i.e., the end of the bow (Fig. 3).

    Fig. 3. Bow length

  • Bow Bending Angle

    The “Bow Bending Angle” is defined as the angle (θ) at the deepest point of the responding subject’s bow, measured from the original position, i.e., standing straight (Fig. 4).

    Fig. 4. Bow bending angle

  • Delayed Bow Time

    The time lag between the bow start by the interviewer and the responding subject is defined as the “Delayed Bow Time” (Fig. 5). We measured the exact time lapse between the interviewer’s bow start time and the responding subject’s bow start time.

    Fig. 5. Delayed bow time

  • Delayed Utterance Time

    The length of time between the interviewer’s utterance start and the subject’s bow start is defined as the “Delayed Utterance Time” (Fig. 6).

    Fig. 6. Delayed utterance time

Table 1 shows the basic statistics of the Bow Length and Bow Bending Angle of the interviewer. Paired t-tests did not show any significant difference in the interviewer’s Bow Length or Bow Bending Angle at the .05 significance level. Therefore, it was determined that the interviewer’s bows were properly controlled.

Table 1. Basic statistics for the interviewer’s bows
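As an illustration of how the Bow Length and Bow Bending Angle defined in this section could be derived from the marker coordinates, the following is a minimal sketch and not the procedure actually used in the experiment. It assumes that the trunk angle is measured on the line from the sacrum marker to the head marker, that the z axis points upward, that the first frame shows the upright posture, and that a bow lasts while the lean exceeds a hypothetical threshold `onset_deg`. None of these details is reported in the text. The marker tracks are synthetic.

```python
import math

FPS = 120  # capture rate reported in Sect. 2.4


def trunk_angle_deg(head, sacrum):
    """Angle between the sacrum-to-head line and the vertical (z) axis."""
    dx, dy, dz = (h - s for h, s in zip(head, sacrum))
    return math.degrees(math.atan2(math.hypot(dx, dy), dz))


def bow_indicators(head_track, sacrum_track, onset_deg=5.0, fps=FPS):
    """Return (bow_length_ms, bending_angle_deg, start_frame) for one bow.

    onset_deg is a hypothetical threshold: the subject counts as bowing
    while the trunk leans more than onset_deg beyond the first frame,
    which is assumed to show the upright standing posture.
    """
    angles = [trunk_angle_deg(h, s) for h, s in zip(head_track, sacrum_track)]
    relative = [a - angles[0] for a in angles]
    bowing = [i for i, a in enumerate(relative) if a > onset_deg]
    if not bowing:
        return None  # no bow detected in this recording
    start, end = bowing[0], bowing[-1]
    bow_length_ms = (end - start + 1) * 1000.0 / fps
    return bow_length_ms, max(relative), start


def synthetic_bow(peak_deg=45.0, bow_frames=144, rest_frames=60):
    """Made-up marker tracks: stand still, bow to peak_deg, stand still."""
    profile = ([0.0] * rest_frames
               + [peak_deg * math.sin(math.pi * k / bow_frames)
                  for k in range(bow_frames + 1)]
               + [0.0] * rest_frames)
    sacrum = [(0.0, 0.0, 1.0)] * len(profile)
    head = [(0.7 * math.sin(math.radians(t)), 0.0,
             1.0 + 0.7 * math.cos(math.radians(t))) for t in profile]
    return head, sacrum


head, sacrum = synthetic_bow()
length_ms, angle_deg, start_frame = bow_indicators(head, sacrum)
print(f"Bow Length: {length_ms:.0f} ms, Bow Bending Angle: {angle_deg:.1f} deg")
```

The returned start frame, converted to milliseconds, is the subject-side value that the Delayed Bow Time and Delayed Utterance Time would be measured from. The threshold trades sensitivity to small postural sway against a slightly late onset estimate, so any real analysis would need to justify its own value.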

3 Results

We now describe the results of our experiment. The data cover the three greeting patterns used by the interviewer, the corresponding responses by the subjects, and the characteristic features of the bows. Our experiment did not show any gender differences; therefore, gender differences are not discussed in this report.

3.1 Response Greeting Patterns

Our experiment revealed that all of the study subjects spontaneously greeted the interviewer, before and after the interview, and that all of these greetings involved both bows and utterances. First, we classified the order of the response greetings: A bow followed by an utterance; an utterance followed by a bow; and a bow and an utterance occurring simultaneously.
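The three-way classification described above can be sketched from the onset times of the bow and the utterance. The text does not say how close two onsets must be to count as simultaneous, so the `tolerance_ms` parameter below is purely an assumption for illustration, as are the function name and the example onset times.

```python
def classify_order(bow_start_ms, utterance_start_ms, tolerance_ms=100.0):
    """Label a greeting by which component starts first.

    tolerance_ms is a hypothetical window within which the two onsets
    are treated as simultaneous; no such value is given in the paper.
    """
    diff = utterance_start_ms - bow_start_ms
    if abs(diff) <= tolerance_ms:
        return "simultaneous"
    return "bow first" if diff > 0 else "utterance first"


# Made-up onset times in milliseconds, for illustration only.
print(classify_order(0.0, 900.0))    # bow starts well before the utterance
print(classify_order(800.0, 0.0))    # utterance starts well before the bow
print(classify_order(100.0, 150.0))  # onsets fall within the tolerance
```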

When a subject’s response greeting pattern was the same as that of the interviewer, the response was classified as “matching”. When it differed from that of the interviewer, the response was classified as “unmatching” (Fig. 7). To examine whether the rates of “matching” and “unmatching” responses differed among the greeting conditions, we conducted Fisher’s exact test separately for the greetings before and after the interview. The results showed a significant difference among the greeting conditions both before and after the interview (before the interview: p < .01; after the interview: p < .001), with similar trends at both times. About half of the subjects produced the bow and the utterance in the same order as the interviewer, both before and after the interview.
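For readers unfamiliar with exact tests on tables larger than 2 × 2, the sketch below shows one way such a test on a greeting condition (3) × matching (2) table could be computed. It uses the Freeman-Halton extension of Fisher’s exact test, which is an assumption, since the text does not say which variant or software was used. The counts in `illustrative_counts` are invented placeholders and are not the values of Table 2.

```python
from itertools import product
from math import comb


def exact_test_rx2(table):
    """Two-sided exact p-value for an r x 2 table (Freeman-Halton).

    Sums the probabilities of all tables with the same row and column
    totals that are no more likely than the observed table.
    """
    row_totals = [a + b for a, b in table]
    col1_total = sum(a for a, _ in table)
    denom = comb(sum(row_totals), col1_total)

    def prob(first_col):
        # Multivariate hypergeometric probability with fixed margins.
        ways = 1
        for a, r in zip(first_col, row_totals):
            ways *= comb(r, a)
        return ways / denom

    observed = prob([a for a, _ in table])
    p_value = 0.0
    for first_col in product(*(range(r + 1) for r in row_totals)):
        if sum(first_col) != col1_total:
            continue
        p = prob(first_col)
        if p <= observed * (1 + 1e-9):  # small slack for float rounding
            p_value += p
    return p_value


# Rows: bow first, utterance first, simultaneous (10 subjects each).
# Columns: matching, unmatching. Placeholder counts, NOT the paper's data.
illustrative_counts = [[5, 5], [0, 10], [9, 1]]
print(f"p = {exact_test_rx2(illustrative_counts):.4f}")
```

Full enumeration is practical here because each condition has only ten subjects. For larger samples a statistics package would be the more sensible choice.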

Fig. 7. Response greeting patterns

Table 2 shows the response greetings before and after the interview. When the interviewer bowed first, close to half of the responding subjects also bowed first, before uttering a greeting. The remaining half of the responding subjects produced a bow and an utterance simultaneously. On the other hand, when the interviewer greeted the subject with an utterance first followed by a bow, the majority of the responding subjects produced a bow and an utterance simultaneously. Similarly, when the interviewer started her greeting with a simultaneous bow and utterance, the majority of the responding subjects also responded with a simultaneous bow and utterance.

Table 2. Cross table of order of greeting behavior (3) × matching trend (2)

From the above results, it was clear that the “bow” and “utterance” generating pattern of the responding subject did not always match that of the interviewer. The responses tended to match when the interviewer greeted with a simultaneous “bow” and “utterance”, which suggests that everyday greetings typically consist of a “bow (bowing behavior)” and an “utterance (greeting)” produced together. When the interviewer started either the “bow” or the “utterance” before the other action, the subjects’ responses often did not match that of the interviewer. Of these two conditions, the interviewer’s “bow first” behavior more often produced a matching response, so we can say that the subjects’ greetings tend to be affected more by the physical movement than by the utterance.

3.2 Quantifying the Response Greeting

The preceding section revealed how the responding subjects generated bows and utterances according to the different greeting patterns of the interviewer. In this section, we look at the “bowing action” to see how it changed with the three different greeting patterns of the interviewer.

To examine whether the bowing action changes with the greeting condition of the interviewer, we conducted two-way factorial analyses of variance with “before and after the interview” (2) and “greeting condition” (3) as factors and each bow analytical index shown in Sect. 2.5 as the dependent variable.

With regard to the Bow Length and Bow Bending Angle, the main effect of before and after the interview was significant (F (1, 27) = 25.616, p < .001 and F (1, 27) = 30.094, p < .001, respectively). For both indices, the main effect of the greeting condition and the two-way interaction were not significant. The response bows were therefore longer and deeper after the interview than before it, regardless of the interviewer’s greeting pattern (Figs. 8 and 9). The mean Bow Length of the responding subjects was 1233 ms before the interview and 1575 ms after the interview, i.e., the bows were longer by about 340 ms. The mean Bow Bending Angle was 47° before the interview and 52° after the interview, i.e., the bows became deeper by 5°.

Fig. 8. Result of bow length

Fig. 9. Result of bow bending angle

For the Delayed Bow Time, we observed a significant main effect of the greeting condition (F (2, 27) = 16.66, p < .05). The main effect of before and after the interview and the interaction, however, were not significant. Multiple comparisons showed significant differences between “utterance first” and “simultaneous bow and utterance” (p < .05) and between “utterance first” and “bow first” (p < .05). Under the “bow first” and “simultaneous bow and utterance” conditions, the responding subjects’ bows started about 600 ms after the interviewer’s bow start. On the other hand, under the “utterance first” condition, the responding subjects started their bows before the interviewer began bowing (Fig. 10).

Fig. 10. Result of delayed bow time

From the above, it was shown that the responding subject’s bowing timing was affected by the greeting generating order of the interviewer.
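The degrees of freedom of the F statistics reported in this section can be checked against the sample described in Sect. 2.2. The derivation below assumes a mixed design in which the greeting condition is a between-subjects factor (a = 3 groups, N = 30 subjects) and before/after the interview is a within-subjects factor (b = 2 levels). The text does not state the design explicitly, so this is an inference from the reported values and not a description of the authors’ analysis.

```latex
\begin{align*}
df_{\text{condition}} &= a - 1 = 3 - 1 = 2, \\
df_{\text{error, between}} &= N - a = 30 - 3 = 27, \\
df_{\text{before/after}} &= b - 1 = 2 - 1 = 1, \\
df_{\text{interaction}} &= (a - 1)(b - 1) = 2 \cdot 1 = 2, \\
df_{\text{error, within}} &= (N - a)(b - 1) = 27 \cdot 1 = 27.
\end{align*}
```

Under this assumption, the greeting condition is tested with F (2, 27) and the before/after effect with F (1, 27), which agrees with the values reported for the Delayed Bow Time and for the Bow Length and Bow Bending Angle.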

3.3 Response Greeting Timing Structure

To find what prompts the subject’s bow, we examined the Delayed Bow Time and the Delayed Utterance Time. The Delayed Bow Time is the time lag between the interviewer’s bow start and the responding subject’s bow start. The Delayed Utterance Time is the time lag between the interviewer’s utterance start and the responding subject’s bow start, as defined in Sect. 2.5. Figure 11 shows the mean values of these indices.

Fig. 11. The mean values of delay time

Figure 11 shows that under the “bow first” condition the Delayed Utterance Time is negative, indicating that the subjects started bowing before the interviewer’s utterance. This means that the subjects used the interviewer’s bow as the starting point. Under the “utterance first” condition, the Delayed Bow Time is negative, indicating that the subjects used the interviewer’s utterance as the starting point. Under the “simultaneous bow and utterance” condition, either could have been the starting point. Because a lower standard deviation indicates less variation in the subjects’ behavior relative to a given starting point, we compared the standard deviations of the two indices and considered that the subjects used the interviewer’s bow as the starting point.
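The reasoning in the preceding paragraph can be summarized as a small decision rule: an interviewer event cannot be the cue if the subjects started bowing before it (negative mean delay), and when both events remain possible, the one with the smaller standard deviation is preferred. The sketch below is only one reading of that reasoning. The function names and all delay values are hypothetical and do not reproduce the measurements shown in Fig. 11.

```python
from statistics import mean, stdev


def plausible_cue(bow_delays_ms, utterance_delays_ms):
    """Guess which interviewer event the subjects' bows were timed to.

    bow_delays_ms: Delayed Bow Time per subject (subject bow start
    minus interviewer bow start).
    utterance_delays_ms: Delayed Utterance Time per subject (subject
    bow start minus interviewer utterance start).
    """
    spread = {}
    for name, delays in (("bow", bow_delays_ms),
                         ("utterance", utterance_delays_ms)):
        if mean(delays) >= 0:  # the event came before the response
            spread[name] = stdev(delays)
    if not spread:
        return None
    # Prefer the event relative to which the responses vary least.
    return min(spread, key=spread.get)


# Made-up delays in milliseconds, three subjects per condition.
print(plausible_cue([550, 600, 650], [-900, -700, -800]))  # bow first
print(plausible_cue([-300, -100, -200], [700, 800, 900]))  # utterance first
print(plausible_cue([580, 600, 620], [500, 600, 700]))     # simultaneous
```

Comparing standard deviations is a heuristic and not a formal test, so with real data the difference in spread would itself need to be assessed.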

Under the “bow first” and “simultaneous bow and utterance” conditions, in which the interviewer’s bow was the starting point, the subjects started their bows about 500–600 ms after the interviewer’s bow start. Under the “utterance first” condition, however, the subjects started bowing about 800 ms after the interviewer’s utterance start.

These results indicate that the subjects returned bows prompted by the interviewer’s bow under the “bow first” and “simultaneous bow and utterance” conditions, in which the interviewer’s bow did not start later than her utterance. The subjects’ bowing timing thus varied with the interviewer’s greeting condition, i.e., “utterance first” or “bow first”.

4 Discussion

4.1 Bow and Utterance Timing

The results of our experiment revealed that the responding subjects spontaneously returned a greeting under all experimental conditions, though not always using the same pattern as the interviewer. They also revealed that the simultaneous generation of a bow and an utterance was the most frequent pattern. Kinemuchi et al. (2014) showed that the greeting speed of the responding subject changed with that of the interviewer. However, our experiment showed that the responding subject’s bow-and-utterance generation pattern did not necessarily match that of the interviewer. Only when the interviewer bowed first did about half of the responding subjects bow first as well. This indicates that the responding subjects start bowing even if the interviewer has not yet said anything, which seems to reflect the Japanese greeting manner of “silently bowing to each other”. In business and as social etiquette, “an utterance followed by a bow” is considered proper. Our study, however, barely showed this kind of greeting pattern.

As stated at the beginning of this paper, greetings by Japanese people involve utterances and bows (Tanaka 1989). In our experiment, all subjects greeted with both an utterance and a bow. This spontaneous daily greeting pattern, therefore, consists of both visual (body movement) and auditory (utterance) elements. In particular, our experiment showed that in both the “bow first” and “simultaneous bow and utterance” greeting patterns, bows were always involved, thus showing that it is a very important body movement in the Japanese-style greeting.

4.2 The Characteristics of Spontaneous Bows

Our experiment showed that the length and depth of the responding subjects’ bows were not affected by the interviewer’s greeting pattern. Regardless of the differences in the interviewer’s bow and utterance timing, the responding subjects maintained a certain bow length and depth. Because the bows tended to be longer and deeper after the interview, the greetings appeared more sincere after the interview than before it. These observations suggest that the bow lengths and depths considered appropriate for different occasions are shared as part of the cultural norm.

When the interviewer greeted with the “bow first” or “simultaneous bow and utterance” patterns, the responding subject started the bow approximately 600 ms after the start of the interviewer’s bow. Kobayashi et al. (2013) showed that the most agreeable utterance timing was 600 ms when responding vocally. Nagaoka et al. (2005) also showed that a response latency of 600 ms sounded most natural and agreeable. This latency is the time lapse between the end of one speaker’s utterance and the start of the responder’s utterance. In our experiment, the physical response time was also about 600 ms, but it was measured from the interviewer’s bow start. In other words, in vocal interactions the response utterance starts about 600 ms after the end of the initiator’s greeting, whereas in physical interactions the responding subject started the bow about 600 ms after the start of the interviewer’s bow.

We consider the reasons for these results to be as follows. In the case of vocal interactions, the responder (“responding subject” in our experiment) must listen to and understand the utterance of the initiator (“interviewer” in our experiment) before responding vocally. The end of the initiator’s utterance is thought to prompt a vocal response 600 ms later. In the case of physical interactions, the behavior gives out visual information that can be understood instantly, allowing a response before the initiator finishes her action. Thus, the bow start by the initiator is thought to have prompted a response 600 ms later. However, when the initiator started the utterance before the bow, the response bow started about 800 ms after the start of the utterance, without waiting for the initiator’s bow to start.

These results indicate that the responder’s bow is prompted by the initiator’s bow or utterance. When the initiator bowed at the start of the greeting, the responder also bowed. However, when the initiator spoke at the start of the greeting, the responder returned a bow after a certain time lapse. Therefore, the initiator’s bowing action is considered to strongly affect the responder’s bowing action.

5 Conclusion

This study examined spontaneously generated greeting patterns, quantified how the responding subjects’ greeting patterns changed with the bow-and-utterance greeting patterns of the interviewer, and discussed their characteristics. The results of our experiment showed that the subjects’ greeting patterns did not necessarily match those of the interviewer. They also showed that the formal greeting pattern of “utterance first followed by a bow” was rarely practiced and that the “simultaneous bow and utterance” was the most common greeting pattern. As for the response timing, the response was found to be cued by the start of the interviewer’s greeting. This study is significant in clearly showing the features of one of the most common means of communication, i.e., greeting behavior. By building realistic human models based on human behavior, studies like ours are expected to help develop support tools for better communication.