Abstract
Conversation is an integral part of human relationships and involves a large amount of tacit information to be uncovered. In this paper, we introduce the idea of conversation envisioning to disclose the tacit information beneath our conversations. We employ virtual reality for graphic recording (VRGR) to allow both observers and participants to visualize their thoughts in the conversation and to provide a training tool for practicing inter-cultural interactions using situated conversations. We focus on a bargaining scenario to highlight the tacitness of our conversations and use VRGR to make an in-depth analysis of the scenario. The proposed framework allows for performing a detailed analysis of the conversation and collecting different interpretations to provide timely assistance for realizing smoother cross-cultural conversations.
1 Introduction
Conversation is the essence of human relationships and goes far beyond the simple transfer of messages or information. Through communication, we build common ground, grow mutual understanding, learn about cultural patterns, develop empathy, and much more. When conversing, verbal and non-verbal signals are combined seamlessly to convey our intentions [1]. A small piece of conversation can include an immense amount of information, not limited to what can be seen on the surface but also lying deep beneath the surface structure. In this view, many tacit dimensions can be found when we converse, leaving a vast area to explore and much to learn for a better understanding of human-human communication and for further improving human-AI communication [2]. Extracting such hidden aspects can also shed light on the dynamic and sophisticated structure of conversation and can be used as a training resource for understanding social interactions and cultural implications.
While we learn math and science at school, we are supposed to learn to converse through osmosis and without specific training, even when communicating with and relating to people from diverse backgrounds. However, acquiring appropriate cultural competencies is deemed critical in today’s culturally and linguistically diverse society, which involves sophisticated forms of conversation [3]. Cross-cultural skills can be developed by engaging in conversations with people of different cultures [4]. In this view, pedagogy needs to consider the immense tacitness of cross-cultural conversations in order to address people’s assumptions about the exchanged messages and their experiences with other cultures. This paper introduces virtual reality graphic recording (VRGR) as a medium to envision tacit dimensions of the conversation, thus providing a training tool for participants to practice cross-cultural interactions, avoid misunderstanding and realize smoother communication (Fig. 1).
1.1 Misunderstanding Due to Cultural Differences
The form of conversation varies across a wide range of inter-cultural contexts. Parties to a conversation who come from different backgrounds often have culturally conditioned assumptions about the messages exchanged, which may cause misunderstandings, misinterpretations or conflicts [5]. Misunderstandings frequently arise during inter-cultural conversations due to the lack of common ground, which makes it difficult to decode the messages correctly. Common ground is the collection of knowledge, beliefs, and suppositions shared by the participants of an interaction [6], and a lack of it can lead to unpleasant or unexpected consequences or lost opportunities. The following conversation is an example of a cross-cultural dialogue that highlights the role of cultural differences and the tacit dimensions of one’s assumptions and interpretations.
In this conversation, the customer, who comes from a feminine culture, tries to socialize and build a relationship with the shopkeeper by sharing a personal story and asking for an individualized suggestion prior to the bargaining or purchasing phase. The customer expects the shopkeeper to acknowledge her marriage (e.g., by saying “congratulations”) and to recommend a very special item to her, since she is asking for an exclusive suggestion (extending her arms toward the shopkeeper). The shopkeeper, on the other hand, focuses mainly on the business, which is normally the case in masculine societies [7]. The customer perceives the shopkeeper’s behavior (“I usually recommend...”) as ignorance and indifference, which induces an unpleasant feeling. Moreover, the shopkeeper’s general suggestion implies to the customer that he is not considering her as a special case, which reinforces her assumption of being ignored.
However, from the shopkeeper’s viewpoint, the conversation is proceeding as a simple question-and-answer exchange, which is normal at a shop. What was meant to be a friendly conversation has become a misinterpretation due to cultural differences. The customer, however, chooses to avoid being direct, a preference rooted in feminine and collectivist societies, and tries to gently reject the suggestion by showing her hesitation.
From a business viewpoint, the focus of such a conversation should be on the price or the deal itself (typical of individualist and masculine societies). Accordingly, in such a case, the customer’s message could be interpreted as an issue with the price. With such an assumption, the shopkeeper suggests cheaper products to show his helpfulness. This suggestion unnerves the customer, making her specify why she is not happy with this and the previous recommendation (“I mean something special...”). She indirectly informs the shopkeeper that she is not happy about being treated as an ordinary customer and does not like the commonly used items. The shopkeeper becomes more confused, wondering why she is not considering the first recommendation, which is the best suggestion he could give to the customer. He shows this by shrugging and raising his eyebrows.
What disappointed the customer is not clear to the shopkeeper, and he cannot see that the customer is unhappy mainly because of being ignored rather than because of the suggested items. This misunderstanding occurs due to the lack of common ground and awareness between the shopkeeper and the customer, and it leaves the customer with no choice but to end the conversation and leave the shop. Even in doing so, the customer tries to cover her disappointment by saying “thank you” before leaving, although it is understood as a form of sarcasm.
1.2 Training Inter-cultural Interactions
The example given above demonstrates the importance of socio-cultural competencies in developing common ground and avoiding misunderstanding. This highlights the role of inter-cultural training and explains the motivation of this study. Cultural competency training has been the focus of many studies that tried to raise awareness and develop inter-cultural skills by designing practical experiences [8, 9]. Technologies such as machine learning techniques have also played a role in extracting surface information and using it to smooth conversation [10]. However, less emphasis has been given to abstract messages, high-level interpretations and common ground formation in cross-cultural interactions. This paper focuses on providing simulated experience in a virtual reality environment accompanied by learners’ collaboration and self-explanation, allowing for collecting explanations and presenting interpretations as a source of timely assistance.
Self-explanation is a knowledge-building activity in which learners make sense of external input by explaining their own thoughts, in an attempt to raise conscious awareness of the mental process [11]. By inferencing and elaborating, learners are supposed to disclose and improve their mental representations. In addition, explanation-to-others, as another principle, allows the learners to convey meaning and causal relations and to learn through collaboration and discussion. This allows us to build a corpus of interpretations and collective utterances derived from participants’ collaborations, which can be further used by AI agents or other participants. The utterances collected from self-explanation may be fragmented or incomplete, whereas the utterances from explanation-to-others are often more consistent and coherent [12]. Both provide us with rich knowledge to augment the conversation. Moreover, such collaboration in a situated environment allows the participants to actively engage in role-plays and experience situated learning, where learning takes place as a result of social co-participation in the situation in which it occurs [13]. Finally, learners can practice inter-cultural interactions while benefiting from observers’ or other participants’ interpretations derived from self-explanations or explanations to others. Empowered by the given interpretations, participants are supposed to train and improve their inter-cultural competencies and converse more smoothly. In this view, envisioning and augmenting the conversation is anticipated to minimize misunderstandings, prevent conflict, reduce risks and lead to building rapport and good relations.
2 Envisioning as High-Level Conversation Analysis
To envision the tacit mental states of the participants in a particular situation, the conversation needs to be augmented to explicitly express the inner dynamics of their thoughts. To facilitate this augmentation, the participants are assumed to “do things that they think will get them what they want” [14]. Theory of mind attributes the reasoning behind actions to a set of belief-desire states. This type of analysis entails a sophisticated interconnected graph of interpretative, predictive and explanatory resources that can be plentiful in the context of shared cultures or groups. More sophisticated interactions, such as inter-cultural interactions, however, have limited shared resources and raise new questions regarding the level of awareness of the parties and their mental strategies manifested in their intentions and actions. Therefore, each action (e.g., utterance or gesture) in a given context can be augmented using the rubric shown in Fig. 2.
In this model, a party perceives the incoming signals of the interaction such as linguistic structures, prosodic information, gestures, eye gaze, etc. and forms a belief about the current state of the interaction. This belief can range from an educated guess to a cultural expectation, a probable supposition from the experience, or an unjustified suspicion. This belief depends on the knowledge about the other party, and whether he/she is aware of the mentioned cultures, norms or rules [14].
On the other hand, basic emotions and physiological states affect the desires. The desire drives one to perform the action, whether it is based on wishes or hopes or arises out of obligation [15]. Temporary and imposed emotions (e.g., anger, fear) may mask the long-term wishes and ideals that are the innate desires of a person, while physiology can shift the desires temporarily (e.g., hunger, pain).
According to Wellman [14], when reasoning about actions, beliefs are especially useful because they are distinctly directed at the world (although they are hidden and implicit) and they stem from the perceptual and evidential experiences of a person. Nevertheless, the desire can be indirectly inferred from verbal and non-verbal cues, e.g., a facial expression signaling some sort of anger, or body motions indicating pain.
Beliefs and desires form one’s intention to perform the action and to achieve the desired goal. In this view, a multitude of possible actions may come to mind, each fulfilling the intention differently. When evaluating different actions, one expects a set of possible reactions, whose likelihood depends on the culture, mood, and many other intrinsic and extrinsic factors. The final action, which takes into account the hidden goals, intentions, anticipations, inner thoughts and the evaluation of different alternatives to fulfill the intention, reveals a lot of information about norms and cultures, one’s experience and one’s personality.
Once the action is executed, based on the previous assessment, a reaction may be fully anticipated, partially expected or totally unexpected, the latter raising surprise. In conversation envisioning, we follow this methodological rubric to clarify the prior steps involved in the decision-making and action-execution process. The envisioning process involves participants, as the executors of the actions, who disclose, explain, and visualize their mental states and assumptions, as well as meta-participants, who augment the conversation from a third-person viewpoint and flag ambiguous or conflicting assumptions. To this end, we leverage virtual reality and graphic recording to perform the envisioning process.
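As an illustration only, the rubric described in this section can be sketched as a small Python data structure. The class and field names are hypothetical and are not taken from the VRGR implementation. The rule used in `assess` (the first listed reaction counts as fully anticipated, any other listed reaction as partially expected) is an assumption made for this sketch.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Surprise(Enum):
    FULLY_ANTICIPATED = "fully anticipated"
    PARTIALLY_EXPECTED = "partially expected"
    UNEXPECTED = "totally unexpected"


@dataclass
class EnvisionedAction:
    """One action (utterance or gesture) augmented with the rubric's fields."""
    actor: str
    action: str
    belief: str       # what the actor assumes about the current state
    desire: str       # wish, hope, or obligation driving the action
    intention: str    # goal formed from belief and desire
    alternatives: List[str] = field(default_factory=list)        # other actions considered
    expected_reactions: List[str] = field(default_factory=list)  # most likely first

    def assess(self, observed_reaction: str) -> Surprise:
        """Classify the observed reaction against the actor's expectations."""
        if observed_reaction not in self.expected_reactions:
            return Surprise.UNEXPECTED
        # Assumption of this sketch: the first entry is the anticipated reaction.
        if observed_reaction == self.expected_reactions[0]:
            return Surprise.FULLY_ANTICIPATED
        return Surprise.PARTIALLY_EXPECTED
```

For the bargaining example of Sect. 1.1, the customer's story about her marriage could be recorded with "congratulates her" as the first expected reaction; the shopkeeper's general recommendation would then be assessed as totally unexpected, which is the kind of mismatch that meta-participants are meant to flag.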
3 Virtual Reality for Conversation Envisioning
In an attempt to elicit obscure aspects of conversations, we introduce conversation envisioning as a methodology that strives to unveil the hidden structures of our conversations by exploiting recent advances in virtual reality (VR) and artificial intelligence. We introduce virtual reality graphic recording (VRGR) as a platform that allows both observers and participants to disclose their thoughts in the conversation. We use such interpretations to augment the conversation and to build a training tool with which learners can practice inter-cultural interactions. VRGR not only allows the investigators to annotate the conversation and fully share the situation from different perspectives (e.g., first- or third-person view), but also endows the participants with a tool to envision their thoughts by themselves, thus learning from each other or from interpretations provided by the observers. Furthermore, VRGR serves as a training platform that provides timely assistance, thereby realizing smoother inter-cultural communication.
Figure 3 shows the building blocks of our proposed VRGR platform. In this framework, the video and audio of the participants are captured using a camera and a microphone, and their motions are alternatively captured by visual or markerless motion capture systems. This input is fed to the system through the human-computer interface (HCI) unit and is used by the character manager unit to reconstruct the participants’ motion and speech.
The character manager is in charge of generating participants’ role-plays from live input by transferring the motion data to the 3D avatar, transforming the input voice, and storing the play in the role-play database. The stored plays can further be used as a source of training data for an autonomous agent that can produce realistic speech and action. This module is also in charge of replaying saved plays or synthesizing avatar behavior for an arbitrary input script.
The main character(s) as well as optional non-player characters (NPCs) populate a virtual world created by the scene manager. The scene manager is in charge of providing context for the scenario, allowing the avatars to interact with the environment, providing different views of the scene such as first-person, third-person and free-form views, enabling traversal through time to facilitate the annotation task, allowing for playing and analyzing alternative role-plays, and enabling the timely appearance of the desired annotations in conjunction with the Annotation Manager.
The Annotation Manager is the main unit for gathering, combining and storing the annotations from different sources. The annotations may come from the participants themselves during a review of the role-play, expressing their first-hand experience, their mental states, emotions, construal, thoughts, expectations, and explanations of why they selected a specific course of action throughout the scenario and what else they could have done or thought of. This process can potentially highlight the root cause of miscommunications, conversation breakdowns, or possible negative feelings caused by different interpretations of the situation, mainly due to differences in norms and cultures. Furthermore, third-party viewers such as stakeholders, experts, people with similar experiences, or even interested ordinary people can annotate the interaction to demystify why a behavioral pattern emerges in specific situations, or why the role-play follows this particular trajectory rather than another one. This unit stores the annotations and retrieves them based on point of view, keywords, or even the credibility of the annotations, which is determined using an internal up-voting mechanism.
To gather a thorough set of annotations, the Graphic Recording (GR) unit facilitates the annotation process for the participants and meta-participants by allowing them to place pre-built annotation templates such as text boxes, arrows, highlight spheres, images, and voice memos. Time controls (play, pause, slow-down, skip forward or backward) are also provided to ease the task. This process can be done either within the virtual world or outside it (using a computer). In VR, an automatic speech recognition system is employed to execute commands and transcribe the annotator’s input while the annotator wears a VR head-mounted display. The GR unit also enables editing and commenting on annotations to group relevant feedback in the same place.
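The storing and replaying of role-plays can be pictured with the following minimal sketch. It is not the platform's actual data model: `Frame`, `RolePlay`, `RolePlayDatabase`, and the joint-name-to-angle pose mapping are hypothetical names chosen for illustration, and an in-memory dictionary stands in for the role-play database.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Frame:
    t: float                 # seconds since the start of the play
    avatar: str              # which avatar this frame drives
    pose: Dict[str, float]   # hypothetical joint-name -> angle mapping
    speech: str = ""         # transformed speech segment, if any


@dataclass
class RolePlay:
    title: str
    frames: List[Frame] = field(default_factory=list)

    def record(self, frame: Frame) -> None:
        """Append one captured frame from the live input."""
        self.frames.append(frame)

    def replay(self, start: float = 0.0) -> Iterator[Frame]:
        """Yield frames from `start` onward in time order."""
        for frame in sorted(self.frames, key=lambda f: f.t):
            if frame.t >= start:
                yield frame


class RolePlayDatabase:
    """In-memory stand-in for the role-play database."""

    def __init__(self) -> None:
        self._plays: Dict[str, RolePlay] = {}

    def save(self, play: RolePlay) -> None:
        self._plays[play.title] = play

    def load(self, title: str) -> RolePlay:
        return self._plays[title]
```

Replaying from an arbitrary `start` time is what would let an annotator jump to the moment of interest in a saved play rather than watching it from the beginning.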
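The retrieval behavior described for this unit can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Annotation` and `AnnotationStore`, the point-of-view labels, and the use of a raw up-vote count as the credibility score are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Annotation:
    author: str
    point_of_view: str    # e.g. "participant" or "observer" (hypothetical labels)
    timestamp: float      # seconds into the saved role-play
    text: str
    keywords: Set[str] = field(default_factory=set)
    upvotes: int = 0      # assumed credibility score


class AnnotationStore:
    def __init__(self) -> None:
        self._items: List[Annotation] = []

    def add(self, annotation: Annotation) -> None:
        self._items.append(annotation)

    def upvote(self, annotation: Annotation) -> None:
        annotation.upvotes += 1

    def query(self, point_of_view: Optional[str] = None,
              keyword: Optional[str] = None,
              min_upvotes: int = 0) -> List[Annotation]:
        """Filter by point of view and keyword, then rank by up-votes."""
        hits = [a for a in self._items
                if (point_of_view is None or a.point_of_view == point_of_view)
                and (keyword is None or keyword in a.keywords)
                and a.upvotes >= min_upvotes]
        # Most up-voted first; sorted() is stable, so ties keep insertion order.
        return sorted(hits, key=lambda a: a.upvotes, reverse=True)
```

A call such as `store.query(point_of_view="observer", keyword="culture")` would return the observers' culture-related annotations with the most up-voted ones first, which is one plausible way to surface the more credible interpretations to a learner.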
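The annotation templates and time controls listed above could be modeled roughly as in the sketch below. The names (`Template`, `PlacedAnnotation`, `Playback`, `visible`) and the idea of giving each annotation a start time and a duration are assumptions made for illustration; the source does not specify how annotations are timed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Template(Enum):
    TEXT_BOX = "text box"
    ARROW = "arrow"
    HIGHLIGHT_SPHERE = "highlight sphere"
    IMAGE = "image"
    VOICE_MEMO = "voice memo"


@dataclass
class PlacedAnnotation:
    template: Template
    start: float      # seconds at which the annotation appears
    duration: float   # seconds it stays visible (assumed field)
    content: str


class Playback:
    """Time controls over a saved role-play of fixed length (seconds)."""

    def __init__(self, length: float) -> None:
        self.length = length
        self.position = 0.0
        self.speed = 0.0   # 0.0 means paused

    def play(self) -> None:
        self.speed = 1.0

    def pause(self) -> None:
        self.speed = 0.0

    def slow_down(self, factor: float = 0.5) -> None:
        self.speed = factor

    def skip(self, seconds: float) -> None:
        # Negative values skip backward; clamp to the recording bounds.
        self.position = min(max(self.position + seconds, 0.0), self.length)

    def tick(self, wall_seconds: float) -> None:
        """Advance the position by elapsed wall-clock time scaled by speed."""
        self.skip(wall_seconds * self.speed)


def visible(annotations: List[PlacedAnnotation], t: float) -> List[PlacedAnnotation]:
    """Return the annotations that should be shown at playback time t."""
    return [a for a in annotations if a.start <= t < a.start + a.duration]
```

Calling `visible(annotations, playback.position)` on each rendered frame would give the timely appearance of annotations during replay, whether the annotator works inside the virtual world or on a desktop computer.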
4 Virtual Reality Graphic Recording for Training Inter-cultural Conversation
VRGR regenerates a played scenario and includes participants/observers in the annotation procedure while providing them with a friendly interface to add information to the interactions, thus making tacit information visible through graphic recording. With such functionality, the participants can explain their thoughts, play aloud to better disclose the underlying information, and revise the play on-the-fly. To highlight the vast amount of tacit information, we focused on a bargaining scenario as an interesting piece of social interaction, which represents many cultural points. We used VRGR as a tool to make an in-depth analysis of the scenario, provide useful interpretations to the participants, analyze the underlying reasoning, and uncover the mental processes that induced the conversational artifacts.
In a target multi-cultural bargaining scenario, two or more participants first play their roles, which are captured by the HCI unit. Meanwhile, the character manager transfers their voices and motions to their corresponding avatars and interacts with the scene manager to update the virtual world based on this play. The scenario is then saved and replayed by the scene manager, while the participants, along with additional third-party meta-participants, discuss their experience and annotate the saved role-play accordingly.
During the discussion, several topics may be attended to, or elaborate annotation may be considered necessary, which provides a rich source for situated analysis. The scenario may be replayed by other meta-participants later to augment the conversation and provide more insights into the situation. Through this iteration, participants and meta-participants learn about inter-cultural differences, practice how to resolve a specific situation or learn how to avoid unpleasant ones.
Using such a tool, learners can make a detailed analysis of the scenario and experience how different interpretations can lead the conversation into different branches. In this view, VRGR permits participants or observers to explicitly express their thought processes and virtual referents, and hence to experience or gain insights into the alternative situations that may result from different interpretations. Moreover, choosing topics such as bargaining enables us to provide opportunities to develop knowledge of other cultures in a game-like platform that motivates the learners. Using VR also allows us to design environments that provide immersion in the target culture [16] and allow for cross-cultural experiential learning [17].
VRGR uses interpretations to augment the conversation and provides learners with a useful training tool, representing the interpretations of their actions and the expected outcomes induced from different perspectives or backgrounds.
5 Experimental Analysis
To evaluate our framework from the participants’ viewpoints, we conducted a preliminary experiment, focusing on cross-cultural experiences gained from interactions using VRGR. The purpose of this experiment was to elicit learners’ feedback on the effectiveness of this platform and to gain insight on improving the framework for future use and experiments. Through this interaction, the participants received interpretations on the cultural aspect of the conversation.
5.1 Participants
The participants of this study were 20 students, the majority of whom were female (13), of different nationalities including Japanese, French, Thai, Korean, Palestinian, Chinese, American, etc. The students came from different majors and included doctoral, master’s and undergraduate students.
5.2 Procedure
We asked the participants to play the role of the customer, in a conversation with the AI-shopkeeper who had a different cultural background. The goal of this conversation was to bargain with the shopkeeper and try to make a deal. The participants were asked to choose their next utterance from the available options. When doing so, the participants could hear their transformed voice and their avatar would perform the appropriate gestures.
The participants interacted with the AI agent twice, each time in a different scenario. In the first round, they did not receive any interpretations along with the conversation, whereas in the second round we provided culture-related interpretations to all the participants. We compared the participants’ choices, monitored their interactions and analyzed the outcome of their bargaining conversation with the agent under both conditions, i.e., before and after receiving GR. After the experiment, we asked the participants to explain whether VRGR was helpful in understanding the situation, learning about cultural differences and leading to a smoother conversation when the two parties have different cultural backgrounds and limited common ground.
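The two conditions can be pictured as the same dialogue turn presented with or without an attached interpretation. The sketch below is hypothetical: `Turn` and `present` are illustrative names, and the `[GR]` text prefix merely stands in for the graphic recording overlay shown in VR.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    agent_utterance: str
    options: List[str]                     # utterances the participant can choose from
    interpretation: Optional[str] = None   # culture-related note shown only with GR


def present(turn: Turn, with_gr: bool) -> str:
    """Render one turn as text, adding the interpretation only in the GR condition."""
    lines = [f"Shopkeeper: {turn.agent_utterance}"]
    if with_gr and turn.interpretation:
        lines.append(f"[GR] {turn.interpretation}")
    lines += [f"  {i}. {opt}" for i, opt in enumerate(turn.options, start=1)]
    return "\n".join(lines)
```

Calling `present(turn, with_gr=False)` corresponds to the first round and `present(turn, with_gr=True)` to the second; the participant's selected option would be logged in both cases so that the choices can be compared across conditions.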
5.3 Experimental Results
Participants’ Interaction Using VRGR. Figure 4 shows how participants’ interaction with the AI shopkeeper differed with and without GR. As the figure suggests, the majority of the participants tended to have a more positive interaction with the agent once they learned the actual intention behind what the agent said, based on the interpretations derived from the analysis of the target cultural background and the collected interpretations presented by VRGR.
Figure 5 shows how VRGR helped the participants make a happy deal with the shopkeeper. The results suggest that VRGR not only helps the participants reach the goal (making a deal) through conversation but also helps raise their awareness of the other culture, thus bringing the participants closer to the agent and inducing a positive atmosphere. It should be noted that even when participants had a bad start in their conversations, later in the interaction they were given another chance to change the conversation’s direction with the help of interpretations to reach a deal, although the negative mood that arose at the beginning of the conversation may persist (see the rightmost bar in Fig. 5).
Participants’ Feedback on VRGR. The following is the participants’ feedback on the use of VRGR, given in response to the open-ended questions.
Learning Cultural Points: The cultural points introduced in the interpretations were interesting and new to almost all of the participants (except one whose cultural background was close to the agent’s):
- P3: “I think GR is very helpful. It helps understanding why the shopkeeper was answering in a certain way.”
- P12: “I think GR can be helpful, especially to those who have different cultures or haven’t experienced such situations.”
- P5: “Without GR, I would not have understood that the shopkeeper was just being polite when offering goods for free, for example, and I would not have asked again for the item price. In situations like this, I think GR is most helpful and I can see that this would be helpful when learning about other social norms and cultures.”
- P19: “The information about cultural background was helpful. If I didn’t get the information about it, I’d have made another choice.”
Building Common Ground: GR was mentioned as a beneficial tool for building and expanding common ground between people, according to the participants:
- P2: “GR was very helpful in bringing the two shopkeeper and customer to a common ground or understanding of what was going on. It made clear what the shopkeeper was intending with each phrase, gesture, or intonation.”
- P14: “I believe it is a very good idea to provide GR to help understand the flow of the conversation.”
Serving the Purpose of Education: Participants particularly noted that the interpretations provided by GR sound promising for educational purposes:
- P7: “I would consider this an effective method to practice social interaction in specific contexts with discrete education goals since the response choices are prescribed and the behavioral analysis and interpretation is discretely defined during the conversation, which are conditions that do not appear in current reality.”
Participants in our preliminary experiment gave positive feedback on the usefulness of VRGR for raising understanding in cross-cultural interactions. Furthermore, the results suggest that VRGR augmented the conversation and helped the participants have smoother cross-cultural communication. The interpretations provided by GR were therefore regarded as effective in developing cross-cultural competence and understanding the differences.
6 Conclusions
We introduced VRGR as a medium for training cross-cultural interactions. Benefiting from VR technology, participants can interact with a virtual agent in a simulated environment. The platform allows participants and meta-participants to revisit the conversation and make an in-depth analysis of the situation. Following a specific rubric, participants and meta-participants use VRGR to specify the beliefs and intentions that induce one’s action and the expectations about a forthcoming reaction. We evaluated the proposed framework by conducting a preliminary experiment with participants from different cultural backgrounds. VRGR received positive participant feedback and was found to be a promising tool for enhancing cultural competencies and facilitating inter-cultural interactions. Future directions include the incorporation of facial expressions into avatars, the extraction of surface information from the conversation for meta-participants, and the design of content-specific training sessions for the participants.
References
1. Tversky, B.: Visualizing thought. Top. Cogn. Sci. 3(3), 499–535 (2011)
2. Nishida, T., Nakazawa, A., Ohmoto, Y., Mohammad, Y.: Conversational Informatics: A Data-Intensive Approach with Emphasis on Nonverbal Communication. Springer, Japan (2014). https://doi.org/10.1007/978-4-431-55040-2
3. Miranda, A.H.: Best practices in increasing cross-cultural competency. Best Pract. Sch. Psychol. Found. 4, 49–60 (2014)
4. Jones, J.: Best practices in providing culturally responsive interventions. Best Pract. Sch. Psychol. Found. 1, 353–362 (2002)
5. Storti, C.: Cross-Cultural Dialogues: 74 Brief Encounters with Cultural Difference. Nicholas Brealey, Boston (2017)
6. Nishida, T.: Human-Harmonized Information Technology: Horizontal Expansion, vol. 2. Springer, Japan (2017). https://doi.org/10.1007/978-4-431-56535-2
7. Hofstede, G.J., Jonker, C.M., Verwaart, T.: An agent model for the influence of culture on bargaining. In: Proceedings of the 1st International Working Conference on Human Factors and Computational Models in Negotiation, pp. 39–46. ACM (2008)
8. Kim, J.M., Hill Jr., R.W., Durlach, P.J., Lane, H.C., Forbell, E., Core, M., Marsella, S., Pynadath, D., Hart, J.: Bilat: a game-based environment for practicing negotiation in a cultural context. Int. J. Artif. Intell. Educ. 19(3), 289–308 (2009)
9. Lane, H.C., Ogan, A.E.: Virtual environments for cultural learning. In: Second Workshop on Culturally-Aware Tutoring Systems in AIED 2009 Workshops Proceedings. Citeseer (2009)
10. Fussell, S.R., Zhang, Q. (eds.): Workshop on Culture and Collaborative Technologies (CHI 2007), Proceedings (2007). http://www.cs.cmu.edu/~fussell/CHI2007/overview.shtml
11. Chi, M.: Self-explaining expository texts: the dual processes of generating inferences and repairing mental models. Adv. Instr. Psychol. 5, 161–238 (2000)
12. Hempel, C.G., et al.: Aspects of Scientific Explanation. Free Press, New York (1965)
13. Lave, J., Wenger, E.: Situated Learning: Legitimate Peripheral Participation. Cambridge University Press, Cambridge (1991)
14. Wellman, H.M.: Making Minds: How Theory of Mind Develops. Oxford University Press, New York (2014)
15. Wellman, H.M.: The Child’s Theory of Mind. The MIT Press, Cambridge (1992)
16. Stanley, G., Mawer, K.: Language learners & computer games: from. TESL-EJ 11(4), n4 (2008)
17. Cushner, K., Brislin, R.W.: Intercultural Interactions: A Practical Guide, vol. 9. Sage Publications, Thousand Oaks (1996)
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Mirzaei, M.S., Zhang, Q., Nishida, T. (2018). Conversation Envisioning to Train Inter-cultural Interactions. In: Meiselwitz, G. (eds) Social Computing and Social Media. Technologies and Analytics. SCSM 2018. Lecture Notes in Computer Science, vol 10914. Springer, Cham. https://doi.org/10.1007/978-3-319-91485-5_5
DOI: https://doi.org/10.1007/978-3-319-91485-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91484-8
Online ISBN: 978-3-319-91485-5
eBook Packages: Computer Science (R0)