Why are you so slow? – Misattribution of transmission delay to attributes of the conversation partner at the far-end

https://doi.org/10.1016/j.ijhcs.2014.02.004

Highlights

  • With delays up to 1200 ms, no effect found for the perceived quality in both studies.

  • With high delays, interlocutors are perceived as less attentive, extraverted and conscientious.

  • Conversational courses change, justifying a different attribute perception.

  • Misattribution of the impairment speech delay to attributes of the interlocutor.

Abstract

This study addresses the question of how transmission delay affects user perception during speech communication over telephone systems. It aims to show that the occurrence of pure delay should not be neglected when planning a telephone or conferencing system, even if no impact on the perceived quality of the call can be found. It is known, for instance, that the communication surface structure changes dramatically when transmission delay is inserted by the communication system. Furthermore, studies suggest a change in the perception of the interlocutor at the far-end. This paper describes two experiments that assess the misattribution of the technical impairment delay to personality and behavior-related attributes of the conversation partners. The first experiment shows that interlocutors are perceived as being less attentive when conversing in a three-party setting with symmetrical and asymmetrical delay conditions. In the second experiment, the misattribution is considered in more detail by looking at ascribed personality attributes in two-party interaction under transmission delay. For both experiments, comparing the conversation surface structure of delayed and non-delayed calls helped to explain the observed outcomes.

Introduction

The question of the true effect of transmission delay on the perception of the user has been addressed by researchers for many years. It is surprising that, even with transmission delays of over a second, callers still deem the technical quality of the call “fairly acceptable” (Egger et al., 2010; Guéguin et al., 2008; Hammer, 2006; Kitawaki and Itoh, 1991). Worse still, in most cases people do not even notice the impairment (Brady, 1971). What they usually do notice is confusion in the conversation (Brady, 1971), or it appears to them that the other person is less attentive (Krauss and Bricker, 1966). When delay is combined with echo, the picture changes markedly. People become aware of the delay through the echo of their own voice. In this case, people rate the quality much lower than in the delay-only case (Guéguin et al., 2008).

Even though quality ratings are often not strongly affected, participants report the perceived speed of the conversation as a primary quality factor. Bouch et al. (2000) asked participants what their most valued characteristics were for two scenarios: an important business conversation and an interactive tutorial. For the business scenario, speed was the most important factor, and the smoothness of conversation was second. For the tutorial scenario, answers were more widely distributed across the categories, but smoothness and speed were still rated the two most important factors.

The cause of these mismatching outcomes may be that delay on its own is not perceived solely as a technical impairment. From the viewpoint of a telephone user, there may be a number of reasons for a delayed reaction by the interlocutor at the other end: for example, the other person might be thinking, tired, not listening for a moment, insecure or mentally slow. There are several possibilities, most of which are due to the context or the personal attributes of the other person.

Should we thus ignore delay issues, as long as no echo occurs, when planning a telephone or conferencing system? This study aims to show that we should not. Instead of perceiving a drop in quality, people misattribute the effects of delay to the person at the far-end. This is of relevance to telecommunication system providers who do not want their users to erroneously perceive their communication partners as being, for instance, less friendly. This study presents a quantitative analysis of the problem. It examines what impact delay actually has on the perception of the users when conversing via a telecommunication system with transmission delay.

To assess the impact of delay on perceived transmission quality based on a conversation situation, so-called ‘conversation tests’ are usually conducted. In these ‘tests’ people are given brief instruction sheets with tables and icons to trigger a short communication situation. Different tasks for this purpose can be found in Rec. P.805 (ITU-T, International Telecommunication Union, 2007) for two-party and Rec. P.1301 (ITU-T, International Telecommunication Union, 2012) for three-party interaction, respectively. Particular types of ‘conversation test’, the ‘Short Conversation Test’ (SCT) for two-party interaction and the ‘three-party Conversation Test’ (3CT), aim to elicit close-to-natural conversations that could occur in everyday private or business life. The SCTs generally last 2–3 min and the 3CTs last 5–6 min. The advantage of using those scenarios is a high level of control over the content and course of the conversation, combined with a high similarity to a real-life case and therefore a higher external validity of the results. Apart from the SCTs and 3CTs, other tasks have been proposed to assess the quality based on a conversation situation, such as the verification of random numbers or simply free discussions.

From previous studies on the quality of the telephone connection in conversational settings, some key variables are known to alter the perception of delay. The most important factor seems to be whether delay occurs in combination with another impairment such as acoustic signal reflections ultimately causing echo. As mentioned before, this leads to worse quality ratings for the same transmission delay condition (Guéguin et al., 2008). For 600 ms one-way delay, the perceived quality in a conversational situation was observed to be about one mean opinion score (MOS) point lower with than without additional echo.

If delay is the only impairment, the required interaction speed of the conversation task has been identified as an important factor. For two-party telephone conversations, Kitawaki and Itoh (1991) showed that the more interactive the conversation (for example, a random number verification causes faster interaction than a free conversation), the worse the rated quality for equal delays. Egger et al. (2010) adopted the idea of scenario interactivity and compared the experienced quality of a random number verification task with close-to-natural scenarios. They found differences along the same lines as Kitawaki and Itoh; however, the impact of the conversation scenario and of the delay was not as strong. The different results can be explained by the test design: in Kitawaki and Itoh's study, participants were trained and aware of the impairment, whereas in Egger et al.'s study this was not the case, which led to a different sensitivity of the participants to the impairment. A study by Schoenenberg et al. (accepted for publication) using a design similar to that of Egger et al. showed that the motivation for quick interaction can greatly influence the quality ratings as well. In one of the three test scenarios, participants were highly motivated to interact quickly, as they were offered a prize for being the fastest group on this (“timed”) random number verification task. In addition, participants were recruited such that most of the participating pairs knew each other well. In this case, they rated the quality much lower than under the same delay conditions for the SCT-type conversations. These experiments show the strength of the role of contextual factors of the call, such as the urgency and motivation for fast interaction. They also point to an important factor: the familiarity of the interlocutors. If people know each other's usual reaction speed, they are more likely to identify a transmission delay as a technical degradation and, as a consequence, to include it in their quality judgment.

Outside of the test environment, this aspect of familiarity is strongly interrelated with the amount of knowledge of each other's contexts. When people are not familiar with each other, they usually know less about each other's state and context.

Addressing this, Cramton (2001) summarized five common problems for the usage of computer mediated communication (CMC) systems in “virtual group” work. She based her analysis on a study in which people were accomplishing a project without face-to-face interaction and only communicating via electronic media such as email, chat, internet-based voting tools and telephone.

Three of the five extracted problems are relevant to telecommunication and conferencing services. One was a lack of communication and retention of contextual information. Group members in Cramton's study, for example, often forgot or failed to talk about differences in constraints such as deadlines or evaluation criteria, which caused misunderstandings. Since telecommunication partners are distributed geographically and calls usually focus on the proposed topics, specific circumstances of each location are more likely to be neglected.

The second type of problem Cramton observed was related to differences in the salience of information. Non-verbal communication such as facial expression and body language can add meaning to the exchanged information in face-to-face meetings. However, non-verbal cues are often lacking when communicating over a telecommunication or conferencing service. This is particularly the case for audio-only services. Some non-verbal information can be delivered via the tone of voice but then a new risk for misunderstandings is formed by technical impairments that can distort these cues.

Cramton also describes the problem of misinterpreting the meaning of silence as a major pitfall of CMC. This misinterpretation is considered to be a key problem when examining the effect of transmission delay. A moment of silence in a face-to-face conversation can have several causes, most of which reside in the talking behavior or circumstances of the conversation partner. In the case of mediated communication, it can also have technical reasons. The ambiguity thus created is expected to lead to misunderstandings or misattributions.

In particular, it is proposed that people tend to attribute communication problems to the person (“He/she is not cooperative.”) rather than to the situation (“The system impedes our communication.” or “The connection is too bad at the moment.”). The work of Walther et al. (2002) supports this view. In their studies they show that people tend to misattribute their own failure to adapt to a CMC system to the dispositional attributes of their interaction partners.

For this reason, it is hypothesized that people tend to misattribute problems of telephone or conference conversations caused by transmission delay to their conversational partner if there is no other talking-related impairment such as echo.

At least in two-party telephone conversations, the assignment of negative characteristics may partly be explained by changing conversational surface patterns and an accompanying distorted impression of the other interlocutor's conversational reactions. In Fig. 1 a typical mismatch of the conversational realities that occur for two-party interaction with delay is depicted (for details on the state structure and on possible transitions see Schoenenberg et al., accepted for publication).

As Brady (1971) has pointed out in this context, the higher the transmission delay, the more the other interlocutor is perceived as interrupting and not waiting for responses, when in fact he or she did not interrupt more often and waited for responses just as in non-delayed conditions. Furthermore, he reported more confusion in the conversations with increasing delay, confusion being defined as a sudden stop of one's own speech after an interruption by the other party, combined with a repetition of, or a request concerning, the other's statement. Other works confirm an increasing number of interruptions, most of which are unintended and due to delay (Egger et al., 2012).
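The diverging conversational realities described above can be illustrated with a small timeline sketch. It is a minimal model under stated assumptions, not the analysis method used in the cited studies: the names `TalkSpurt` and `interruption_views`, the symmetric one-way delay, and all example timings are hypothetical, and only the onset of B's speech is checked against A's talk-spurts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TalkSpurt:
    """A stretch of speech, in ms on a shared wall clock (hypothetical model)."""
    start_ms: int
    end_ms: int

    def shifted(self, delay_ms: int) -> "TalkSpurt":
        # The same spurt as it arrives at the far end after a one-way delay.
        return TalkSpurt(self.start_ms + delay_ms, self.end_ms + delay_ms)

    def contains(self, t_ms: int) -> bool:
        return self.start_ms <= t_ms < self.end_ms


def onset_overlaps(spurts, onset_ms):
    """True if a speech onset falls inside any of the given talk-spurts."""
    return any(s.contains(onset_ms) for s in spurts)


def interruption_views(a_spurts, b_onset_ms, one_way_delay_ms):
    """Return (at_b, at_a): does B's onset overlap A's speech at each end?

    at_b: B starts talking while A's (delayed) speech is audible at B's side.
    at_a: B's (delayed) onset arrives while A is talking at A's side.
    Assumes the same one-way delay in both directions.
    """
    a_heard_at_b = [s.shifted(one_way_delay_ms) for s in a_spurts]
    at_b = onset_overlaps(a_heard_at_b, b_onset_ms)
    at_a = onset_overlaps(a_spurts, b_onset_ms + one_way_delay_ms)
    return at_b, at_a


# A talks, pauses, then resumes because no answer has arrived yet.
a_spurts = [TalkSpurt(0, 2000), TalkSpurt(2600, 4000)]

# With 400 ms one-way delay, B hears A's first spurt end at 2400 ms
# and answers 200 ms later, in what B hears as silence.
delayed = interruption_views(a_spurts, b_onset_ms=2600, one_way_delay_ms=400)

# Without delay, the same 200 ms reaction lands in A's pause.
undelayed = interruption_views(a_spurts, b_onset_ms=2200, one_way_delay_ms=0)

print(delayed, undelayed)
```

In the delayed case B starts in silence at B's end, yet the onset reaches A during A's second talk-spurt, so only A experiences an interruption. This is the kind of mismatch that can be misread as impolite or inattentive behavior.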

Apart from unintended interruptions, response times get longer when delay is present. This can erroneously cause the impression of a hesitation and consequently implies to the sender that his statement was not the desired one (Pomerantz, 1984). Furthermore, the sender may weaken his statement which facilitates a rather vague communication situation. Similarly, if back-channeling is not timed correctly it fosters the impression of a lack of understanding or attention by the listener.
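The lengthening of response times follows from simple arithmetic: a question travels to the far end and the answer travels back, so the gap the sender hears grows by both one-way delays. The sketch below is only an illustration; the function name and the 200 ms reaction time are hypothetical values chosen for the example, not measurements from the cited studies.

```python
def perceived_response_gap_ms(actual_gap_ms, delay_a_to_b_ms, delay_b_to_a_ms=None):
    """Gap heard by the sender between the end of their own turn and the reply.

    actual_gap_ms: how long the listener really waited before answering.
    delay_b_to_a_ms defaults to delay_a_to_b_ms (symmetric connection).
    """
    if delay_b_to_a_ms is None:
        delay_b_to_a_ms = delay_a_to_b_ms
    return actual_gap_ms + delay_a_to_b_ms + delay_b_to_a_ms


# Hypothetical listener who always answers after 200 ms.
for one_way in (0, 300, 900):
    gap = perceived_response_gap_ms(200, one_way)
    print(f"one-way delay {one_way} ms: perceived gap {gap} ms")
```

A listener who reacts just as promptly as before therefore appears to hesitate, and the apparent hesitation grows with the round-trip delay.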

Krauss and Bricker (1966) looked at an efficiency-related aspect in this context and found that more sender words were needed to accomplish a picture matching task with 1800 ms round trip in comparison to 600 ms round trip or no delay conditions. This result points towards a more complicated interaction when long delays are present. In line with this, Ruhleder and Jordan (2001) explain different problems that often occur in mediated communication with delay. For instance, so-called repairs (Schegloff et al., 1977) tend to fail because one side does not recognize the action. Repairs are mechanisms which are used in face-to-face interaction to solve communication problems. They are usually marked by certain comments, e.g. the request “Hm?”, or by not taking a turn even though it should have been taken over. Due to transmission delay the markers may be wrongly timed on the other side and therefore be misunderstood.

All of these suggestions support the second aspect that will be evaluated in this study, that is, how conversational structures differ if delay is present in comparison to the no-delay condition. The outcomes will provide an explanation as to why misattribution may have taken place.

In the following, we first formulate the hypotheses (Section 1.1) and then test them in two studies. Experiment I (Section 2) looks at three-party interaction with transmission delay, also evaluating asymmetric delay conditions, in which interaction partners experience different delay times. In experiment II (Section 3), the misattribution effect will be analyzed in more detail by assessing personality attributes in addition to the perceived quality in two-party conversations with delay. Finally, we will draw conclusions (Section 4) as to whether misattribution takes place, and to what extent it manifests itself.

From the above discussion of prior findings, the following hypotheses are deduced:

H1: The higher the transmission delay, the worse the perceived integral quality of the connection.

Note that even though we expect a drop in perceived integral quality with increasing transmission delay, it is known from the above review that, due to the scenario and sample selection (unfamiliar participants) and due to the proposed misattribution (H2), only marginal effects are to be expected.

H2: In the case of transmission delay, attributes related to the other interlocutor(s) are perceived differently compared to the no-delay condition.

H3: Conversations with transmission delay differ in their surface structure from conversations without transmission delay, promoting the misattribution effect.

It should be pointed out that the above hypotheses are intended to address delay-only conditions and close-to-natural conversations in which the required interaction speed is not remarkably high, but which reflect the most common communication situation.

Section snippets

Goal

The first experiment aims to evaluate the effect of transmission delay in conversations of more than two, namely three, interlocutors.

Initially, it seems reasonable to expect that conversations of multiple interlocutors and with transmission delay are difficult to accomplish for the participant of such a call. With two interlocutors, it has already been shown that the conversational realities diverge with increasing transmission delay (Brady, 1971). Obviously, similar divergence happens for the

Goal

Based on the outcomes of experiment I, we were interested in evaluating in more detail to what extent people misattribute certain attributes to the person at the far-end if delay is present. In particular, we asked whether assigned personality attributes differed if the first contact with a person happened under delay as opposed to the case without delay.

An experiment was designed whereby participants talked only once to a particular interlocutor that they were unfamiliar with, choosing from a

Conclusion

There have been numerous studies on the effects of transmission delay on conversations and experienced technical quality of the call over many decades. The biggest issue with delay-only conditions is that the delay is not directly identifiable by participants as a technical degradation. Only in combination with other impairments, such as echo (Guéguin et al., 2008), or in very particular tasks (Kitawaki and Itoh, 1991, Egger et al., 2012) and lab settings, is it found to affect the perceived

References (37)

  • Cattell, R.B., 1943. The description of personality: basic traits resolved into clusters. J. Abnorm. Soc. Psychol.
  • Costa, P.T., McCrae, R.R., 1992. Revised NEO Personality Inventory (NEO PI-R) and NEO Five-Factor Inventory (NEO-FFI)...
  • Cramton, C.D., 2001. The mutual knowledge problem and its consequences for dispersed collaboration. Organ. Sci.
  • Egger, S., Schatz, R., Scherer, S., 2010. It takes two to tango – assessing the impact of delay on conversational...
  • Egger, S., Schatz, R., Schoenenberg, K., Raake, A., Kubin, G., 2012. Same but different? – using speech signal features...
  • Geelhoed, E., Parker, A., Williams, D.J., Groen, M., 2009. Effects of Latency on Telepresence. Technical Report....
  • Goldberg, L.G., 1995. What the hell took so long? Donald Fiske and the Big-Five factor structure. In: Personality...
  • Goldberg, L.R., 1992. The development of markers for the big-five factor structure. Psychol. Assess.

This paper has been recommended for acceptance by E. Motta.