1 Introduction

Online video systems have recently been augmented to provide a more socialized online viewing experience [11], which combines the individual and passive act of viewing with the social, active aspect of commenting, thus creating a discourse around the video content. YouTube, for instance, exemplifies a moderately social environment where viewers can share opinions about videos through commenting and rating features. Other online video sites such as ClipSync and Bilibili extend this logic by adding text chat features to the videos, which support active viewing and encourage viewer participation as well as social engagement. Such sites are pioneers in the integration of communication features with online video content, creating the experience of a communal process of viewing videos together [25], even if in reality the social interaction online is not same-time and same-location, but serial and highly fragmented. In addition, the interactive experience allows users to enjoy videos in greater depth and interpret them from more diverse perspectives, since viewers are privy to other users’ impressions, interpretations, and feelings about the video content [12]. This new viewing experience appears to be desirable to users, according to market share information for the corresponding video sites. Bilibili, for instance, is listed among the Top 10 most popular video sites in China [5], despite a late launch (2009), minimal advertising, and a highly competitive environment.

Despite growing scholarly and industry attention to communal interactive video viewing online, little is known about the reasons for its attractiveness, and specifically why and how the associated communication features attract users. Although viewing video content and user comments simultaneously (a hallmark of these sites) might be thought to cause cognitive overload and destroy video aesthetics, many users are enthusiastic about using these communication features. Users explain that viewing videos whose content is enriched by augmented displays of user comments elicits specific affective experiences, such as playfulness [12]. Furthermore, although research suggests that people often exhibit greater similarity in affective reactions toward technological features than in cognitive assessments, the information systems (IS) literature frequently overlooks non-cognitive factors such as affect [21]. As such, a significant research gap still exists in understanding communal interactive viewing experiences online (CIVEO), especially the affective and cognitive experiences of users of video sites that provide communication features. This study thus raises two questions: (1) what are the antecedents of users’ affective and cognitive states in CIVEO; and (2) how do affective and cognitive states influence user satisfaction and intention to continue using sites offering CIVEO?

Drawing on the affective response model (ARM) [31], this research identifies the antecedents of a user’s affective state (i.e., playfulness) and cognitive state (i.e., cognition load). More specifically, we posit that extraversion, display augmentability, and comment relevance influence users’ perceptions of playfulness and cognition load. We also posit that playfulness and cognition load influence satisfaction, which in turn influences users’ intention to continue. User satisfaction has been widely considered a critical factor that determines a user’s intention to continue using an IS [3], and continuance intention has been regarded as a promising concept for explaining continued IS use across various types of IS [30]. Thus, understanding the factors that contribute to the satisfaction and continuance intention of video site users has both theoretical and practical implications. We expect that video website managers can draw on the findings of our study to improve their service quality and enhance their competitive advantage.

The remainder of this paper is organized as follows. First, we briefly review the theoretical background. Second, we present our research model and then develop our hypotheses. Thereafter, we describe the research design, present the results of the analyses, and finally conclude the article with implications and limitations.

2 Theoretical Background and Hypothesis Development

2.1 Affective Response Model

The ARM is a theoretically bound conceptual framework that provides a systematic and holistic map for studies that consider affect [31]. It focuses primarily on the affective aspect rather than on cognitive and behavioral aspects. The ARM indicates how affect is manifested and eventually turned into affective evaluations. In the ARM, the antecedents of a user’s affective state are grouped into two categories: human factors and technology stimuli. Human factors refer to characteristics that reside within a person and are not tied to any specific stimulus. Technology stimuli refer to the objective attributes or properties of the technological features of an IS. The ARM suggests that human factors and technology stimuli trigger specific affective states. Subsequently, the induced affective states may contribute to the formation of affective evaluations. Affective evaluation is a general term that represents user attitudes toward objects and behaviors. As defined in previous studies [29, 31], affective evaluations consist of user attitudes toward objects and attitudes toward behaviors. Accordingly, in this study, we investigate user satisfaction and intention to continue using online video sites in order to capture affective evaluations.

Researchers have employed the ARM as a robust and parsimonious framework for predicting user attitudes in various contexts (e.g., web interfaces) [9]. However, the ARM only focuses on affective factors and ignores cognitive factors, thus potentially limiting a comprehensive understanding of user attitudes toward technological features. In this study, we extend the ARM by adding a cognitive factor (cognition load) and investigate how the involvement of a non-affective factor influences users’ affective evaluations. To this end, the study attempts to provide insights into design strategies for a video site seeking to facilitate desirable affective and cognitive states.

3 Research Model and Hypotheses

By applying the ARM to CIVEO, we identify a personal trait, extraversion, as a human factor, and display augmentability and comment relevance as technology stimuli. We also identify playfulness and cognition load as the induced affective and cognitive states, respectively. Furthermore, user satisfaction and intention to continue are examined as affective evaluations. The research model is depicted in Fig. 1. The extended ARM contributes (1) a parsimonious theoretical justification for investigating extraversion, display augmentability, and comment relevance as antecedents and (2) an examination of the role of playfulness and cognition load in predicting users’ satisfaction and intention to continue using online video sites.

Fig. 1. Research model

Antecedents of Induced Affective/Cognitive States.

Extraversion refers to the personality of an individual who is outgoing and personable [26]. In general, extraverts are less timid and hesitant in communicating with others, and they tend to initiate more conversations and talk more than others. Extraverts enjoy interacting with others in a group or a collective of people. In this study, playfulness refers to the perceived hedonic value of an online video site, as amplified by associated fun, excitement, creativity and pleasure [4]. Cognition load is defined as the load imposed on memory by the information being presented [24]. Cognitive overload can occur when the degree of mental effort required exceeds processing capabilities. However, since extraverts actively pursue social connections with others, they are capable of interacting with others despite having to process a lot of information during their communication. Thus, we propose the following hypotheses:

  • H1a. Extraversion positively affects users’ playfulness when using online video sites.

  • H1b. Extraversion negatively affects users’ cognition load when using online video sites.

Display augmentability is defined as the extent to which the communication features of a video website support the augmented display of video, audio, and text. Online video sites with high display augmentability allow users to communicate with multiple people, which is desirable for higher playfulness [32]. However, previous studies suggest that if too many individuals attempt to communicate simultaneously, they are more likely to be distracted and less attentive to the focal communication partners [15]. Moreover, greater effort is needed to conduct simultaneous interactions. Thus, we propose the following:

  • H2a. Display augmentability positively affects users’ playfulness when using online video sites.

  • H2b. Display augmentability positively affects users’ cognition load when using online video sites.

Comment relevance is defined as the extent to which the communication features of a video site support the adjustment of comments on videos, thereby making them relevant to the video content. When the comments on videos are relevant, useful, important, meaningful, or helpful to audiences, reading comments will result in enhanced pleasure [33]. Since users have limited ability to process novel information, reading comments that seem to have no relevance to the videos requires extra mental effort [19]. In contrast, the better the fit between comments and videos, the less effort is needed to process the information, thus mitigating the required mental effort. Therefore, we hypothesize the following:

  • H3a. Comment relevance positively affects users’ playfulness when using online video sites.

  • H3b. Comment relevance negatively affects users’ cognition load when using online video sites.

Induced Affective/Cognitive States and Evaluation.

Playfulness is considered an intrinsic motivation associated with using any IS [28]. Individuals in a state of playfulness are involved in an activity for intrinsic benefits, such as pleasure and enjoyment, rather than extrinsic rewards [27]. Such an experience may result in a better evaluation of the technology use. For example, research has demonstrated that playfulness or enjoyment is positively related to user satisfaction in the post-adoption of social network services [15, 17]. In contrast, the literature also indicates that the greater the cognitive burden, the lower users’ satisfaction with learning [24]. We thus propose the following:

  • H4. Playfulness positively affects users’ satisfaction with online video sites.

  • H5. Cognition load negatively affects users’ satisfaction with online video sites.

Satisfaction is defined as an individual’s evaluation of and affective response to his or her overall experience with a service or product [20, p. 29]. For decades, user satisfaction has been demonstrated to be one of the vital constructs predicting behavioral intentions [3, 19]. For example, many studies have provided evidence that user satisfaction with ISs, such as web-based learning systems [8] and online brokerage services [2], is associated with continuance intention. Thus, we propose the following:

  • H6. Users’ satisfaction positively affects their intention to continue using online video sites.

4 Method

CIVEO is spreading rapidly among Chinese video sites. In particular, the introduction of Danmaku technology to video sites has provided new opportunities for richer communal viewing experiences [16]. Danmaku technology, a communication feature that overlays user comments directly on the video screen so that they are displayed together with the video [12], enables users to simultaneously view and add comments to videos. The technology synchronizes the comments to the video playback time and displays them to current and future viewers as a stream of moving subtitles overlaid on the video screen [5]. By viewing time-synchronous comments (semantic information) together with videos, users feel as if they experience the video as part of an interactive community of (anonymous) viewers, across differences in time and space [25]. Figure 2 presents a screenshot of a video with Danmaku comments enabled on Bilibili.com.

Fig. 2. A screenshot of a video on Bilibili.com (source: http://www.bilibili.com/video/av715040/)

Data was collected through an online survey from users of Chinese video sites that had implemented a Danmaku system (including IQIYI, LETV, Tudou, Bilibili, Acfun, Sohu, and Tencent Video). Users who had experience of activating the Danmaku system on their video site were considered for the survey. To ensure the validity of the instruments, we adapted all items from previous research. Extraversion measurement items were adapted from Balaji and Chakrabarti [1]. Measurements for display augmentability drew on Tang et al. [26], and measurements for comment relevance were adapted from Zimmer et al. [33]. Items for playfulness were adapted from Celik [4], while items for cognition load were derived from Hart and Staveland [14]. Furthermore, measurements of satisfaction and intention to continue were drawn from Bhattacherjee [3]. All items were measured using seven-point Likert scales ranging from strongly disagree (1) to strongly agree (7). A web-based survey solution provider (http://www.sojump.com/) was used to distribute the questionnaire. A total of 284 users with experience of online communal viewing with the Danmaku function enabled were identified and selected as subjects to complete the questionnaire. The respondents consisted of a range of users, some of whom liked the Danmaku function and others who disliked it. Of the 284 questionnaires, 212 valid responses were received, representing a response rate of 74.6 %. The demographics of the research sample are displayed in Table 1.

Table 1. Subject demographics

5 Data Analysis and Results

The data analysis was conducted in two stages. In stage one, the appropriateness of the measurement model, including reliability, validity, and common method bias, was examined. In stage two, the structural model was assessed and the hypotheses were tested [7]. Data was analyzed using SmartPLS 2.0 [23].

5.1 Measurement Model

Reliability was assessed by examining Cronbach’s alpha, composite reliability (CR), and average variance extracted (AVE) [13]. The threshold values used to evaluate these three indices were .70, .70, and .50, respectively [6]. As shown in Table 2, all item loadings were significant (p < .001) and ranged from 0.64 to 0.93, indicating adequate convergent validity [10].
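As an illustration of how the three reliability indices named above are commonly computed, the following minimal Python sketch applies the standard formulas for Cronbach's alpha, CR, and AVE and compares the results with the thresholds of .70, .70, and .50. The loadings and item scores are hypothetical values chosen for illustration and are not the study's data; the function names are likewise assumptions, and the study itself relied on SmartPLS 2.0 rather than on this code.

```python
from math import fsum
from statistics import variance

# Thresholds reported in the text for alpha, CR, and AVE.
ALPHA_MIN, CR_MIN, AVE_MIN = 0.70, 0.70, 0.50


def cronbach_alpha(rows):
    """Cronbach's alpha from raw item scores (one row per respondent)."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - fsum(item_vars) / total_var)


def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = fsum(loadings)
    error = fsum(1.0 - l * l for l in loadings)
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return fsum(l * l for l in loadings) / len(loadings)


# Hypothetical standardized loadings for one three-item construct.
example_loadings = [0.64, 0.80, 0.93]
# Hypothetical seven-point Likert responses (5 respondents x 3 items).
example_scores = [(5, 6, 5), (3, 3, 4), (7, 6, 7), (4, 5, 4), (6, 6, 6)]

alpha = cronbach_alpha(example_scores)
cr = composite_reliability(example_loadings)
ave = average_variance_extracted(example_loadings)

print(f"alpha={alpha:.3f} ok={alpha >= ALPHA_MIN}")
print(f"CR={cr:.3f} ok={cr >= CR_MIN}")
print(f"AVE={ave:.3f} ok={ave >= AVE_MIN}")
```

CR and AVE need only the standardized loadings of a construct, whereas Cronbach's alpha requires the raw item responses, which is why the sketch takes two different inputs.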

Table 2. Item means and loadings

Discriminant validity of the constructs can be verified by confirming that the square root of the AVE of each construct is higher than its correlations with the other constructs [10]. The results in Table 3 show that the square roots of the AVE of all the constructs were higher than all the inter-construct correlations, suggesting good discriminant validity. Subsequently, following Podsakoff and Organ [22], we tested for common method bias (CMB), which can produce artifactual covariance between variables. Harman’s one-factor analysis revealed that no single factor emerged and that no one factor accounted for the majority of the covariance in the independent and criterion variables, indicating that CMB did not pose a major threat to this study [18].
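The discriminant validity criterion described above can be sketched as a simple pairwise check. The AVE values and correlations below are hypothetical and are not taken from Table 3; the function name and the construct keys are assumptions made for illustration only.

```python
from math import sqrt


def discriminant_validity_violations(ave, correlations):
    """Return construct pairs whose correlation is not below sqrt(AVE).

    ave: dict mapping construct name -> AVE.
    correlations: dict mapping (construct_a, construct_b) -> correlation.
    """
    violations = []
    for (a, b), r in correlations.items():
        # The criterion must hold for both constructs in the pair,
        # so compare against the smaller of the two square roots.
        threshold = min(sqrt(ave[a]), sqrt(ave[b]))
        if abs(r) >= threshold:
            violations.append((a, b))
    return violations


# Hypothetical AVE values and inter-construct correlations.
example_ave = {"playfulness": 0.72, "cognition_load": 0.61, "satisfaction": 0.78}
example_corr = {
    ("playfulness", "cognition_load"): -0.41,
    ("playfulness", "satisfaction"): 0.66,
    ("cognition_load", "satisfaction"): -0.38,
}

violations = discriminant_validity_violations(example_ave, example_corr)
print(violations)
```

An empty list indicates that every construct shares more variance with its own items than with any other construct, which is the pattern the text reports for Table 3.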

Table 3. Discriminant validity

5.2 Structural Model

The results of the structural model test are summarized in Fig. 3. As hypothesized, extraversion, display augmentability and comment relevance were positively associated with playfulness. Moreover, extraversion, display augmentability, and comment relevance were negatively associated with cognition load.

Fig. 3. Structural model

The proportions of variance explained were 48.1 % for playfulness and 30.6 % for cognition load. H1a, H1b, H2a, H3a, and H3b were supported at significance levels of p < 0.05 or better, whereas H2b was not supported, because display augmentability was negatively rather than positively associated with cognition load. In addition, playfulness had a positive effect on satisfaction, while cognition load negatively influenced satisfaction. Playfulness and cognition load jointly explained 56.5 % of the variance in satisfaction. H4 and H5 were supported at p < 0.001. Finally, satisfaction was positively associated with intention to continue, accounting for 54.1 % of the variance in intention to continue. H6 was supported at p < 0.001.
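The comparison between hypothesized and observed path directions can be made explicit with a small sketch. It uses the hypothesis labels from Section 3 and only the signs described in the text of this section (the path coefficients themselves appear in Fig. 3 and are not reproduced here), and it assumes that every reported association is statistically significant. The variable and function names are illustrative.

```python
# Hypothesized direction of each path, labelled as in Section 3.
HYPOTHESES = {
    "H1a": ("extraversion", "playfulness", +1),
    "H1b": ("extraversion", "cognition_load", -1),
    "H2a": ("display_augmentability", "playfulness", +1),
    "H2b": ("display_augmentability", "cognition_load", +1),
    "H3a": ("comment_relevance", "playfulness", +1),
    "H3b": ("comment_relevance", "cognition_load", -1),
    "H4": ("playfulness", "satisfaction", +1),
    "H5": ("cognition_load", "satisfaction", -1),
    "H6": ("satisfaction", "intention_to_continue", +1),
}

# Direction of each association as described in the results text
# (signs only; these are not path coefficients).
OBSERVED_SIGNS = {
    ("extraversion", "playfulness"): +1,
    ("display_augmentability", "playfulness"): +1,
    ("comment_relevance", "playfulness"): +1,
    ("extraversion", "cognition_load"): -1,
    ("display_augmentability", "cognition_load"): -1,
    ("comment_relevance", "cognition_load"): -1,
    ("playfulness", "satisfaction"): +1,
    ("cognition_load", "satisfaction"): -1,
    ("satisfaction", "intention_to_continue"): +1,
}


def unsupported_hypotheses(hypotheses, observed):
    """Return labels whose observed direction differs from the hypothesized one."""
    return sorted(
        label
        for label, (source, target, sign) in hypotheses.items()
        if observed[(source, target)] != sign
    )


unsupported = unsupported_hypotheses(HYPOTHESES, OBSERVED_SIGNS)
print(unsupported)
```

Under these assumptions the only mismatch is the path from display augmentability to cognition load, which was hypothesized to be positive but is reported as negative.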

6 Discussion

The purpose of this study was to examine the antecedents of users’ affective and cognitive states, and the impacts of these states on user satisfaction and intention to continue, on video websites that provide communication features during video viewing. The results showed that extraversion, display augmentability and comment relevance enhanced playfulness, indicating that the communication features, which allow users to interact with others, provide an enjoyable experience. This study also revealed that extraversion and comment relevance helped to reduce users’ cognition load. Surprisingly, the results implied that display augmentability negatively influenced cognition load. This is a novel finding, which implies that when highly relevant additional information is provided in a way users prefer, they are able to process more information at the same time without feeling interrupted or mentally overwhelmed. The results also indicated that playfulness had a positive impact on user satisfaction, while cognition load had a negative influence on user satisfaction. Furthermore, satisfaction was positively associated with intention to continue, which determines the success and sustainability of video websites [32].

Drawing on the ARM, this paper was intended to take a first step in empirically investigating CIVEO. The main contributions of our paper are summarized as follows. First, it contributes to the ARM by incorporating a cognitive factor (cognition load) and empirically examining the parallel role of cognition load with playfulness in determining user satisfaction in the context of video sites. Second, it provides empirical evidence of the role of a human factor (extraversion) and technology stimuli (display augmentability, comment relevance) in predicting affective and cognitive states. Third, it contributes to extending the research on the sustainability of video sites by examining the antecedents and consequences of users’ affective and cognitive states.

A robust understanding of affect and cognition has practical implications for the design, acceptance, use, and management of communication features in online video sites. The work indicates that in order to develop innovative communication features and to improve video site performance, practitioners should focus on improving display augmentability and regulating additional information to make it more relevant to video content.

Our research has several limitations, which suggest opportunities for future studies. First, the results might be limited by the sampling, as the data was collected only from Chinese Danmaku video websites. Although Danmaku technology is popular in East Asia, it does not seem to have been adopted by video sites serving other cultures. Second, antecedents other than extraversion, display augmentability and comment relevance may contribute to playfulness and cognition load. Third, our results might also have suffered from inadequate consideration of third-variable effects. Although the survey respondents were users with different levels of experience of using Danmaku video sites and were thus appropriate for the present study, the results would be more generalizable if some control variables, especially personal characteristics of the subjects, were incorporated. We expect to overcome the above shortcomings in a future study.