1 Introduction

It is rather fascinating that simply viewing another student engage in an interactive tutoring session can yield (in the observer) approximately two-thirds of the learning gains obtained by the student who actually engaged in the session (d = 1.20) [1]. This observation-based learning method, called vicarious learning (defined as learning through observation without overt behaviors [2, 3]), produces robust learning gains in a number of observational contexts including computer-based instruction and peer-to-peer interactions [2,3,4,5,6,7,8,9]. The benefits of vicarious learning highlight its potential as an educational paradigm, particularly given the scalability and cost-effectiveness of learning vicariously through video compared to, for example, engaging with an intelligent tutoring system (ITS). Despite its potential, there are many questions about which features of a vicarious learning session make it effective. This research gap is especially wide for the assessment of moment-to-moment process variables such as attention—a gap we address in the current paper.

1.1 Theoretical Background and Motivation for Current Study

Vicarious learning activities provide a more interactive alternative to traditional online learning (e.g., MOOCs). Here, we are interested in a particular form of vicarious learning involving observing one-on-one tutoring sessions (e.g., [2, 3, 5]). In this context, it is necessary for the vicarious learner (i.e., the secondhand student) to actively process a dialogue between a tutor and another student (i.e., the firsthand student). This activity requires the student to engage in a number of complex processes, such as the integration of multiple perspectives, as well as the evaluation of the credibility and accuracy of each perspective [6, 8, 10]. This form of active processing contrasts with more passive learning activities, like monologues (e.g., video lectures), which remain a popular method of information delivery in online learning contexts (e.g., MOOCs and online courses).

Dialogue-based vicarious learning activities are effective educational tools that promote active learning, particularly in comparison to similar monologue-based activities [1, 3, 4, 6, 8, 10]. The effectiveness of such vicarious tasks can be explained through the ICAP framework [11] (Interactive > Constructive > Active > Passive), which posits that interactive tasks are the most effective for learning, followed by constructive, active, and passive tasks. Olney et al. [12] recently extended ICAP to highlight the role of attention. According to their ICAP-A framework, students’ attentional processes (i.e., mind-wandering, or off-task thought) should follow the same general ICAP pattern, such that students would be least likely to mind wander during an interactive learning activity, followed by constructive, active, and passive activities. In line with their framework, mind-wandering tends to occur most often during monologues (i.e., video lectures; ~43% of the time [12, 13]) and least often during interactions with a dialogic intelligent tutoring system (~23% of the time [14,15,16]), although these results are correlational.

Notably absent from the literature are studies exploring the frequency and influence of mind-wandering during vicarious learning tasks. Watching a video of a learning session is similar to viewing a video of a lecture; yet, mind-wandering may be less frequent during vicarious learning because of the active processing required by perspective-taking when viewing a dialogue. The current study addresses this gap by asking participants (i.e., secondhand learners) to watch a short video of a prerecorded interactive intelligent tutoring session to assess the frequency of mind-wandering during vicarious learning.

Two decades ago, Cox et al. originally posed the question, “What are good models for the vicarious learner - experts or novices?” [6] (p. 432). Three hypothetical explanations were laid out for the “best” type of firsthand student: (1) experts can model perfect behavior, which may be preferable because the learning session is “uncluttered” or without error; (2) moderately-skilled learners can make the learning session more student-centered since the secondhand learners may better identify with such learners; (3) unskilled firsthand learners may be effective because the secondhand learner would learn what to avoid, and would be motivated to do so after witnessing any negative feedback.

Some prior work may support the effectiveness of viewing an unskilled firsthand student. For example, viewing erroneous examples can help promote more critical evaluation and deeper learning [17, 18]. However, a study by Chi et al. [5] provided tentative evidence in support of the expert firsthand student. In their study, secondhand students who collaboratively observed a one-on-one human tutoring session learned more from “good” firsthand students (five students retroactively labeled “good” based on their pretest scores). However, the authors acknowledged their small sample size (N = 20 secondhand students) and solely focused on learning outcomes. Thus, we also examined whether the expertise of the firsthand student would affect the mind-wandering rates and subsequent learning of the secondhand learner.

1.2 Current Study

We take the first steps toward understanding mind-wandering in the context of vicarious learning from an interactive ITS. We address three research questions: First, what is the overall rate of mind-wandering during vicarious learning from an ITS? This is an important consideration given the large discrepancy in mind-wandering rates expected between monologue-based learning activities – which have the highest rates of mind-wandering [19] – and vicarious learning from dialogues, both of which can be disseminated via short videos online.

Second, does the expertise of the firsthand student influence mind-wandering and learning? We operationally defined expertise as the correctness of the firsthand student’s responses, which we manipulated across three conditions: 100% correct (Correct condition), corresponding to Cox et al.’s expert student; 0% correct (Incorrect condition), corresponding to the unskilled student; and 50% correct (Mixed condition), corresponding to the moderately-skilled student. Exploring the impact of the firsthand student’s expertise can help inform strategies for designing effective vicarious dialogues.

Third, we investigated if any main effects of Firsthand Student Expertise on learning are mediated by mind-wandering (Firsthand Student Expertise → Mind-wandering → Learning).

2 Methods

2.1 GuruTutor Overview

Participants viewed a video of a firsthand student interacting with GuruTutor, an ITS modeled after expert human tutors [20]. GuruTutor is designed to teach biology topics through collaborative conversations in natural language. Throughout the conversation, an animated tutor agent references (using gestures) a multimedia workspace that displays content relevant to the conversation (see Fig. 1). GuruTutor analyzes learners’ typed responses via natural language processing techniques, and the tutor’s responses are tailored to each learner’s conversational turns. For a more detailed description of GuruTutor, see [20,21,22]. Participants viewed the firsthand student interacting with the two sections of GuruTutor that involve collaborative dialogue: (1) Common Ground Building Instruction and (2) Scaffolded Dialogue. The Common Ground Building Instruction section—sometimes called collaborative lecture [23]—is where basic information and terminology are covered. This section is critical because many biology topics involve specialized terminology (e.g., thermoregulation, metabolism) that needs to be introduced before scaffolding can occur. In the Scaffolded Dialogue section, the tutor prompts the learner to answer questions about key concepts using a Prompt → Feedback → Verification Question → Feedback → Elaboration cycle. Importantly, the tutor elaborates the correct answer after every response.
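To make the cycle concrete, the sketch below traces one pass through it. This is illustrative only: the function and message strings are hypothetical and are not GuruTutor's actual implementation (see [20,21,22] for the real dialogue manager).

```r
# A minimal sketch of the Scaffolded Dialogue cycle. All names and
# message strings are hypothetical, not GuruTutor's actual code.
scaffolded_cycle <- function(prompt, verification, elaboration, respond) {
  ans <- respond(prompt)                                  # Prompt
  cat(if (ans$correct) "Right." else "Not quite.", "\n")  # Feedback
  ans <- respond(verification)                            # Verification question
  cat(if (ans$correct) "Good." else "Not quite.", "\n")   # Feedback
  cat(elaboration, "\n")  # Elaboration: the correct answer is always stated
}

# Example run with a stub firsthand student who always answers incorrectly:
stub_student <- function(question) {
  cat("TUTOR:", question, "\n")
  list(correct = FALSE)
}
scaffolded_cycle("How does shivering help maintain body temperature?",
                 "So muscle activity during shivering generates what?",
                 "Shivering generates heat through rapid muscle activity.",
                 stub_student)
```

The final elaboration step is unconditional, which is why incorrect firsthand answers never go uncorrected in any condition.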

Fig. 1. Screenshot of a learning session with GuruTutor.

2.2 Participants and Design

Participants (N = 118) were recruited from Amazon’s Mechanical Turk, a platform for crowdsourcing and online data collection [24,25,26]. Participants had to be at least 18 years of age (M = 35.3 years, SD = 20.1) and their location was limited to the United States. Each participant received $2.75 for completing the study.

Participants were randomly assigned to watch a video of a GuruTutor session recorded in one of three conditions that varied in the frequency of correct responses provided by the firsthand student (here, a simulated student) during the Common Ground Building Instruction and Scaffolded Dialogue phases: 100% correct (Correct condition), 0% correct (Incorrect condition), and 50% correct (Mixed condition). Participants in the Mixed condition were randomly assigned to watch one of two videos that were counterbalanced with respect to which specific questions were answered correctly versus incorrectly: whenever an answer was correct in version A, it was incorrect in version B, and vice versa (see Table 1 for examples). There were no differences in mind-wandering rates (p = .759), pretest scores (p = .935), or posttest scores (p = .338) as a function of counterbalanced version.
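The counterbalance check could be run as follows. This is only a sketch: the paper does not name the specific test, so Welch t-tests are assumed, and the data frame and column names (df, version, mw_rate, pretest, posttest) are hypothetical.

```r
# Sketch of the counterbalance check within the Mixed condition.
# All object and column names are hypothetical; the specific test used
# in the paper is not stated, so Welch t-tests are assumed here.
mixed <- subset(df, condition == "Mixed")
t.test(mw_rate  ~ version, data = mixed)   # reported p = .759
t.test(pretest  ~ version, data = mixed)   # reported p = .935
t.test(posttest ~ version, data = mixed)   # reported p = .338
```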

Table 1. Example dialogue across the three conditions.

2.3 Materials and Procedure

Videos of GuruTutor Session.

All videos were prerecorded with a screen capture program (Camtasia) while a researcher interacted with GuruTutor using a predetermined script for the firsthand student’s responses. The topic pertained to how animals maintain body temperature. Answer length and video length were consistent across conditions: videos averaged approximately 16 min in length, with all videos within 45 s of one another. Each video had the same number (n = 142) of dialogue turns, with the firsthand student’s responses (answers to 21 questions) comprising 18% of the dialogue turns; the remainder were tutor turns.

The order of answer correctness in the Mixed condition was pseudo-randomly determined so that vicarious learners could not detect a pattern. In both the Incorrect and Mixed conditions, incorrect answers were thematically related to the content but incorrect with respect to the specific tutor question. Regardless of whether the firsthand student’s response was correct or incorrect, the tutor provided feedback about answer correctness and repeated the correct answer via elaborated feedback. This served as a guard against false information being retained (see Table 1 for an example of the dialogue across the three conditions).

Thought Probes.

Mind-wandering was measured using a probe-caught method during the video. Participants were presented with the following description, which was adapted from previous studies [6, 21]: “Sometimes when you are watching the video, you may suddenly realize that you are not thinking about what it is that you are watching. We call this “zoning out” or mind wandering about thoughts unrelated to the content of what it is that we are reading. So, we would like you to tell us when you are zoning out. During the presentation of the video, you will hear a “beep” and the video will stop. We would like to know if you are thinking about the video or if you are thinking about something else (e.g., what you will be eating for dinner, your plans for the week). When you hear the tone and you are zoning out, please indicate “Yes” by pressing the “Y” key on your keyboard. If you hear the tone and you are not zoning out, please indicate “No” by pressing the “N” key on your keyboard.”

The instructions also emphasized that participants should be as honest as possible when reporting mind-wandering and that their responses would have no influence on their progress and compensation. There were nine probes per video with probe timings approximately evenly interspersed and set to align with the same events across conditions (e.g., after the tutor completed a specific turn).
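Scoring this probe-caught measure reduces to counting affirmative responses per participant. The sketch below shows one way to do so; the toy data frame and column names are hypothetical.

```r
# Sketch of scoring the probe-caught measure. The toy data frame and
# column names are hypothetical; each of the nine probes is coded
# 1 = "Yes, zoning out" and 0 = "No".
set.seed(1)
df <- data.frame(matrix(rbinom(5 * 9, 1, 0.32), nrow = 5,
                        dimnames = list(NULL, paste0("probe_", 1:9))))
df$mw_count <- rowSums(df[, paste0("probe_", 1:9)])  # MW reports, 0 to 9
df$mw_rate  <- df$mw_count / 9                       # proportion of probes
```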

Learning Measures.

We used 16 four-foil multiple-choice questions to assess learning. The questions were derived from previously administered standardized test items or from researcher-created items (see [27]). The questions targeted specific concepts mentioned during the session, for example: “Which of the following is true about blood temperature? (a) it is cooled as it is pumped near the brain; (b) it is heated as it is pumped near the extremities; (c) it is heated as it is pumped near the core [correct answer]; (d) blood temperature generally stays about the same.” Two parallel versions of the test (8 items each) were created by randomly dividing the questions; the two versions were counterbalanced as pre- and posttest.
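The split into parallel forms can be sketched as follows (item identifiers and the random split are hypothetical; the paper does not report the actual assignment).

```r
# Sketch of constructing the two parallel 8-item forms from the
# 16-item pool (item identifiers are hypothetical).
set.seed(2)
items  <- 1:16
form_a <- sort(sample(items, 8))   # random half of the item pool
form_b <- setdiff(items, form_a)   # remaining half
# Counterbalancing: half of the participants receive form A as the
# pretest and form B as the posttest; the other half the reverse.
```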

Procedure.

After providing electronic consent, participants completed a pretest to gauge prior knowledge. They then received instructions for the thought probes and were informed they would watch a prerecorded video of a student interacting with a computer tutor called GuruTutor. They were instructed that their task was to watch the video in order to understand the concepts being taught and that they would be subsequently assessed on their learning. At this point, the video was presented along with the thought probes. Finally, participants completed the posttest and were debriefed.

3 Results and Discussion

Table 2 presents descriptive statistics for key variables. An analysis of variance (ANOVA) revealed no differences across conditions with respect to prior knowledge, F(2,115) = 2.29, p = .106, consistent with successful random assignment.

Table 2. Means and standard deviations (in parentheses) for key variables across the conditions.
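For concreteness, the prior-knowledge check corresponds to a one-way ANOVA in base R, assuming a scored data frame df with hypothetical pretest and condition columns:

```r
# One-way ANOVA on prior knowledge across conditions (hypothetical df
# with pretest scores and a three-level condition factor).
summary(aov(pretest ~ condition, data = df))  # reported F(2,115) = 2.29, p = .106
```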

3.1 How Often Did Participants Mind-Wander?

We first explored the frequency of mind-wandering during the vicarious learning session. Participants reported mind-wandering 31.7% of the time (SD = 31.9; or 2.9 mind-wandering episodes on average during the session). This finding parallels the rates found in other active online learning activities, such as reading [12, 19, 28].

3.2 Did Firsthand Student Expertise Influence Mind-Wandering in Secondhand Learners (Participants in Current Study)?

Mind-wandering rates were analyzed using a Poisson regression, which is suitable for count data (i.e., the number of probes with affirmative mind-wandering responses). We first assessed the main effect of Firsthand Student Expertise by including it as the only independent variable. A significant omnibus test indicated that model fit improved after including Firsthand Student Expertise in comparison to the intercept-only model, χ2(2) = 6.69, p = .035. Comparisons of parameter estimates revealed that participants in the Mixed condition reported significantly less mind-wandering compared to both the Incorrect (B = .335, SE = .139, Wald χ2(1) = 5.81, p = .016) and Correct conditions (B = .287, SE = .140, Wald χ2(1) = 4.18, p = .041). Rates of mind-wandering across the Correct and Incorrect conditions were on par with one another, p = .704, yielding the following pattern of results (Mixed < [Correct = Incorrect]).

We tested whether the main effect of Firsthand Student Expertise was robust after adding prior knowledge as a covariate. The omnibus test was significant, χ2(3) = 8.95, p = .030. The tests of model effects indicated that pretest was not a significant predictor of mind-wandering, B = −.445, SE = .299, Wald χ2(1) = 2.22, p = .137. The effect of Firsthand Student Expertise was still significant after including the covariate, Wald χ2(2) = 7.32, p = .026, with the same pattern of effects: Participants reported less mind-wandering in the Mixed condition compared to the Correct (p = .040) and Incorrect (p = .009) conditions, which were on par with one another (p = .524).
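A sketch of these models in base R, assuming the hypothetical df introduced above now also carries condition and pretest columns (the pairwise contrasts shown via the emmeans package are one of several ways to obtain them):

```r
# Poisson regressions on mind-wandering counts (hypothetical df with
# mw_count, a three-level condition factor, and pretest scores).
m0 <- glm(mw_count ~ 1,                   family = poisson, data = df)
m1 <- glm(mw_count ~ condition,           family = poisson, data = df)
m2 <- glm(mw_count ~ condition + pretest, family = poisson, data = df)
anova(m0, m1, test = "Chisq")  # omnibus test of Firsthand Student Expertise
summary(m2)                    # estimates with the pretest covariate added
# Pairwise condition contrasts, e.g.:
# emmeans::emmeans(m2, pairwise ~ condition)
```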

3.3 Did Firsthand Student Expertise Influence Learning?

We first assessed whether participants learned from the vicarious learning session using a paired-samples t-test. Pooling across conditions, there was a significant increase from pre- to posttest, t(117) = 9.99, p < .001, d = 1.22, suggesting that vicarious learning was effective in our context. We then tested whether Firsthand Student Expertise predicted posttest scores after controlling for pretest scores in an ANCOVA, but found no main effect of Firsthand Student Expertise, F(2,114) = .434, p = .649.
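Both analyses can be sketched under the same hypothetical df, entering the pretest covariate before condition in the ANCOVA:

```r
# Pre-to-post gain (paired t-test) and ANCOVA on posttest scores
# (hypothetical df with pretest, posttest, and condition columns).
t.test(df$posttest, df$pretest, paired = TRUE)           # gain across conditions
summary(aov(posttest ~ pretest + condition, data = df))  # ANCOVA
```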

3.4 Did Firsthand Student Expertise Influence Learning Through Mind-Wandering?

Although there was no evidence for a main effect, it is possible that Firsthand Student Expertise may influence learning indirectly through mind-wandering (Firsthand Student Expertise → mind-wandering → learning) [29] – particularly given that mind-wandering was negatively related to posttest scores, rho = −.173, p = .061. We tested indirect effects using the ‘mediation’ package in R [30]. We specified two models: (1) a mediator model, which was a Poisson model regressing mind-wandering on Firsthand Student Expertise, controlling for pretest scores; and (2) an outcome variable model, which was a linear model regressing posttest scores on mind-wandering and Firsthand Student Expertise, including the same covariate. We obtained causal estimates for the indirect effect over 10,000 quasi-Bayesian Monte Carlo simulations; however, there was no evidence of mediation, p = .190, 95% CI [−.005, .014].
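The paper names the ‘mediation’ package [30]; the sketch below shows one plausible way to specify the two models, with hypothetical column names. Note that mediate() contrasts a single treatment level against a control level, so one pair of conditions (Mixed vs. Correct) is assumed here for illustration.

```r
# Sketch of the mediation analysis with the 'mediation' package [30].
# Column names are hypothetical; the Mixed-vs-Correct contrast is an
# assumed example, not necessarily the contrast reported in the paper.
library(mediation)
med.fit <- glm(mw_count ~ condition + pretest, family = poisson, data = df)
out.fit <- lm(posttest ~ mw_count + condition + pretest, data = df)
med <- mediate(med.fit, out.fit,
               treat = "condition", mediator = "mw_count",
               control.value = "Mixed", treat.value = "Correct",
               sims = 10000)  # quasi-Bayesian Monte Carlo simulations
summary(med)  # the ACME row gives the indirect effect and its 95% CI
```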

4 General Discussion and Conclusion

Until now, mind-wandering had not been explored in the context of vicarious learning from an ITS—an important context given the effectiveness of vicariously observing dialogues [3, 4] combined with the cost-effectiveness of delivering vicarious learning sessions online. The current study addressed this gap while also examining whether Firsthand Student Expertise influenced mind-wandering and learning, including both direct and indirect effects.

4.1 Main Findings

Participants reported mind-wandering approximately 32% of the time, underscoring its frequency during vicarious learning activities [19]. These rates are considerably lower than those typically observed among students viewing a monologue – e.g., recorded classroom lectures (rates around 40% [13, 31]). At the same time, these rates are slightly higher than those produced by interacting with an ITS (23% reported in [14]), perhaps because ITSs afford a more interactive experience. These general patterns are in line with predictions made by ICAP-A [12] in that participants may be more likely to mind wander in passive contexts compared to active (e.g., vicariously listening to a dialogue) or interactive (e.g., engaging with an ITS) contexts.

We also examined how the expertise of the firsthand student influenced mind-wandering rates. Secondhand learners reported mind-wandering less often when the firsthand student provided a mix of correct and incorrect answers (i.e., Cox et al.’s [6] version of a moderately-skilled student). This pattern is consistent with Cox et al.’s prediction that secondhand learners may identify with a moderately-skilled student and therefore attend more closely to both perspectives in the dialogue. Another plausible explanation is that uncertainty about the firsthand student’s answers held participants’ attention in the Mixed condition, whereas correctness was predictable in the other two conditions. Future work, however, will be needed to determine which of these accounts explains why participants were on task more often in the Mixed condition.

All three conditions performed equally well on the posttest, and Firsthand Student Expertise did not indirectly influence learning through mind-wandering. This may indicate that participants in each condition adopted a different strategy for processing the dialogue – by paying more attention overall (Mixed condition), or perhaps attending only to certain parts of the dialogue (Incorrect/Correct conditions). For example, once participants understood the firsthand student’s level of expertise, they may have guessed which pieces of information required more focused attention. Recent evidence from a sustained attention task suggests that participants indeed develop strategies to alter off-task behaviors based on motivation to perform well on the task [32], but future studies should assess the specific strategies employed in vicarious dialogue-based learning contexts.

4.2 Limitations and Future Directions

It is important to note the limitations of this study. First, this study was conducted online, so we had no control over the participants’ environment. However, this may also be reflective of vicarious learning in ecologically valid online learning scenarios. Further, although the use of Mechanical Turk has been validated as a reliable source of data [24], replication with actual students is warranted. Second, our sample size was limited to 118 participants. It is therefore possible that we did not have adequate power to detect an indirect effect of mind-wandering (see Sect. 3.4). Third, in contrast to prior work on vicarious learning [7], we used experimenter-generated learning sessions instead of authentic learning sessions in order to implement the key manipulation with high internal validity. Future work should therefore attempt to use authentic learning sessions by first having an actual student interact with the ITS and then assigning a second participant to watch the recorded video. This method could also provide a broader range of student expertise than the two extremes used here (100% and 0% accuracy). Fourth, we only explored one topic (maintaining body temperature) in a single ITS; therefore, follow-up studies are needed to determine whether the results generalize more broadly.

Finally, some may object to the intentional use of incorrect responses. We acknowledge this limitation, but we believe it is less of a concern in the current study for the following reasons: (1) all incorrect responses were corrected immediately after the firsthand student’s response; (2) all three conditions performed equally well on the posttest; (3) all protocols were approved by the appropriate ethics board; and (4) secondhand learners were consenting participants rather than actual students.

Our findings can help inform the design of vicarious learning systems that aim to promote engagement and learning. For example, GuruTutor could be strategically modified so that the firsthand student introduces and resolves specific misconceptions [33] or asks deep-reasoning questions [7]—both of which have been shown to be effective for learning. Additional characteristics of the firsthand student can be manipulated, including factors like affective tone, length of responses, or amount of turn-taking in the dialogue. It is also possible to build detectors of mind-wandering (e.g., using eye-gaze [15, 34, 35]) during vicarious learning so that real-time interventions can be deployed to steer participants back on task. Such systems could dynamically adjust the correctness of firsthand student answers depending on mind-wandering, while also ensuring that correct answers are repeated after a mind-wandering episode.

4.3 Conclusion

This study provides a foundation for examining the role of attention in vicarious learning contexts. Although online vicarious learning sessions are a time- and cost-effective learning method [2], mind-wandering still occurs with some regularity (approximately 30% of the time) during vicarious learning. The current study sheds light on how the expertise of the firsthand student can influence mind-wandering. However, more work is needed to explore ways to design and optimize online vicarious learning tasks to promote attention and learning.