research-article

My science tutor: A conversational multimedia virtual tutor for elementary school science

Authors: Daniel Bolaños, Cindy Buchenroth-Martin, Edward Svirsky, Sarel Van Vuuren, Timothy Weston, Lee Becker

ACM Transactions on Speech and Language Processing (TSLP), Volume 7, Issue 4

Article No.: 18, Pages 1 - 29

https://doi.org/10.1145/1998384.1998392

Published: 18 August 2011

Abstract

This article describes My Science Tutor (MyST), an intelligent tutoring system designed to improve science learning by students in 3rd, 4th, and 5th grades (7 to 11 years old) through conversational dialogs with a virtual science tutor. In our study, individual students engage in spoken dialogs with the virtual tutor Marni during 15- to 20-minute sessions that follow classroom science investigations, to discuss and extend concepts embedded in the investigations. The spoken dialogs in MyST are designed to scaffold learning by presenting open-ended questions accompanied by illustrations or animations related to the classroom investigations and the science concepts being learned. The focus of the interactions is to elicit self-expression from students. To this end, Marni applies some of the principles of Questioning the Author, a proven approach to classroom conversations, to challenge students to think about new concepts and integrate them with prior knowledge, constructing enriched mental models that can be used to explain and predict scientific phenomena. In this article, we describe how spoken dialogs using Automatic Speech Recognition (ASR) and natural language processing were developed to stimulate students' thinking, reasoning, and self-explanations. We describe the MyST system architecture and the Wizard of Oz procedure that was used to collect data from tutorial sessions with elementary school students. Using data collected with the procedure, we present evaluations of the ASR and semantic parsing components. A formal evaluation of learning gains resulting from system use is currently being conducted. This article also presents survey results of teachers' and children's impressions of MyST.


Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Speech recognition

Published In

ACM Transactions on Speech and Language Processing Volume 7, Issue 4

August 2011

143 pages

ISSN:1550-4875

EISSN:1550-4883

DOI:10.1145/1998384

Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 August 2011

Accepted: 01 April 2011

Revised: 01 October 2010

Received: 01 June 2010

Qualifiers

Research-article
Research
Refereed

Funding Sources

Division of Research on Learning in Formal and Informal Settings
U.S. Department of Education
Boulder Language Technologies
