research-article

My science tutor: A conversational multimedia virtual tutor for elementary school science

Published: 18 August 2011

Abstract

This article describes My Science Tutor (MyST), an intelligent tutoring system designed to improve science learning by students in 3rd, 4th, and 5th grades (7 to 11 years old) through conversational dialogs with a virtual science tutor. In our study, individual students engage in spoken dialogs with the virtual tutor Marni during 15- to 20-minute sessions following classroom science investigations to discuss and extend concepts embedded in the investigations. The spoken dialogs in MyST are designed to scaffold learning by presenting open-ended questions accompanied by illustrations or animations related to the classroom investigations and the science concepts being learned. The focus of the interactions is to elicit self-expression from students. To this end, Marni applies some of the principles of Questioning the Author, a proven approach to classroom conversations, to challenge students to think about and integrate new concepts with prior knowledge to construct enriched mental models that can be used to explain and predict scientific phenomena. In this article, we describe how spoken dialogs using Automatic Speech Recognition (ASR) and natural language processing were developed to stimulate students' thinking, reasoning, and self-explanations. We describe the MyST system architecture and the Wizard of Oz procedure that was used to collect data from tutorial sessions with elementary school students. Using data collected with the procedure, we present evaluations of the ASR and semantic parsing components. A formal evaluation of learning gains resulting from system use is currently being conducted. This article also presents survey results of teachers' and children's impressions of MyST.


    Published In

    ACM Transactions on Speech and Language Processing, Volume 7, Issue 4
    August 2011
    143 pages
    ISSN: 1550-4875
    EISSN: 1550-4883
    DOI: 10.1145/1998384
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 18 August 2011
    Accepted: 01 April 2011
    Revised: 01 October 2010
    Received: 01 June 2010
    Published in TSLP Volume 7, Issue 4

    Author Tags

    1. Semantic parsing
    2. avatar
    3. dialog management
    4. language model

    Qualifiers

    • Research-article
    • Research
    • Refereed
