Cross-validation study of methods and technologies to assess mental models in a complex problem solving situation

https://doi.org/10.1016/j.chb.2011.11.018

Abstract

This paper reports a cross-validation study aimed at identifying reliable and valid assessment methods and technologies for natural language (i.e., written text) responses to complex problem-solving scenarios. To investigate current assessment technologies for text-based responses to problem-solving scenarios (i.e., ALA-Reader and T-MITOCAR), this study compared the two best-developed technologies with an alternative methodology. Comparisons among the three models (benchmark, ALA-Reader, and T-MITOCAR) yielded two findings: (a) the benchmark model created the most descriptive concept maps; and (b) the ALA-Reader model correlated more highly with the benchmark model than did T-MITOCAR. The results imply that the benchmark model is a viable alternative to the two existing technologies and is worth exploring in a larger scale study.

Highlights

► We validated three assessment technologies for natural language responses to complex problems.
► We propose a benchmark method in comparison with current technologies, ALA-Reader and T-MITOCAR.
► The benchmark model created the most descriptive concept maps.
► The ALA-Reader model correlated more highly with the benchmark model than did T-MITOCAR.
► The benchmark model is a viable alternative to the two existing technologies.

Introduction

This study investigated current methods and technologies that yield concept maps, that is, structural knowledge representations consisting of concepts and relations (Clariana, 2010; Narayanan, 2005; Novak & Cañas, 2006; Spector & Koszalka, 2004), as re-representations of a student’s mental models. The study is a cross-validation aimed at identifying which methods best serve as the basis for dynamic formative feedback. It is assumed that using natural language (written text) responses as a basis for concept map representations of student thinking provides a reliable foundation for formative feedback and assessment (Pirnay-Dummer, Ifenthaler, & Spector, 2009).

Problem solving is commonly held to include conceptualizing the problem space, which involves creating a knowledge structure that integrates the ideas and concepts a problem solver associates with the problem situation (Dochy et al., 2003; Jonassen et al., 1993; Newell & Simon, 1972; Segers, 1997). Consequently, assessing problem solving should take this constructed knowledge structure into account (Gijbels, Dochy, Van den Bossche, & Segers, 2005); simple knowledge tests are comparatively weak measures of problem-solving ability (Thomas, 2005).

In order to capture structural knowledge, a number of technologies have been developed, including: DEEP (Dynamic Evaluation of Enhanced Problem-solving; Spector & Koszalka, 2004); SMD (Surface, Matching, and Deep Structure; Ifenthaler, 2008); T-MITOCAR (Text Model Inspection Trace of Concepts and Relations; Pirnay-Dummer et al., 2009); CmapTools (Novak & Cañas, 2006); jMap (Jeong, 2008); ACSMM (Analysis Constructed Shared Mental Model; O’Connor et al., 2004); KU-Mapper (Clariana & Wallace, 2009); ALA-Mapper (Analysis of Lexical Aggregates-Mapper; Clariana et al., 2009; Taricani & Clariana, 2006); ALA-Reader (Analysis of Lexical Aggregates-Reader; Clariana & Wallace, 2009; Clariana et al., 2009); and KNOT (Knowledge Network Orientation Tool; Schvaneveldt, 1990).

Current technologies either require learners to create an annotated concept map with rich descriptions of links and nodes (DEEP) or use text responses as an interim step in generating a concept map (T-MITOCAR and ALA-Reader) that can then be assessed with tools such as SMD or KNOT; a minimal sketch of this text-to-map step is given below. All of these technologies have limitations in terms of suitability, reliability, and validity (Kalyuga, 2006; Seel, 1999; Spector et al., 2006).
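To make the text-to-map step concrete, the following is a minimal sketch of the general idea, not ALA-Reader’s or T-MITOCAR’s actual algorithm: key terms are assumed to be given in advance, and two terms are linked whenever they co-occur in a sentence. The term list and sample response here are hypothetical.

```python
import re
from itertools import combinations

# Hypothetical key-term list; real tools derive terms from an expert
# referent model or from the response text itself.
KEY_TERMS = ["runoff", "contamination", "well", "aquifer", "groundwater"]

def cooccurrence_map(text, terms):
    """Link every pair of key terms that appear in the same sentence,
    returning an undirected adjacency matrix (1 = linked)."""
    index = {t: i for i, t in enumerate(terms)}
    adj = [[0] * len(terms) for _ in terms]
    for sentence in re.split(r"[.!?]+", text.lower()):
        present = [t for t in terms if t in sentence]
        for a, b in combinations(present, 2):
            adj[index[a]][index[b]] = adj[index[b]][index[a]] = 1
    return adj

response = ("Runoff carries contamination into the well. "
            "The well draws on the aquifer, so the groundwater is affected.")
adjacency = cooccurrence_map(response, KEY_TERMS)
```

The resulting adjacency matrix is the raw material that assessment tools such as SMD or KNOT then score against a referent model.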

This paper focuses on methods that use text responses to generate a concept map that can then be assessed, and it explores an alternative approach that attempts to restore rich descriptions of the links between nodes. Prominent methods and technologies are first classified and analyzed in terms of their merits and deficiencies. Next, alternative methods and technologies for analyzing student responses in the form of written text are selected. Finally, a cross-validation among the selected technologies is performed, analyzed, and reported. Based on the results, an alternative approach for automatically constructing and assessing concept maps from open-ended text responses to a problem situation is described.

Section snippets

Mental models as inferred entities

Mental models are cognitive artifacts resulting from perception and linguistic comprehension; they represent certain aspects of external situations (Johnson-Laird, 2005a, 2005b). From this perspective, knowledge appears to be a configuration of holistic mental representations.

Mental model representations consist of propositional representations as structured symbols and images (Johnson-Laird, 2005a; Newell, 1994). A concept map, such as the externally-represented structural component…

State-of-the-art concept map technologies

Concept maps are generally represented visually through network analysis, a set of techniques for portraying patterns of relations among nodes (Coronges et al., 2007; Hutchison, 2003; Wasserman & Faust, 1994). Most of these techniques involve mathematical algorithms derived from graph theory (Rupp et al., 2010a; Schvaneveldt et al., 1989; Wasserman & Faust, 1994). In these techniques, proximity data between and among concepts is defined as “judgments of similarity, relatedness, or…
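As an illustration of the kind of graph-theoretic indices such network analyses produce, the sketch below uses the networkx library to compute two common measures, density and degree centrality, on a toy directed concept map; the concepts and links are invented for the example and are not from the study’s data.

```python
import networkx as nx

# Toy directed concept map: each edge points from a source concept
# to the concept it was linked to in a student's text.
G = nx.DiGraph([("runoff", "contamination"), ("contamination", "well"),
                ("well", "aquifer"), ("aquifer", "groundwater")])

print("density:", nx.density(G))                       # share of possible links present
print("degree centrality:", nx.degree_centrality(G))   # relative connectedness per concept
```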

Method

This study has two aims: (1) to identify the methods that consistently yield the most descriptive and accurate concept maps; and (2) to validate these methods and technologies against a benchmark method. The most complex condition can be expressed as the combination natural language + open-ended + directional adjacency data + pronoun-edited. As the network analysis method, social network analysis (SNA) is considered as an alternative for obtaining a more descriptive concept map and diverse…
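One plausible reading of “directional adjacency data” is that, within a sentence, a link runs from an earlier-mentioned concept to a later-mentioned one rather than being symmetric. The sketch below follows that assumed reading, not the study’s exact procedure, and shows how the undirected co-occurrence approach sketched earlier becomes directional.

```python
import re

def directed_adjacency(text, terms):
    """Directed variant of sentence co-occurrence: within each sentence,
    link each key term to every key term mentioned after it."""
    index = {t: i for i, t in enumerate(terms)}
    adj = [[0] * len(terms) for _ in terms]
    for sentence in re.split(r"[.!?]+", text.lower()):
        # Order terms by first position of mention in the sentence.
        hits = sorted((sentence.find(t), t) for t in terms if t in sentence)
        mentioned = [t for _, t in hits]
        for i, src in enumerate(mentioned):
            for dst in mentioned[i + 1:]:
                adj[index[src]][index[dst]] = 1
    return adj
```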

Determining a benchmark

Two comparisons among the methods for creating benchmark models (noun-only versus pronoun-edited, and directional versus non-directional) were carried out so that we could determine a reliable and economical way to establish a benchmark. The first comparison examined the noun-only and pronoun-edited approaches using the numerical similarity between the two.

As Table 3 summarizes, very high average numerical similarities (s > 0.93) were observed between the noun-only and pronoun-edited data of…
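This excerpt does not give the exact similarity formula; one common choice for comparing two link sets is the proportion of shared links among all links in either model, as in this sketch. Under that reading, s > 0.93 would mean the two models share more than 93% of their combined links.

```python
def link_similarity(adj_a, adj_b):
    """Overlap of two adjacency matrices: shared links divided by the
    union of links (a common index; the exact formula used in the
    study is not shown in this excerpt)."""
    links = lambda adj: {(i, j) for i, row in enumerate(adj)
                         for j, v in enumerate(row) if v}
    a, b = links(adj_a), links(adj_b)
    return len(a & b) / len(a | b) if (a | b) else 1.0
```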

Research findings

This study assumed that an individual student’s understanding can be meaningfully elicited via a natural language approach. Two state-of-the-art technologies, ALA-Reader and T-MITOCAR, were selected because they were consistent with this assumption. To validate these technologies, an alternative method was established as a benchmark.

It was believed that a linguistic knowledge representation should be open-ended in terms of concepts and directional in terms of relations. In…

References (63)

  • Clariana, R. B. (2010). Multi-decision approaches for eliciting knowledge structure. In Computer-based diagnostics and...
  • Clariana, R. B., et al. (2007). A computer-based approach for deriving and measuring individual and team knowledge structure from essay questions. Journal of Educational Computing Research.
  • Clariana, R. B., et al. (2009). A comparison of pair-wise, list-wise, and clustering approaches for eliciting structural knowledge in information systems courses. International Journal of Instructional Media.
  • Clariana, R., et al. (2009). Deriving and measuring group knowledge structure from essays: The effects of anaphoric reference. Educational Technology Research and Development.
  • Collins, A. M., et al. (1975). A spreading-activation theory of semantic processing. Psychological Review.
  • Colman, A. M., Shafir, E., Tversky, Amos (2008). In Koertge, N. (Ed.), New dictionary of scientific biography (pp....
  • Coronges, K. A., et al. (2007). Structural comparison of cognitive associative networks in two populations. Journal of Applied Social Psychology.
  • Garnham, A. (1987). Mental models as representations of discourse and text.
  • Garnham, A. (2001). Mental models and the interpretation of anaphora.
  • Gijbels, D., et al. (2005). Effects of problem-based learning: A meta-analysis from the angle of assessment. Review of Educational Research.
  • Goldsmith, T. E., et al. (1991). Assessing structural knowledge. Journal of Educational Psychology.
  • Goldsmith, T. E., et al. Applications of structural knowledge assessment to training evaluation.
  • Goodman, C. M. (1987). The Delphi technique: A critique. Journal of Advanced Nursing.
  • Greeno, J. G. Situations, mental models and generative knowledge.
  • Hage, P., et al. (1983). Structural models in anthropology.
  • Hsu, C., et al. (2007). The Delphi technique: Making sense of consensus. Practical Assessment, Research & Evaluation.
  • Hutchison, K. A. (2003). Is semantic priming due to association strength or feature overlap? A microanalytic review. Psychonomic Bulletin & Review.
  • Ifenthaler, D. (2008). Relational, structural, and semantic analysis of graphical representations and concept maps. Educational Technology Research and Development.
  • Ifenthaler, D., et al. (2009). The mystery of cognitive structure and how we can detect it: Tracking the development of cognitive structures over time. Instructional Science.
  • Jeong, A. C. (2008)....
  • Johnson-Laird, P. N. The history of mental models.