
Computers in Human Behavior

Volume 72, July 2017, Pages 678-691

Full length article
Which cognitive abilities underlie computational thinking? Criterion validity of the Computational Thinking Test

https://doi.org/10.1016/j.chb.2016.08.047

Highlights

  • A Computational Thinking Test (CTt) aimed at 5th to 10th grade students is provided

  • Computational thinking correlates moderately with spatial and reasoning abilities

  • Computational thinking correlates strongly with problem-solving ability

  • Results are consistent with proposals linking CT with the CHC model of intelligence

  • Gender differences in computational thinking performance are discussed

Abstract

Computational thinking (CT) has moved to the focus of educational innovation, as a set of problem-solving skills that new generations of students must acquire to thrive in a digital world full of objects driven by software. However, there is still no consensus on a definition of CT or on how to measure it. In response, we attempt to address both issues from a psychometric approach. On the one hand, a Computational Thinking Test (CTt) was administered to a sample of 1,251 Spanish students from 5th to 10th grade, and its descriptive statistics and reliability are reported in this paper. On the other hand, the criterion validity of the CTt is studied with respect to other standardized psychological tests: the Primary Mental Abilities (PMA) battery and the RP30 problem-solving test. We thus intend to provide a new instrument for measuring CT and, additionally, to give evidence about the nature of CT through its associations with key related psychological constructs. Results show statistically significant, at least moderately intense, correlations between CT and spatial ability (r = 0.44), reasoning ability (r = 0.44), and problem-solving ability (r = 0.67). These results are consistent with recent theoretical proposals linking CT to some components of the Cattell-Horn-Carroll (CHC) model of intelligence, and they corroborate the conceptualization of CT as a problem-solving ability.

Introduction

We live immersed in a digital ecosystem full of objects driven by software (Manovich, 2013). In this context, being able to handle the language of computers is emerging as an inescapable skill, a new literacy, which allows us to participate fully and effectively in the digital reality that surrounds us: it is a matter of ‘program or be programmed’ (Rushkoff, 2010), of being ‘app-enabled or app-dependent’ (Gardner & Davis, 2013). The term ‘code-literacy’ has recently been coined to refer to the process of teaching and learning to read and write with computer programming languages (Prensky, 2008, Rushkoff, 2012). Thus, a person is considered code-literate when he or she is able to read and write in the language of computers and other machines, and to think computationally (Román-González, 2014). If code-literacy ultimately refers to a new read-write practice, computational thinking (CT) refers to the underlying problem-solving cognitive process that enables it. In other words, computer programming is the fundamental way of bringing CT to life (Lye & Koh, 2014), although CT can be transferred to various types of problems that do not directly involve programming tasks (Wing, 2008).

Given this current reality overrun by the digital, it is not surprising that there is renewed interest in many countries in introducing CT as a set of problem-solving skills to be acquired by new generations of students; moreover, CT is increasingly viewed as being at the core of all the STEM (Science, Technology, Engineering, & Mathematics) disciplines (Henderson et al., 2007, Weintrop et al., 2016). Although learning to think computationally has long been recognized as important and positive for the cognitive development of students (Liao and Bright, 1991, Mayer, 1988, Papert, 1980), as computation has become pervasive, underpinning communication, science, culture and business in our society (Howland & Good, 2015), CT is increasingly seen as an essential skill to create, rather than just consume, technology (Resnick et al., 2009). Thus, many governments around the world are incorporating computer programming into their national educational curricula. The recent decision to introduce computer science teaching from primary school onwards in the UK (Brown et al., 2013) and other European countries (European Schoolnet, 2015) reflects the growing recognition of the importance of CT.

However, there is still little consensus on a formal definition of CT (Gouws et al., 2013, Kalelioğlu et al., 2016), and there are disagreements over how it should be integrated into educational curricula (Lye & Koh, 2014). Similarly, there is a worrying vacuum regarding how to measure and assess CT, a gap that must be addressed. Without attention to assessment, CT can have little hope of making its way successfully into any curriculum. Furthermore, in order to judge the effectiveness of any curriculum incorporating CT, measures that would enable educators to assess what the student has learned need to be validated (Grover & Pea, 2013).

In response, we attempt to address these issues from a psychometric approach. On the one hand, we report how our Computational Thinking Test (CTt) was designed and developed, as well as its descriptive statistics and reliability derived from its administration to a sample of more than a thousand Spanish students. On the other hand, the criterion validity (Cronbach & Meehl, 1955) of the CTt is studied with respect to already standardized psychological tests of core cognitive abilities. Thus, this paper aims to provide a new instrument for measuring CT and, additionally, to give evidence of the correlations between CT and other well-established psychological constructs in the study of cognitive abilities.

Regarding the definition of CT, we can distinguish between: a) generic definitions; b) operational definitions; and c) educational and curricular definitions.

One decade ago, in 2006, Jeannette Wing's foundational paper stated that CT “involves solving problems, designing systems, and understanding human behavior, by drawing on the concepts fundamental to computer science” (Wing, 2006, p. 33). Thus, the essence of CT is thinking like a computer scientist when confronted with a problem. However, this first generic definition has been revisited and specified in successive attempts over the last few years, still without reaching an agreement (Grover and Pea, 2013, Kalelioğlu et al., 2016). So, in 2011 Wing clarified that CT “is the thought processes involved in formulating problems and their solutions so that the solutions are represented in a form that can be effectively carried out by an information-processing agent” (Wing, 2011; on-line). One year later, this definition was simplified by Aho, who conceptualized CT as the thought processes involved in formulating problems so that “their solutions can be represented as computational steps and algorithms” (Aho, 2012, p. 832).

In 2011, the Computer Science Teachers Association (CSTA) and the International Society for Technology in Education (ISTE) developed an operational definition of computational thinking that provides a framework and common vocabulary for Computer Science K-12 educators: CT is a “problem-solving process that includes (but is not limited to) the following characteristics: formulating problems in a way that enables us to use a computer and other tools to help solve them; logically organizing and analyzing data; representing data through abstractions such as models and simulations; automating solutions through algorithmic thinking (a series of ordered steps); identifying, analyzing, and implementing possible solutions with the goal of achieving the most efficient and effective combination of steps and resources; generalizing and transferring this problem solving process to a wide variety of problems” (CSTA & ISTE, 2011; on-line).

Rather than definitions in the strict sense, frameworks for developing CT in the classroom and other educational settings are mentioned next. From the UK, the organization Computing At School (CAS) states that CT involves six different concepts (logic, algorithms, decomposition, patterns, abstraction, and evaluation), and five approaches to working (tinkering, creating, debugging, persevering, and collaborating) in the classroom (CAS Barefoot, 2014). Moreover, from the United States, Brennan and Resnick (2012) describe a CT framework that involves three key dimensions: ‘computational concepts’ (sequences, loops, events, parallelism, conditionals, operators, and data); ‘computational practices’ (experimenting and iterating, testing and debugging, reusing and remixing, abstracting and modularizing); and ‘computational perspectives’ (expressing, connecting, and questioning). Table 1 shows a crosstab intersecting the CT framework dimensions (Brennan & Resnick, 2012) with the sampling domain of our Computational Thinking Test (CTt), which will be detailed in Sub-section 1.4.

While CT involves thinking skills to solve problems algorithmically (e.g., Brennan and Resnick, 2012, Grover and Pea, 2013), intelligence (i.e., general mental ability or general cognitive ability) primarily involves the ability to reason, plan and solve problems (Gottfredson, 1997). Even authors with alternative approaches to the conceptualization of intelligence recognize intelligence as a “computational capacity” or “the ability to process certain kinds of information in the process of solving problems or fashioning products” (Gardner, 2006, p. 503).

Within a cognitive approach, it has recently been suggested (Ambrosio, Xavier, & Georges, 2014) that computational thinking is related to the following three ability factors from the Cattell-Horn-Carroll (CHC) model of intelligence (McGrew, 2009, Schneider and McGrew, 2012):

  • Fluid reasoning (Gf), defined as: “the use of deliberate and controlled mental operations to solve novel problems that cannot be performed automatically. Mental operations often include drawing inferences, concept formation, classification, generating and testing hypotheses, identifying relations, comprehending implications, problem solving, extrapolating, and transforming information. Inductive and deductive reasoning are generally considered the hallmark indicators of Gf” (McGrew, 2009, p. 5).

  • Visual processing (Gv), defined as “the ability to generate, store, retrieve, and transform visual images and sensations. Gv abilities are typically measured by tasks (figural or geometric stimuli) that require the perception and transformation of visual shapes, forms, or images and/or tasks that require maintaining spatial orientation with regard to objects that may change or move through space” (McGrew, 2009, p. 5).

  • Short-term memory (Gsm), defined as “the ability to apprehend and maintain awareness of a limited number of elements of information in the immediate situation (events that occurred in the last minute or so). A limited-capacity system that loses information quickly through the decay of memory traces, unless an individual activates other cognitive resources to maintain the information in immediate awareness” (McGrew, 2009, p. 5).

Therefore, it is expected that a computational thinking test should correlate with other already validated tests aimed at measuring the cognitive abilities cited above.
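To make this expectation concrete, the following minimal Python sketch shows how the correlation between CTt scores and a criterion test of a related cognitive ability could be computed with a Pearson coefficient. The data and variable names here are simulated placeholders for illustration, not the study's actual analysis.

import numpy as np
from scipy.stats import pearsonr

# Simulated placeholder data: CTt totals (0-28) and scores on a hypothetical
# criterion test of a related cognitive ability (e.g., reasoning).
rng = np.random.default_rng(0)
ctt_scores = rng.integers(0, 29, size=200).astype(float)
criterion_scores = ctt_scores + rng.normal(0, 5, size=200)

# Pearson correlation between the two score vectors
r, p_value = pearsonr(ctt_scores, criterion_scores)
print(f"r = {r:.2f}, p = {p_value:.4f}")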

Having validated measurement instruments is necessary and valuable in any research area. However, for the moment, there is still a shortage of CT tests that have undergone a comprehensive psychometric validation process (Mühling, Ruf, & Hubwieser, 2015). As Buffum et al. (2015) say: “developing assessments of student learning is an urgent area of need for the relatively young computer science education community as it advances toward the ranks of more mature disciplines such as physics that have established standardized assessments over time” (Buffum et al., 2015, p. 622). Nevertheless, in recent years there have been some remarkable attempts to measure and assess CT in students from 5th to 10th grade, the population of interest for this paper.

From the University of California comes the Fairy Assessment in Alice instrument (Werner, Denner, Campe, & Kawamoto, 2012), which tries to measure the understanding and use of abstraction, conditional logic, algorithmic thinking and other CT concepts that middle school students utilize to solve problems. However, this instrument was designed ad hoc for use in the context of the programming learning environment Alice1 (Graczyńska, 2010), and it has not undergone a psychometric validation process. The research group from Clemson University (South Carolina) provides a complementary perspective (Daily et al., 2014, Leonard et al., 2015). These authors propose a kinesthetic approach (‘embodied learning’) to the learning and assessment of CT with 5th and 6th grade students. To do so, they alternate activities for programming motion sequences (choreographies) in the Alice environment with the representation of those same sequences in a physical-kinesthetic environment. Their assessment tool also combines both settings, but its psychometric properties have not been reported.

Another interesting research line with middle school students is provided by the group from the University of Colorado, who work with students in the video-game programming environment AgentSheets.2 Within a first group of studies (Koh, Basawapatna, Bennett, & Repenning, 2010), these authors identify several Computational Thinking Patterns (CTP) that young programmers abstract and develop during the creation of their video-games; in this context, they design the Computational Thinking Patterns Graph, an automated tool that analyzes the games programmed by the students and graphically represents how far each game involves the different CTP when compared with a model. Within a second group of studies (Basawapatna, Koh, Repenning, Webb, & Marshall, 2011), the authors try to assess whether students are able to transfer the CTP acquired during video-game programming to a new context of programming scientific simulations. For this assessment, they develop the CTP-Quiz instrument, whose reliability and validity have not been reported.

Similarly, Dr. Scratch3 is presented from the Universidad Rey Juan Carlos (Madrid, Spain) (Moreno-León and Robles, 2015a, Moreno-León and Robles, 2015b, Moreno-León and Robles, 2014). Dr. Scratch is a free and open source web application designed to analyze, simply and automatically, projects programmed with Scratch4 (Resnick et al., 2009), and it provides feedback that can be used to improve programming skills and to develop CT in middle school students (Moreno-León, Robles, & Román-González, 2015). In order to assign an overall CT score to a project, Dr. Scratch infers the programmer's competence along the following seven CT dimensions: Abstraction and problem decomposition; Parallelism; Logical thinking; Synchronization; Flow control; User Interactivity; and Data representation. Therefore, Dr. Scratch is not strictly a cognitive test but a tool for the formative assessment of Scratch projects. Dr. Scratch is currently undergoing a validation process, although its convergent validity with respect to other traditional metrics of software quality and complexity has already been reported (Moreno-León, Robles, & Román-González, 2016).

Furthermore, we consider the Bebras International Contest,5 a competition born in Lithuania in 2003 which aims to promote the interest and excellence of primary and secondary students around the world in the field of Computer Science from a CT perspective (Cartelli et al., 2012, Dagiene and Futschek, 2008, Dagiene and Stupuriene, 2014). Each year, the contest proposes a set of Bebras Tasks, whose overall approach is the resolution of real problems, meaningful for the students, through the transfer and projection of their CT onto them. These Bebras Tasks are independent of any particular software or hardware, and can be administered to individuals without any prior programming experience. Because of all these features, the Bebras Tasks have been pointed out as a likely embryo of a future PISA (Programme for International Student Assessment) test in the field of Computer Science (Hubwieser and Mühling, 2014, Jašková and Kováčová, 2015). In any case, the Bebras International Contest is, at the moment, an event for promoting CT rather than a measuring instrument; among other reasons, because it is not composed of a stable, fixed set of task-items, but of a set that varies from year to year, with slight modifications across countries. However, its growing expansion has aroused the interest of psychometric researchers, who have begun to investigate its possible virtues as a CT measurement instrument. Thus, descriptive studies of student performance on Bebras Tasks have recently been published, referring to the corresponding editions of the Bebras International Contest held in Germany (Hubwieser and Mühling, 2014, Hubwieser and Mühling, 2015), Italy (Bellettini et al., 2015), Taiwan (Lee, Lin, & Lin, 2014) and Turkey (Kalelioğlu, Gülbahar, & Madran, 2015). In all of them, and for most of the tasks studied, significantly higher performance was reported for the male group in comparison with the female group.

Strictly speaking, however, we are only aware of two tests aimed at middle/high school students that are being fully subjected to psychometric requirements; both instruments are currently undergoing a validation process.

  • a.

    Test for Measuring Basic Programming Abilities (Mühling et al., 2015): it is designed for Bavarian students from 7th to 10th grade. This test aims to measure the students' ability to execute a given program based on the so-called ‘flow control structures’, which are considered to be at the core of CT for this age group: Sequencing (doing one step after another); Selection (doing either one thing or another); Repetition (doing one thing again and again). These control structures lead to the following CT concepts covered by the test: sequence of operations; conditional statement with (if/else) and without (if) alternative; loop with fixed number of iterations (repeat times); loop with exit condition (conditional loop: while or repeat until); and the nesting of these structures to create more complex programs (an illustrative code sketch of these structures is given after this list).

  • b.

    Commutative Assessment (Weintrop & Wilensky, 2015): it is designed for high school students, from 9th to 12th grade. This test aims to measure students' understanding of different computational concepts depending on whether they are presented through scripts written in visual (block-based) or textual programming languages, which is a key transition for reaching higher levels of code-literacy. The test is 28 items long and addresses the following CT concepts: conditionals; defined/fixed loops; undefined/unfixed loops; simple functions; functions with parameters/variables.
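As a purely illustrative complement to the two instruments above (and not an item from either test), the following Python sketch renders the ‘flow control structures’ just listed — sequencing, selection, fixed and conditional repetition, and their nesting — as executable code:

# Sequencing and fixed repetition: doing one step after another, a fixed number of times
def walk(steps):
    position = 0
    for _ in range(steps):   # loop with a fixed number of iterations ('repeat times')
        position = position + 1
    return position

# Selection: doing either one thing or another
def choose(path_clear):
    if path_clear:           # conditional statement with alternative (if/else)
        return "move forward"
    else:
        return "turn left"

# Conditional repetition with nesting: a selection embedded inside a loop
def walk_until(goal):
    position = 0
    while position < goal:   # loop with exit condition ('while' / 'repeat until')
        if position % 2 == 0:
            position += 2
        else:
            position += 1
    return position

print(walk(4), choose(True), walk_until(7))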

Overall, our Computational Thinking Test (CTt) has been developed following the practical guide to validating computer science knowledge assessments, with application to middle school, from Buffum et al. (2015), which is aligned with the international standards for psychological and educational testing (AERA, APA, & NCME, 2014). In addition, the CTt is consistent with other computational thinking tests under validation and aimed at middle/high school, such as the Test for Measuring Basic Programming Abilities (Mühling et al., 2015) and the Commutative Assessment (Weintrop & Wilensky, 2015), just described in Sub-section 1.3.

The CTt was initially designed with a length of 40 multiple-choice items (version 1.0, October 2014). After a content validation process based on the judgement of twenty experts, this first version was refined into the final one (version 2.0, December 2014) of 28 items (Román-González, 2015), which is built on the following principles:

  • Aim: CTt aims to measure the development level of CT in the subject.

  • Operational definition of the measured construct: CT involves the ability to formulate and solve problems by relying on the fundamental concepts of computing and using the logic and syntax of programming languages: basic sequences, loops, iteration, conditionals, functions and variables.

  • Target population: the CTt is mainly designed and intended for Spanish students between 12 and 14 years old (7th and 8th grade), although it can also be used in lower grades (5th and 6th grade) and upper grades (9th and 10th grade).

  • Instrument Type: multiple choice test with 4 answer options (only one correct).

  • Length and estimated completion time: 28 items; 45 min.

Each item of the CTt6 is designed and characterized according to the following five dimensions of the sampling domain:

  • Computational concept addressed: each item addresses one or more of the following seven computational concepts, ordered in increasing difficulty: Basic directions and sequences (4 items); Loops–repeat times (4 items); Loops–repeat until (4 items); If–simple conditional (4 items); If/else–complex conditional (4 items); While conditional (4 items); Simple functions (4 items). These ‘computational concepts’ are aligned with some of those in the CT framework (Brennan & Resnick, 2012; see Table 1) and with the CSTA Computer Science Standards for 7th and 8th grade (CSTA, 2011).

  • Environment-interface of the item: CTt items are presented in one of the following two environments-interfaces: ‘The Maze’ (23 items) or ‘The Canvas’ (5 items). Both interfaces are common in popular sites for learning programming such as Code.org (Kalelioğlu, 2015).

  • Answer alternatives style: in each item, the response alternatives may be presented in either of these two styles: Visual arrows (8 items) or Visual blocks (20 items). Both styles are also common in popular sites for learning programming such as Code.org (Kalelioğlu, 2015).

  • Existence or non-existence of nesting: depending on whether the item solution involves a script with (19 items) or without (9 items) nested computational concepts, that is, a concept embedded within another at a higher hierarchy level (Mühling et al., 2015); a hypothetical example is sketched after this list.

  • Required task: depending on which of the following cognitive tasks is required to solve the item: Sequencing: the student must state, in an orderly manner, a set of commands (14 items); Completion: the student must complete an incomplete given set of commands (9 items); Debugging: the student must debug an incorrect given set of commands (5 items). This dimension is partially aligned with the aforementioned ‘computational practices’ from the CT framework (Brennan & Resnick, 2012; see Table 1).
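To illustrate the nesting dimension, the following hypothetical script (written here in Python for readability, rather than in the visual arrows or blocks actually used by CTt items) nests a simple conditional inside a ‘while’ conditional loop, the kind of structure a Maze-style item solution may require. The robot object and its methods are invented for this example.

def solve_maze(robot):
    # 'While' conditional loop: keep going until the goal square is reached
    while not robot.at_goal():
        # Simple conditional nested inside the loop: choose between two commands
        if robot.path_ahead_clear():
            robot.move_forward()
        else:
            robot.turn_left()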

The CTt is administered collectively and on-line, and it can be taken on both non-mobile and mobile electronic devices. Preliminary results on the psychometric properties of the CTt, after its administration to a sample of 400 Spanish students (7th and 8th grade), have already been reported (Román-González, Pérez-González, & Jiménez-Fernández, 2015). Examples of definitive CTt items translated into English are shown in Fig. 1, Fig. 2, Fig. 3, Fig. 4, with their specifications detailed below.


Participants

The CTt was administered to a total sample of 1,251 Spanish students, boys and girls from 24 different schools, enrolled from 5th to 10th grade. The distribution of the subjects by gender, grade and age is shown in Table 2. Of the total sample, 825 students (65.9%) belong to public schools and 426 (34.1%) belong to private schools. Considering the device on which the CTt was administered, 1,001 students took it on a personal computer (80.0%) and 250 students (20.0%) did so on a tablet. None

Descriptive statistics

Table 3 shows the main descriptive statistics of the CTt score (calculated as the sum of correct answers along the 28 items of the test) for the entire sample (n = 1,251).
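As an illustration of how these statistics can be obtained (using simulated placeholder responses, not the study data), the following Python sketch computes the CTt total score as the sum of correct answers, its basic descriptive statistics, a Cronbach's alpha reliability estimate, and the Kolmogorov-Smirnov normality check discussed in the next paragraph:

import numpy as np
from scipy import stats

# Simulated 0/1 response matrix (students x 28 items); a latent ability term
# makes the items correlate, as real test items do.
rng = np.random.default_rng(1)
ability = rng.normal(0, 1, size=(1251, 1))
responses = (rng.random((1251, 28)) < 1 / (1 + np.exp(-ability))).astype(int)

# CTt score: sum of correct answers, plus basic descriptive statistics
totals = responses.sum(axis=1)
print("mean =", totals.mean(), "sd =", totals.std(ddof=1))

# Cronbach's alpha from item variances and total-score variance
k = responses.shape[1]
alpha = k / (k - 1) * (1 - responses.var(axis=0, ddof=1).sum() / totals.var(ddof=1))
print("Cronbach's alpha =", round(alpha, 3))

# Kolmogorov-Smirnov test against a normal curve with the sample parameters
d, p = stats.kstest(totals, "norm", args=(totals.mean(), totals.std(ddof=1)))
print("K-S D =", round(d, 3), "p =", round(p, 4))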

In Fig. 6 (left), a histogram showing the distribution of the CTt score across the sample is depicted. As can be seen, this distribution fits the normal curve remarkably well; although, given the very large size of the sample, the small existing misfits are penalized by the Kolmogorov-Smirnov test, which

Implications and limitations

The CTt has several strengths: it can be administered in pretest conditions to measure the initial level of CT development in students without prior programming experience from 5th to 10th grade; it can be administered collectively, so it could be used in massive screenings and in the early detection of students with high abilities (or special needs) for programming tasks; and it can be utilized for collecting quantitative data in pre-post evaluations of the efficacy of curricula or programs aimed at

Conclusions and further research

In this paper we have provided evidence of the reliability and criterion validity of a new instrument for the assessment of CT, and we have additionally expanded our understanding of the nature of CT through a theory-driven exploration of its associations with other established psychological constructs in the cognitive sphere. We have found the expected positive, small to moderate, significant correlations (0.27 < r < 0.44) between CT and three of the four primary mental abilities of the Thurstone (1938) model

Acknowledgements

We thank Professor Dr. Kate Howland (University of Sussex) for collaborating in the adaptation and translation of the CTt items from Spanish into English.

References (82)

  • P.L. Ackerman et al.

    The locus of adult intelligence: Knowledge, abilities, and nonability traits

    Psychology and Aging

    (1999)
  • AERA et al.

    Standards for educational and psychological testing

    (2014)
  • A.V. Aho

    Computation and computational thinking

    The Computer Journal

    (2012)
  • A.P. Ambrosio et al.

    Digital ink for cognitive assessment of computational thinking

  • A. Anastasi

    Psychological testing

    (1968)
  • CAS Barefoot

    Computational thinking [web page]

    (2014)
  • E. Barros et al.

    Using general mental ability and personality traits to predict job performance in three Chilean organizations

    International Journal of Selection and Assessment

    (2014)
  • A. Basawapatna et al.

    Recognizing computational thinking patterns

    Proceedings of the 42nd ACM Technical Symposium on Computer Science Education

    (2011)
  • C. Bellettini et al.

    How challenging are Bebras tasks? An IRT analysis based on the performance of Italian students

  • G.K. Bennett

    Differential aptitude tests [technical manual]

    (1952)
  • K. Brennan et al.

    New frameworks for studying and assessing the development of computational thinking

  • N.C.C. Brown et al.

    Bringing computer science back into schools: Lessons from the UK

  • P.S. Buffum et al.

    A practical guide to developing and validating computer science knowledge assessments with application to middle school

  • Q. Burke

    The markings of a new pencil: Introducing programming-as-writing in the middle school classroom

    The Journal of Media Literacy Education

    (2012)
  • P.A. Cáceres et al.

    Efecto de un modelo de metodología centrada en el aprendizaje sobre el pensamiento crítico, el pensamiento creativo y la capacidad de resolución de problemas en estudiantes con talento académico

    Revista Española De Pedagogía

    (2011)
  • P.A. Carpenter et al.

    What one intelligence test measures: A theoretical account of the processing in the raven progressive matrices test

    Psychological Review

    (1990)
  • A. Cartelli et al.

    Bebras contest and digital competence assessment: Analysis of frameworks

  • J. Cohen

    A power primer

    Psychological Bulletin

    (1992)
  • L.J. Cronbach et al.

    Construct validity in psychological tests

    Psychological Bulletin

    (1955)
  • CSTA

    K–12 computer science standards

    (2011)
  • CSTA et al.

    Operational definition of computational thinking for K–12 education

    (2011)
  • V. Dagiene et al.

    Bebras international contest on informatics and computer literacy: Criteria for good tasks

  • V. Dagiene et al.

    Informatics education based on solving attractive tasks through a contest

  • S.B. Daily et al.

    Dancing Alice: Exploring embodied pedagogical strategies for learning computational thinking

  • TEA Ediciones

    PMA: Aptitudes Mentales Primarias (manual técnico)

    (2007)
  • M. Edwards

    Algorithmic composition: Computational thinking in music

    Communications of the ACM

    (2011)
  • European Schoolnet

    Computing our future

    (2015)
  • H. Gardner et al.

    The App Generation: How today's youth navigate identity, intimacy, and imagination in a digital world

    (2013)
  • L.A. Gouws et al.

    Computational thinking in educational activities: An evaluation of the educational game light-bot

  • E. Graczyńska

    ALICE as a tool for programming at schools

    Natural Science

    (2010)
  • S. Grover et al.

    Computational thinking in K–12: A review of the state of the field

    Educational Researcher

    (2013)