
1 Introduction

Learning rubrics are scoring guides constructed of descriptors (or evaluative criteria) that establish the specifications to be assessed. These criteria should align with the formative objectives [1]. Rubrics are sometimes called evaluation tables, as they are typically arranged in a table format.

One of the main objectives of a rubric is to standardize and accelerate the evaluation process by highlighting the most relevant aspects of the subject matter. However, many authors claim that rubrics should go beyond assessment, as their continuous development has made them useful for informing and motivating the assessed subjects.

Rubrics can be classified according to different criteria [2]. Holistic rubrics offer a global view of the evaluation, whereas analytic rubrics provide a detailed view of different items. Similarly, general rubrics can be used for an entire course, while task-specific rubrics may focus on one particular assignment or project. Summative rubrics produce a final global score, separating subjects who pass from those who fail the evaluation. Formative rubrics provide performance feedback by conveying information about the strengths and weaknesses of the subjects [3]. The research reported in this paper focuses on analytic task-specific formative rubrics.

Formative rubrics help assessed subjects determine their own progress throughout the training period. Since different formative levels have different needs, these rubrics must be designed, formatted, and applied based on their specific purpose and context of use. In particular, we are interested in self-evolving rubrics that can adapt automatically to the learning pace of each student.

This paper examines the design and use of formative rubrics in higher education, where rubrics for specialized content are common [4]. The use of computer technologies and formats is discussed as a means to maximize the benefits of adaptable and adaptive rubrics. To this end, certain characteristics such as criteria dichotomization, weighted evaluation criteria to provide various levels of importance, and go/no-go criteria have been linked to new strategies that adjust to the learning pace of each student. More specifically, we describe the use of the go/no-go criteria in combination with dichotomization and weighted criteria as a strategy to better guide the learner.

The paper is structured as follows: in the next section, a brief review of the state of the art in rubrics is provided as a set of commonly accepted lessons learned. Next, we examine the concepts of dichotomization and levels of importance in evaluation criteria and discuss strategies on how they can be used to control the student’s assimilation pace. Two types of go/no-go criteria are studied and validated in a computer modeling context. Finally, general recommendations are discussed based on the results of two experimental studies.

2 State of the Art

The state of the art on rubrics development can be summarized as a set of commonly accepted lessons learned. Rubrics can manage complex evaluation scenarios by defining subsets of homogeneous criteria called dimensions. In this regard, clustering techniques can be used to group descriptors into relatively homogeneous natural groupings. It has been stated that “dimensions are useful to work with hierarchical rubrics, which organize criteria in different levels” [5]. Dimensions are also useful to work with complex assessments, which evaluate heterogeneous criteria [6].

Each criterion is evaluated by measuring the degree of compliance or achievement level of a particular situation. The achievement levels should be stated using the same terminology as the corresponding criteria. It is important to use a consistent scale throughout all achievement levels and avoid mixing positive and negative scales [7]. The number of achievement levels may vary depending on the situation. Although in some cases the level of compliance may be determined dichotomously, whether or not a criterion is met is generally measured through a finite set of levels that discretize a continuum. A commonly used system for establishing discrete levels is based on Likert elements, usually with five achievement levels or points [8]. Rohrmann [9] states that category scaling enhances the usability of assessment instruments and that well-defined qualifiers facilitate unbiased judgments.

While objective scoring is difficult to achieve, especially when self-assessing, achievement level categories provide unambiguous scales to properly rate quality. When providing performance scores to students, a preferred strategy involves moderate leniency, so confidence can be gradually built. Instead of penalizing students for each individual mistake, a proper assessment perspective may involve viewing those instances in a larger context to better determine whether they are significant enough to prevent awarding the highest rating.

Rubrics are designed to homogenize evaluation processes. A common strategy towards this end is to complement rubrics with anchors, i.e., written descriptions, examples, or work samples that illustrate the various levels of achievement [10]. Descriptions of good practices should be integrated into formative rubrics and used on demand, so the user of the rubric can be guided throughout the process of determining which criterion to check, how to check it, and the importance of the criterion in the overall result. Good practices are particularly important in distance and self-paced training courses, where the instructor is not always available.

An important consideration when designing a rubric is the type of format. Static formats can be easily implemented but cannot provide feedback to the user, so they are only suitable for summative assessment. Formative assessment requires rubrics that can adapt. In this sense, formative rubrics must be dynamic [6].

Dynamic formats process information to provide feedback to the user, and can be adapted to specific cases and users with different levels of expertise. Two types of dynamic rubrics can be distinguished. Rubrics are adaptive if the instructor can design different rubrics for different stages of formation [11]. They are adaptable if users can vary the level of detail interactively to adapt the rubric to their learning rhythms [6]. Clearly, implementing these functionalities requires the use of electronic rubrics (e-rubrics). To this end, dedicated applications to manage rubrics are needed, since standard tools such as spreadsheets are not fully adequate [12]. In the next sections, strategies to design adaptable and adaptive rubrics are discussed.

2.1 Adaptable Rubrics

Rubric criteria must be arranged according to gradually increasing levels of detail, so every user has the opportunity to select the level that best adapts to his/her understanding, thus optimizing the formative action. According to Company et al. [6], an effective strategy to accomplish this functionality with e-rubric tools is to allow the user to “fold” and “unfold” the level of detail of the criteria interactively. A basic rubric with hierarchical levels of criteria is illustrated in Fig. 1. All rubric criteria are shown unfolded, as interactivity cannot be shown in a static image.

Fig. 1. Example of a rubric with criteria showing two levels of detail (unfolded view). Italicized numbers in column “weight” represent criteria weights

Dichotomous criteria can be defined when assessing simple aspects of a task, but also when itemizing the assessment into large numbers of criteria. Therefore, level decomposition indirectly favors the dichotomization of the assessment process.

Similarly, the role of weights as focal pointers is essential. Disagreements between the student’s and the instructor’s perceptions naturally reveal discrepancies in how the quality of the evaluated task is judged. Therefore, making the perceived importance of each criterion explicit helps students focus on “what counts” (those criteria that the instructor wants to prioritize). Additionally, adjusting weights throughout the training period is a method to shift focus from criteria that are already achieved to those that require a longer maturing time.
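To make the interplay between folding and weights more concrete, the sketch below shows one possible way to represent a hierarchical, weighted rubric in code. It is a minimal illustration under our own assumptions, not the implementation behind Fig. 1; the class, field, and criterion names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Criterion:
    """A rubric criterion; leaves hold a score, parents aggregate their children."""
    name: str
    weight: float                      # relative importance (e.g., 0.30 for 30%)
    score: Optional[float] = None      # achievement level in [0, 1]; None if unscored
    children: List["Criterion"] = field(default_factory=list)
    folded: bool = True                # sub-criteria stay hidden until unfolded

    def value(self) -> float:
        """Return the leaf score, or the weighted average of the children."""
        if not self.children:
            return self.score if self.score is not None else 0.0
        total = sum(c.weight for c in self.children)
        return sum(c.weight * c.value() for c in self.children) / total

    def show(self, indent: int = 0) -> None:
        """Print the criterion; descend only if the user has unfolded it."""
        print("  " * indent + f"{self.name} (weight {self.weight:.2f})")
        if not self.folded:
            for c in self.children:
                c.show(indent + 1)

# Example: one top-level criterion unfolded into two dichotomous sub-criteria
clarity = Criterion("Clarity", 0.15, children=[
    Criterion("Standard views are used", 0.5, score=1.0),
    Criterion("Annotations do not overlap", 0.5, score=0.0),
])
clarity.folded = False   # the student chose to see the detailed level
clarity.show()
print(f"Contribution to total score: {clarity.weight * clarity.value():.3f}")
```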

Finally, all levels must be described using the same terms as the main criterion, but each level is characterized by appropriate qualifiers for each type of attribute. The attributes are the characteristics underlying the criteria, such as, for example, frequency, intensity or probability. Other authors such as Rohrmann [9] established achievement levels based on different attributes.

2.2 Adaptive Rubrics

To benefit from the formative nature of rubrics and accommodate the student’s learning pace, general concepts are typically revealed gradually, as successive rubrics throughout a course. Alternatively, an unfold/fold strategy can be adopted where only low-level (unfolded) criteria are shown at the beginning of a course, and versions displaying higher-level criteria are introduced only to students who have already mastered the low-level criteria. Consequently, the unfold/fold strategy is useful both for the student, who can adapt the criteria to her optimal level of understanding, and for the instructor, who can make criteria less specific throughout the learning process as needed.

For a more accurate adjustment to a user’s level, an e-rubric tool must allow the configuration of gates, i.e. the different options that are available to the user based on his/her previous responses to particular criteria, so the rubric can automatically update with new and more complete content every time the user reaches a certain level of achievement in a previous (more basic) rubric. For example, a rubric can be configured so that only when a user obtains a minimum score of 8 points out of 10 in lesson 2, will the contents of lesson 3 be available. Another possibility is to make the system display messages that reinforce learning based on the results achieved. If the score does not reach a minimum threshold in certain questions, an automatic message can inform the student to review the particular lesson(s) related to those contents.
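As an illustration of how such gates might be configured, the following sketch mirrors the two behaviors just described (unlocking lesson 3 when lesson 2 reaches 8 out of 10 points, and issuing a review message below a minimum threshold). The function name, score scale, and review threshold are assumptions made for the example.

```python
def apply_gates(scores: dict, unlock_threshold: float = 8.0,
                review_threshold: float = 5.0) -> dict:
    """Decide which lessons to unlock and which review messages to display.

    `scores` maps lesson names to rubric scores out of 10 (hypothetical scale).
    """
    feedback = {"unlocked": [], "messages": []}

    # Gate: lesson 3 becomes available only after lesson 2 reaches the threshold
    if scores.get("lesson 2", 0.0) >= unlock_threshold:
        feedback["unlocked"].append("lesson 3")

    # Reinforcement: below a minimum score, suggest reviewing the related lesson
    for lesson, score in scores.items():
        if score < review_threshold:
            feedback["messages"].append(
                f"Your score in {lesson} is below {review_threshold}; "
                f"please review the related contents.")
    return feedback

print(apply_gates({"lesson 1": 9.0, "lesson 2": 8.5}))
print(apply_gates({"lesson 1": 4.0, "lesson 2": 6.0}))
```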

Gates provide active and customized feedback depending on user progress. The instructor has adaptive control over the gates from one stage to another, whereas users have adaptable control of the information needed to complete each stage. Therefore, adaptive rubrics must be coordinated with the lesson plans.

Figure 2 shows a sample course schedule with the plan for introducing lower-level criteria over a period of six weeks (following a bottom-up approach). These low-level criteria will later be replaced by more general criteria. Note that knowledge and/or procedures may become exclusionary in the last weeks (indicated with an “X”). In the next section, the use of go/no-go criteria is discussed as a means to implement this behavior.

Fig. 2. Example of schedule as a basis for an adaptive rubric system. “X” indicates exclusionary criteria.

3 Go/No-Go Criteria

Go/no-go criteria are defined as exclusionary conditions inside descriptors. They establish basic conditions that must be met. Otherwise, the evaluation process is interrupted and the final score will be zero, regardless of other criteria. These go/no-go criteria must be explicitly identified. For example: “If the deliverable document contains many spelling mistakes, evaluation does not continue”. Figure 3 shows a hard go/no-go criterion, placed at the beginning of the rubric to avoid unnecessary assessments.

Fig. 3. Example of rubric with a go/no-go criterion and a threshold parameter

An alternative form of go/no-go is to establish a threshold parameter for pass/fail. An example of such a soft go/no-go criterion embedded in the final score (considering that after 10 errors the grade becomes zero, regardless of other criteria) is shown in Fig. 3. As a result, catastrophic failures result in a no-go, while moderate failures reduce the final score but do not prevent assessing other rubric dimensions. This soft go/no-go is a recommended alternative for academic scoring: it highlights critical failures while avoiding unnecessarily punitive exam scores (so that maximum partial credit can still be awarded). By contrast, hard go/no-go criteria clearly send the message that some failures are unacceptable for fully trained students.
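The difference between the two forms can be summarized in a short scoring routine. The sketch below is only illustrative: the ten-error threshold follows the example given above, while the dimension names, weights, and the linear penalty are assumptions.

```python
def rubric_score(dim_scores: dict, weights: dict, hard: bool,
                 valid: bool = True, error_count: int = 0,
                 error_threshold: int = 10) -> float:
    """Combine weighted dimension scores (all in [0, 1]) with a go/no-go check."""
    base = sum(weights[d] * dim_scores[d] for d in weights)

    if hard:
        # Hard go/no-go: failing the exclusionary criterion zeroes the final score
        return base if valid else 0.0

    # Soft go/no-go: errors progressively reduce the score; at the threshold
    # (here, 10 errors) the grade becomes zero regardless of other criteria
    if error_count >= error_threshold:
        return 0.0
    return base * (1.0 - error_count / error_threshold)

weights = {"complete": 0.4, "consistent": 0.3, "concise": 0.3}   # hypothetical
scores = {"complete": 0.75, "consistent": 1.0, "concise": 0.5}
print(rubric_score(scores, weights, hard=True, valid=False))      # 0.0
print(rubric_score(scores, weights, hard=False, error_count=4))   # 0.45
```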

3.1 Experimental Evaluation of Go/No-Go Criteria

The goal of our study was to validate the hypothesis that neither the use of hard nor soft go/no-go criteria affects the ability of students to self-assess their work; in other words, that the strong warning message sent by this type of criterion does not prevent students from attending to the other criteria.

To this end, two pilot experiments were conducted (mid-term and final exam) to assess student understanding of CAD assemblies. These examples represent complex evaluation cases, where the following quality dimensions of CAD models/assemblies [5] were used:

1. Models are valid if they can be accessed successfully by suitable software with no errors or warnings.
2. Models are complete if all necessary product characteristics are provided for all design purposes.
3. Models are consistent if they do not crash during normal design exploration or common editing tasks.
4. Models are concise if they do not contain any extraneous (repetitive or fragmented) information or techniques.
5. Models are clear and coherent if they are understood at first glance.
6. Models are effective if they convey design intent.

Undergraduate students (beginning CAD users) at a Spanish university were introduced to a prototype system of assembly rubrics after being exposed to parts rubrics earlier in the semester. Detailed explanations of the assembly rubric dimensions were discussed and provided to the students prior to their exams. This material included thorough descriptions of the definition and significance of the quality dimensions as well as clarifications of the detailed criteria used to measure the degree of accomplishment of such dimensions. The five achievement levels were defined as: No/Never, Almost Never/Rarely, Sometimes, Almost Always/Mostly, Yes/Always [9] and quantified as 0, 0.25, 0.5, 0.75 and 1, respectively.
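For reference, this quantification of the achievement levels can be written as a simple lookup; the function name is ours, while the labels and values are those listed above.

```python
# Five achievement levels and their numeric values, as described above
LEVELS = {
    "No/Never": 0.0,
    "Almost Never/Rarely": 0.25,
    "Sometimes": 0.5,
    "Almost Always/Mostly": 0.75,
    "Yes/Always": 1.0,
}

def quantify(level: str) -> float:
    """Translate a rater's qualitative answer into its numeric score."""
    return LEVELS[level]

print(quantify("Almost Always/Mostly"))   # 0.75
```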

Completion of the rubrics was required, and a rubric was considered correct if it matched the evaluation (taken as ideal) of the primary instructor (Instructor 1). Instructor 1 was the professor of record for the course; Instructor 2 was a faculty member at a different institution whose sole responsibility was to assess the student work.

While the system was primarily developed to assess CAD model quality, the rubric itself can be assessed in terms of ease of understanding and use (which is an underlying research hypothesis). If a rubric is clearly understood, each rater (instructor and student) should produce similar assessments.

Experiment 1

For this experiment, students were required to assemble a fitness equipment pulley (Fig. 4, right) using four non-standard parts (previously modeled, as displayed in Fig. 4, left) and various standard parts.

Fig. 4. Assembly and non-standard parts (left) used in Experiment 1.

The original intent, as explained to students, was to treat validity as a hard go/no-go switch, so the assembly would fail the assessment if the linked files could not be located or used. Thus, weights were assigned as follows: valid (a hard go/no-go criterion that multiplies the overall score obtained from the remaining rubric dimensions), 0%; complete, 20%; consistent, 30%; concise, 20%; clear, 15%; and design intent, 15%.

Students were provided the solution after submitting their exams in order to self-assess their performance against an ideal solution. Although students were informed that Dimension 1 (validity) would be a hard go/no-go criterion (i.e., failure to submit a valid file would result in a non-passing grade for the exam), a soft go/no-go criterion was enforced by instructors (with up to half-credit being awarded to avoid unnecessarily punitive scoring).
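For illustration, the following sketch applies the Experiment 1 weights with validity acting as a multiplier on the remaining dimensions. The student scores are invented, and the way the half-credit cap is modeled is one possible reading of how the instructors enforced the soft criterion, not a description of their actual grading procedure.

```python
# Weights from Experiment 1; validity multiplies the weighted sum of the rest
WEIGHTS = {"complete": 0.20, "consistent": 0.30, "concise": 0.20,
           "clear": 0.15, "design_intent": 0.15}

def experiment1_score(validity: float, dims: dict, hard: bool) -> float:
    """Return a score in [0, 1]; achievement levels are 0, 0.25, 0.5, 0.75 or 1."""
    base = sum(WEIGHTS[d] * dims[d] for d in WEIGHTS)
    if hard:
        # Announced rule: an invalid file fails the whole exam
        return base if validity == 1.0 else 0.0
    # Enforced rule (one possible reading): validity scales the score, but the
    # multiplier never drops below 0.5, so up to half credit can still be awarded
    return base * max(validity, 0.5)

dims = {"complete": 0.75, "consistent": 1.0, "concise": 0.5,
        "clear": 0.75, "design_intent": 1.0}
print(experiment1_score(validity=0.0, dims=dims, hard=True))    # 0.0
print(experiment1_score(validity=0.0, dims=dims, hard=False))   # half credit kept
```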

A total of fifty students took the mid-term exam, but only forty-six submitted the self-assessment rubrics. Table 1 summarizes the results of the experiment (reported in full in [13]) and shows the inter-rater reliability scores for the mid-term exam (for the students and both instructors). As described by Otey [13], the experiment demonstrated stronger agreement between instructors than between either instructor and the students, for all dimensions. Agreement between instructors and students was obtained for the dimensions of validity, completeness, and consistency, whereas agreement was weak for conciseness, clarity, and design intent.

Table 1. Inter-rater reliability scores for midterm exam

Experiment 2

Following a procedure similar to that of the first experiment, the final exam required assembling a mechanism. More specifically, students were asked to assemble a mechanical filter using four non-standard parts (previously modeled) and assorted standard parts. The assembly is shown in Fig. 5.

Fig. 5. Two section views of the filter assembly used in Experiment 2

Fifty-one students submitted self-assessment e-rubrics using our system. The students were assessed on assembly sequence and the use of sub-assemblies. This time, however, the students were informed that Dimension 1 (validity) would be assessed as a soft go/no-go criterion. As an example, a validity score of 0.5 would result in the remaining criteria receiving half value.

The inter-rater reliability scores for the final exam (for the students and both instructors) are shown in Table 2. Once again, there is greater agreement between the instructors than between instructors and students. There is moderate to strong agreement for Dimension 1, both between the instructors and between instructors and students. There is strong agreement between instructors for Dimensions 1, 2, 4, and 5, and little agreement between instructors and students for any dimension other than validity. There appears to be no measurable increase in agreement between the mid-term and the final exam for any dimension other than validity, for either instructors or students. The lack of improvement could be attributed to time: with only three weeks between exams, many students may not have had enough time to grasp missed concepts and improve their performance.

Table 2. Inter-rater reliability scores for final exam

Comparing Tables 1 and 2, a slight increase in agreement can be observed for some dimensions over time, which we speculate could be due to the previous exposure to the rubric. However, there are no significant changes in inter-rater reliability. Thus, we can validate the hypothesis that switching from hard to soft go/no-go criteria does not affect the ability of students to self-assess their work, since the similarities between the raters’ assessments (instructors and students) imply that the rubric is clearly understood.

Ideally, it would be useful to determine whether the correlation for each dimension significantly improved or decreased. However, since the r-value is bound between 0 and 1, it is exceedingly difficult to draw meaningful conclusions on this matter. A linear relationship between the correlation values cannot be assumed, and even if the change in correlation values were significant, it is not clear that it would be consequential. Even with perfectly defined rubric dimensions, it is impossible to remove all subjectivity, which may cloud any definitive judgment; in such cases, only the professional expertise of the investigator can guide those determinations. Regardless of this lack of statistical certainty, a pronounced general pattern emerges that reflects a positive directional improvement for a majority of rubric dimensions (both between instructors and students, and between instructors).
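Although the exact reliability statistic is not detailed here, a per-dimension comparison of this kind can be sketched as follows; Pearson's r is used purely as an illustration, and the ratings are invented.

```python
# Illustrative inter-rater comparison for one rubric dimension.
# Pearson's r is used only as an example statistic; the ratings are invented.
from scipy.stats import pearsonr

instructor = [0.75, 1.00, 0.50, 0.75, 1.00, 0.25]   # one value per student
student    = [1.00, 1.00, 0.25, 0.75, 0.75, 0.50]

r, p = pearsonr(instructor, student)
print(f"Inter-rater correlation: r = {r:.2f} (p = {p:.3f})")
```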

Finally, it is worth investigating in the future whether the clearly increased agreement for Dimension 1 is due only to previous exposure, or to the fact that explicitly declaring the go/no-go criterion as soft frees students from the possible fear of acknowledging the failure, since it is no longer catastrophic.

4 Conclusions

Rubrics, particularly summative rubrics, are mainly used to standardize and facilitate evaluation processes. However, rubrics can also become formative tools to convey performance information to students. This paper revisited the concept of rubrics to further extend some aspects related to making formative rubrics more adaptable and adaptive: criteria dichotomization, weighted evaluation criteria, and go/no-go criteria.

Formative rubrics should be discussed as learning content is presented, not after teaching. They should also adapt to the learning rhythm of students. Depending on who controls the adaptation, two types of dynamic rubrics are distinguished: rubrics are adaptable if users can interactively vary the level of detail of the criteria, and they are adaptive if the instructor can design different rubrics for different stages of formation. E-rubrics allow incorporating dynamic tools for adaptable and adaptive rubrics. The possibility to unfold or fold the level of detail of the criteria, and to query the anchors associated with the levels of attainment, allows students to obtain an improved understanding of the matter when needed.

Additionally, adaptive rubrics allow instructors to introduce the evaluable concepts progressively. Therefore, adaptive rubrics should be coordinated with the teaching guide. Timetables are suggested to plan the pace at which lower-level criteria are introduced during the first weeks (following a bottom-up approach) and later replaced by more general criteria, whose knowledge may become exclusionary (by way of go/no-go criteria) at the end of the instructional period.

Go/no-go criteria (where a failure in one criterion is so critical that it prevents analyzing other aspects of the subject’s performance) are recommended, but they must be explicitly identified, and included as such, in the descriptor. Finally, go/no-go criteria can become soft by including a threshold parameter (e.g., after ten errors, the assigned grade becomes zero, regardless of whether other rubric criteria are satisfied).