1 Introduction

Knowledge is composed of layers. Let us think of something familiar like real variables in mathematics. Real variables are a generic concept based on the properties of objects from a lower layer, namely numbers. As variables may refer to any real number, properties common to all reals are required to transform and manage any variable. In turn, numbers are an abstraction of reality, used to quantify, relate and compare objects, distances, etc. To fully master the concept of a real variable, the previous layers of knowledge (reality, numbers, reals, properties...) are required.

Real variables can conceivably be understood and used without mastering all lower layers of knowledge. However, it seems clear that failing to master those layers leads to problems. In fact, it is common to see students learning to manipulate simple equations using a set of automatic operations without reasoning. Even in higher education, many students are puzzled when asked why factors “cancel out” or why addends change sign when moved from one side to the other. This becomes evident when they are first asked to do matrix or modular algebra, and start failing on simple concepts like the absence of the commutative property.

Many understanding problems at higher levels of abstraction can be attributed to a lack of mastery of lower levels. This lack of mastery prevents students from understanding higher-level concepts, and pushes them to memorize rules instead of learning through understanding. A simple teaching strategy would be to make students aware of their mastery of lower abstraction levels and help them improve it. However simple, this strategy is rarely put into practice. Moreover, teaching tendencies seem to be moving in the opposite direction. When students seem unable to solve assignments, the focus is often put on the way these assignments are presented in the first place, and on teaching methodologies afterwards. It is common to redesign assignments towards step-by-step guidance or “fill in the gaps” models, in the hope of helping students understand and acquire knowledge without hassle. Whenever this fails, lack of interest is blamed, and making content and assignments more appealing seems the way to go.

Many subjects are considered difficult by nature: Mathematics [16], Physics [19], Chemistry [2], Programming [14], etc. These subjects are often criticized by students, who claim that they are boring, complex and even phobia-inducing [4]. Does this difficulty come from the very nature of these subjects? Could it instead be an effect of a lack of lower-level knowledge? This is a difficult question, and answering it requires plenty of evidence. This work is a first step towards answering it.

This work is circumscribed to Programming as a subject and ability taught in Computer Science and Engineering degrees. An intervention experiment is carried out and evidence is gathered in the form of student results and surveys. The experiment is focused on teaching students something that is considered extremely difficult by nature, namely machine code and assembler programming. The intervention is done with first and fourth year students. Fourth years are also required to use assembler for an actual project, and they are given a second test. Evidence gathered from both experiments is analysed and discussed. Results seem to favour the idea that a lack of low-level knowledge could be responsible for students' perception of difficulty and their performance at higher levels.

2 Background

The teaching of programming in universities is based on the proposals of ACM/IEEE [3]. These proposals recommend using high-level languages and teaching paradigms such as imperative, object-oriented and functional programming. In fact, this latest revision [3] is directly focused on object orientation as the basis. Even so, there is still no general consensus on the paradigm for starting to teach [18]. The most popular alternatives are structured and object-oriented programming.

To facilitate the learning of programming, teachers have designed all sorts of strategies. These innovations are mainly methodological: implementation of active methodologies in the classroom [17], use of role-playing games [20], use of tools to visualize execution [23], learning through programming errors [22], creation of video games in Scratch as an introduction [1], or use of Augmented Reality as a motivating element [8].

These innovations generally improve student motivation, but learning problems remain. Despite all the teachers’ efforts, students still have many difficulties in learning to program [7, 10]. Perhaps an important fact is being overlooked: all these strategies forget the low-level details. The most probable consequence is that students are erroneously modelling the basic knowledge needed to program properly. Reference [5] proposes a reform of a programming course to focus on fundamental concepts such as machine code instructions, the random access memory model and the fetch-execute cycle. Assembly language programming is taught through examples of translations from C to assembler language constructs.

In 1983, Johnson-Laird [11] described the representations currently in use to explain the learning process in science [12]. In his theory, he describes mental models as structurally analogous approximations of the world that individuals make. According to Johnson-Laird, they are unscientific, incomplete and unstable: they are simply useful for the subject’s predictions.

Conceptual models are used in science [9]. They are external representations, accurate, complete and consistent with scientifically shared knowledge. They can be materialized as mathematical formulations, analogies or material artifacts.

Learning [12] would consist of a modelling process, where students construct new mental models or adapt existing ones. As they develop more complete mental models closer to the reality they model, learning becomes more effective.

For students to be able to build a consistent mental model for programming, perhaps they need to know how the compiler works, the machine code it generates and the machine that runs it. Most programming courses do not teach all these low-level details. Therefore, students generate their own erroneous models to try to understand the actual functioning of the system. This is where the hypothesis of this paper lies: the lack of knowledge of low-level details could lead to difficulties at a later stage.

3 Proposal

In order to gather evidence about the research question, two intervention experiments and a modification of a two-month-long activity have been designed and carried out:

  • Experiment A. An introductory course on machine code and assembler programming.

  • Experiment B. A classroom programming exercise about memory management.

  • Activity. Development of a computer game for an 8-bit computer.

All of these activities give evidence on difficulty perceived by students as well as abilities acquired with and without low-level intervention. This section explains these three activities in detail.

3.1 Machine Code Course

The main idea of this course is to test the difficulty perceived by students without previous knowledge of computer programming. The course also explores the influence of adding low-level knowledge for students already trained in computer programming. Moreover, the course was designed with an innovative educational perspective, which also tests whether the difficulty of similar courses could be due to their methodology instead of their content.

The course was titled “Programming and videogames from scratch: mastering z80 assembler” (footnote 1) and abbreviated as DEZ80 for its Spanish initials. First year students were notified about DEZ80 before the start of their initial term. Students interested in the course filled in a form to apply. From these applications, participants were selected randomly, after splitting them proportionally into two groups depending on their selected degree (Computer Engineering or Multimedia Degree). Finally, selected students attended the course in person the week before the start of their initial term. During that week, they programmed in Z80 assembler from 16:00 to 20:00, Monday to Friday.

DEZ80 (Fig. 1) starts from scratch, requiring no previous knowledge, and teaches how to program in machine code and assembler. DEZ80 focuses students on developing a simple game, which is divided into constructive subtasks. Everything is based on more than fifty videos that students watch at their own pace. Although in the intervention experiment the course was given in person, it is essentially a MOOC [21] by design. All DEZ80 videos and content are available online (footnote 2).

Fig. 1. Video presenting one of the challenges of DEZ80

DEZ80 is innovative in the way it presents tasks to students. From the very beginning, students are required to practice first, using theoretical videos as support for their practice. Students are asked to try things even if they fail, and to use failure as a way of learning. They are taught to understand programming practice in a similar way to sports practice: whenever you fail at tennis, you try again and use failures as a source of learning. This is similar to a flipped classroom method [13], but applied to a self-paced MOOC.

DEZ80 is composed of levels. Each level has a final challenge and several preparatory challenges. The goal for students is to pass each level’s final challenge. They may therefore decide to learn challenge by challenge, or to confront the final one directly. There is no rule for them, other than having to pass each final challenge. Practical videos also include examples for guidance on how to face challenges. Except for the initial ones, challenges are generally open problems: they do not have a closed solution. For example, the first final challenge is to create a program that draws a sprite. There is no template or rule on how to create that program or which sprite to draw. That is always up to the student’s creativity.

DEZ80 was designed to be finished by students in 20 to 30 h. As understanding a modern PC down to its lowest level in that time is impractical, an Amstrad CPC 464 was used as the target platform. This classic machine has 64 KB of RAM and a Zilog Z80 8-bit processor which runs at roughly 4 MHz. This machine is simple enough to be understood by students at its lowest level of abstraction in a short period of time. Although a real Amstrad CPC 464 was brought to in-person lessons, students were given a free-software emulator called WinAPE (footnote 3). This is a very accurate emulator that has the advantage of including a debugger, which lets students explore memory, processor status and step-by-step execution.

Students were asked to fill in a survey (footnote 4) before starting and after finishing the course. The main items in these surveys concern their prior programming knowledge, the perceived difficulty of the challenges, the concepts learnt and their satisfaction.

3.2 Memory Management Exercise

This exercise is designed to test students’ understanding of memory management and memory layout. This is crucial knowledge, as most performance-demanding applications require delicate memory management and a solid understanding of it. Fourth year students are expected to have a decent understanding of this matter.

The exercise is straightforward if the concepts involved are understood. It consists of writing a program that performs these 4 steps in order: 1. Reserve dynamic memory to store a matrix, 2. Fill in the matrix with data, 3. Print the matrix to a terminal, 4. Release memory. Students have to perform this exercise in two different ways:

  1. Students are shown a valid program that performs this exercise, and it is explained to them (footnote 5). Afterwards, they are asked to create an identical program by themselves.

  2. After finishing the first part, students are given a link to the valid solution. Then, they are asked to modify that solution so that it performs only 2 memory reservation requests, independently of the size of the matrix (footnote 6).

Technically, the first part reserves an array of pointers to matrix rows, and then reserves an array for each row iteratively. The second part asks the students to understand that there is no need to reserve rows one by one. The solution then consists of reserving the pointer array in the first request, and enough space for all the rows in the second. The only extra task required is to fill the pointer array, making each item point to the memory assigned to its row.
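The two-request layout described above can be sketched in C as follows. This is only an illustrative sketch, not the code given to students: the function names (`matrix_alloc2`, `matrix_fill`, `matrix_print`, `matrix_free2`) and the choice of `int` elements are assumptions, and overflow of `rows * cols` is not checked.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (name and signature are assumptions, not course code).
 * Builds a rows x cols matrix of int using exactly two allocation requests.
 * Returns NULL if either request fails. */
int **matrix_alloc2(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);

    /* Request 1: the array of pointers to rows. */
    int **m = malloc(rows * sizeof *m);
    if (m == NULL)
        return NULL;

    /* Request 2: one contiguous block large enough for every row. */
    int *data = malloc(rows * cols * sizeof *data);
    if (data == NULL) {
        free(m);
        return NULL;
    }

    /* Extra step: make each pointer refer to its own row inside the block. */
    for (size_t r = 0; r < rows; ++r)
        m[r] = data + r * cols;

    return m;
}

/* Fills the matrix with consecutive values so the layout is easy to check. */
void matrix_fill(int **m, size_t rows, size_t cols)
{
    for (size_t r = 0; r < rows; ++r)
        for (size_t c = 0; c < cols; ++c)
            m[r][c] = (int)(r * cols + c);
}

/* Prints the matrix to the terminal, one row per line. */
void matrix_print(int **m, size_t rows, size_t cols)
{
    for (size_t r = 0; r < rows; ++r) {
        for (size_t c = 0; c < cols; ++c)
            printf("%4d", m[r][c]);
        printf("\n");
    }
}

/* Releases both blocks: m[0] is the start of the contiguous data block. */
void matrix_free2(int **m)
{
    if (m == NULL)
        return;
    free(m[0]);
    free(m);
}
```

A caller would invoke `matrix_alloc2(3, 4)`, then `matrix_fill`, `matrix_print` and `matrix_free2` in that order. Because all rows live in one block, consecutive row pointers differ by exactly `cols` elements, which is the spatial relation the exercise asks students to visualize.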

However simple, this exercise tests students’ ability to visualize and understand memory layout, to understand the reservation/release process and to understand how pointers actually work. All three are critical abilities for any well-trained computer engineer. Also, all these abilities are related to an understanding of concepts below the surface of the programming language students use. Students who understand the concepts have no problem solving this exercise, whereas those basing their knowledge on high-level rules have serious trouble.

The complete explanation given to the students on how to perform the exercise, together with the hints they receive, is available online (footnote 7). This same lesson was given in 2016 and in 2017. The original code was explained to students and they were asked to replicate it (1st part). They then received a general explanation about memory and its management, and were asked to manage it manually (2nd part). Finally, they were given a hint on printing and visualizing the memory layout of the matrix and the pointers that constitute it. The only difference between 2016 and 2017 was that 2017’s students took the DEZ80 course in their first week, and they developed their 8-bit computer games (see Sect. 3.3) in Z80 assembler instead of the C programming language.

3.3 8-Bit Computer Game

Fourth year students have been asked to create a computer game for the Amstrad CPC 464 8-bit machine for the last five years [6]. An international contest called #CPCRetroDev has also been organized all these years, to motivate students. The idea is to make them go one step beyond lessons, and try to develop a real product. Their games compete in the contest against developers from all around the world, and are actually played by Amstrad CPC users. A jury of 12 well-known veteran game developers and gaming experts votes for the best games. Voting is done live in an awards ceremony (footnote 8) (Fig. 2) modelled on the Eurovision Song Contest. The best games receive cash prizes and special mentions given by the veteran game developers. This new environment helps students start treating what they develop as professional products, instead of as mere college assignments.

Fig. 2. #CPCRetroDev 2017 awards ceremony live streaming

During the first four years, students were allowed to use the C programming language and were encouraged to include a little bit of assembler. In four years, only one student out of 191 added a little bit of assembler to his assignment. This latest year, students were given the DEZ80 course during their first week, and then they were asked to create their games in Z80 assembler. Assignment rules required at least 80% of their code to be in assembler, with C allowed for the most difficult task (namely, the artificial intelligence of enemies).

Switching from C to assembler this latest year was planned to test differences in performance and understanding. According to the hypothesis, students who become aware of the flaws in their low-level understanding may show improvements in higher-level exercises and abilities. To test these hypothetical improvements, the memory management exercise was also considered. Students from the previous year did that exercise after creating their games in C, and therefore statistical differences might give some evidence on this matter. Moreover, students were polled before and after creating their assembler games to learn their perceptions of the difficulty and relevance of assembler.

All editions of the #CPCRetroDev contest, along with the games submitted, may be found online (footnote 9).

4 Results

Let us start by analysing the results of Experiment A. 100 students out of 300 new first years were selected randomly to participate in the DEZ80 course. In the survey they filled in before the start of the course, they were asked about their prior programming knowledge. As can be seen in Fig. 3, 55% of them reported no prior knowledge at all (represented as 0 years of programming experience). Among students with prior knowledge, 38% reported C/C++, 18% Java and 14% Scratch as their best-known languages. Several students reported more than one language, which means that the languages shown in Fig. 3 are not disjoint.

Fig. 3. Student self-reported prior programming knowledge (left: languages known, right: years of experience)

Figure 4 shows performance differences between students with no prior programming experience and those with previous experience. It lets us test the hypothesis with respect to several points. First, there seems to be no significant difference between experienced and inexperienced students. They overcome challenges in similar proportions, with apparently no significant dependence on their prior experience. Only challenges 8 to 10 seem to be slightly easier for the experienced. In fact, only some inexperienced students passed challenges 13 to 15. The behaviour in challenges 2 to 4 is also interesting. Students were instructed to select their own pace, which included doing challenges in their preferred order. Some experienced students took this advice and skipped some of the easiest challenges. Inexperienced students did not show this behaviour: without previous knowledge, it seems logical not to skip any challenge, as there are more things to be learnt.

Fig. 4. Challenges passed by students depending on prior programming experience

After the course, students were asked to report their perceived difficulty for each challenge on a Likert [15] scale, from 1 (very easy) to 5 (very difficult). Figure 5 confirms that students’ perceived difficulty is clearly aligned with the challenges overcome. It shows that students’ perception does not differ significantly between experienced and inexperienced. There is an interesting difference in challenges 7 to 9, which are found a little easier by experienced students. It seems correlated with the small performance difference shown at challenges 8 to 10. Specifically, challenge 8 seems to be easier for experienced students. These challenges involve the creation of animations with repeating patterns. For these patterns, loop construction is required. Loop structures might be slightly difficult to understand for inexperienced students.

Fig. 5. Average students’ perceived difficulty by challenge (only challenges passed by more than 1 student are considered)

Although students perceive difficulty as increasing with each new challenge, this increase seems to be perceived as moderate. Figure 5 shows that it starts around 2 (easy) for the first challenge, and ends at 3 (normal). It is only reported slightly above normal for challenge 10. This result is unexpected for assembler programming challenges, as it contradicts the general perception that assembler is inherently difficult.

Experiment B was conducted with fourth year students. It had two main parts: the first was to replicate code shown by the teacher, and the second was to elaborate a new memory management model as requested. Only results from the second part are analysed here. Students were asked to reserve memory for a matrix using only two requests, then assign pointers, and then release the matrix. Figure 6 shows the percentage of students that correctly solved each stage of the exercise.

Fig. 6. Student achievements in the second part of experiment B

Results of experiment B are striking. Students from the previous year (2016) were essentially unable to solve the exercise: only one student was able to correctly perform one of its three parts, the memory reservation. In fact, the classroom experience showed that not a single student was able to make a proper drawing of the memory layout of the matrix. They were first given a hint to do so, and then required to do so. Only some of the students were able to print the values of the pointers without assistance, and none of them was able to make a drawing that properly located the pointers and their spatial relations.

As Fig. 6 shows, a substantial part of the latest year’s students (2017) were able to solve the exercise. 25% of the students solved all 3 parts of the exercise and more than 50% solved at least one of the three parts (note that results from the parts are not disjoint). This represents a large difference in performance with respect to the previous year. Lesson timing, assignments and teacher were the same in 2017 as in 2016, except for students being asked to develop their games in Z80 assembler. 2017’s students were asked to take the DEZ80 course in their first week and then develop their 8-bit games in Z80 assembler instead of C.

Evidence suggests that using assembler instead of C could have had an impact on their understanding that helped them solve the memory exercise afterwards. As working at assembler level requires students to directly manage memory by themselves (even without allocators like new or malloc), a plausible explanation would be that it has helped them understand and visualize memory. This explanation is also supported by student statements in personal interviews with teachers. In these interviews, 2017’s students reported that working at a lower level had helped them understand pointers and memory. Some of them even declared that they are now aware that they previously did not understand pointers, even though they thought they did.
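To illustrate what managing memory without an allocator involves, the following C sketch mimics the situation on a 64 KB machine: the programmer picks fixed addresses and computes every offset by hand. The table address `0x8000`, the 4-byte record layout and all function names are hypothetical choices made for this illustration; they are not taken from the students' games or from the course material.

```c
#include <assert.h>
#include <stdint.h>

/* A flat 64 KB address space, standing in for the RAM of the 8-bit machine. */
enum { RAM_SIZE = 65536 };
static uint8_t ram[RAM_SIZE];

/* Hypothetical layout decided by the programmer, not by an allocator. */
enum {
    ENTITY_TABLE = 0x8000, /* assumed start address of the table */
    ENTITY_SIZE  = 4,      /* bytes per record: x, y, sprite id, flags */
    MAX_ENTITIES = 8
};

/* Reads one byte at an absolute address. */
uint8_t peek(uint16_t addr)
{
    return ram[addr];
}

/* Writes one byte at an absolute address. */
void poke(uint16_t addr, uint8_t value)
{
    ram[addr] = value;
}

/* Address of record i: base address plus i times the record size. */
uint16_t entity_addr(unsigned i)
{
    assert(i < MAX_ENTITIES);
    return (uint16_t)(ENTITY_TABLE + i * ENTITY_SIZE);
}

/* Stores x at offset 0 and y at offset 1 of record i. */
void entity_set_position(unsigned i, uint8_t x, uint8_t y)
{
    uint16_t base = entity_addr(i);
    poke(base, x);
    poke((uint16_t)(base + 1), y);
}
```

With this layout, record 2 starts at `0x8000 + 2 * 4 = 0x8008`. A pointer is then nothing more than such a computed address, which is the kind of reasoning the memory exercise requires.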

The 8-bit games developed by fourth year students as their first assignment also reveal interesting differences between 2017 and previous years. 2017’s students had the same number of weeks as the previous year’s students (8 weeks) to develop their games. However, 2017’s students started by investing one week in the DEZ80 course and had to develop their games in Z80 assembler. None of the students had previously programmed in Z80 assembler or for the Amstrad CPC 464. Approximately 35% of them had not even developed a game before. 27% had done a two-week assembler programming assignment in previous years.

In order to develop a game for the Amstrad CPC 464, 2017’s students had to learn Z80 assembler along with Amstrad CPC 464 architecture details, and develop enough experience in assembler programming. Students developed these abilities at the same time as they were producing their 8-bit games, in 8 weeks. In contrast, students from 2016 and previous years made their games in the C programming language, which they already knew and had experience with.

Teachers were prepared to be less strict when assessing 2017’s games because of these starting conditions. However, results were very different from expectations, as can be seen in Fig. 7. Games are considered to be Fully Playable if they have a start, they can be played without problems, and they can be ended either by winning or by reaching game over. Games that have stability problems or glitches but can be played are considered Faulty, whereas games that cannot be played or do not work are considered Incomplete. Advanced games are those that are Fully Playable and exhibit advanced programming techniques.

Fig. 7. Quality of 8-bit games developed during years 2016 and 2017

Figure 7 shows that Playable games in 2017 were 1.71 times those of 2016. That is a 71% proportional increase (from 40.9% to 70%). Moreover, this increase notably affects Advanced games, which multiplied by more than 5 (from 4.5% to 25%). Just Playable games also increased from 36.4% to 45%. It is also relevant that this increase came from the dramatic decrease of Incomplete games, which dropped from 36.4% to 5%. These results were also confirmed by most of the members of the #CPCRetroDev jury, who made explicit their impression that the overall quality of the games improved from 2016 to 2017.
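The ratios quoted above can be checked with a few lines of C. This sketch only reuses the percentages reported in the text and assumes that the Playable total is the sum of Just Playable and Advanced games (36.4% + 4.5% = 40.9% in 2016, 45% + 25% = 70% in 2017); the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Percentages of games per category, as reported in the text. */
struct game_quality {
    double just_playable;
    double advanced;
    double incomplete;
};

/* Assumption: Playable = Just Playable + Advanced. */
double playable_total(struct game_quality q)
{
    return q.just_playable + q.advanced;
}

/* How many times larger `after` is than `before`. */
double growth_factor(double before, double after)
{
    assert(before > 0.0);
    return after / before;
}

/* Prints the year-on-year comparison for each category. */
void print_comparison(struct game_quality y2016, struct game_quality y2017)
{
    printf("Playable:   %.1f%% -> %.1f%% (x%.2f)\n",
           playable_total(y2016), playable_total(y2017),
           growth_factor(playable_total(y2016), playable_total(y2017)));
    printf("Advanced:   %.1f%% -> %.1f%% (x%.2f)\n",
           y2016.advanced, y2017.advanced,
           growth_factor(y2016.advanced, y2017.advanced));
    printf("Incomplete: %.1f%% -> %.1f%%\n",
           y2016.incomplete, y2017.incomplete);
}
```

Calling `print_comparison` with `{36.4, 4.5, 36.4}` for 2016 and `{45.0, 25.0, 5.0}` for 2017 reproduces the factor of about 1.71 for Playable games and a factor above 5 for Advanced games.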

Again, this evidence contradicts the idea that assembler programming is intrinsically difficult. In fact, this evidence suggests that assembler programming might even be easier than higher-level languages in some contexts. It also supports the hypothesis that many higher-level problems shown by students may have their root in lower-level misconceptions. A plausible explanation for these results would be that students have improved their understanding of how their programs actually work in the execution environment. This improvement may explain these large performance gains with respect to previous years. It is important not to forget that their conditions were considered much harder (more things to do in the same time, lower-level language, lack of knowledge) and that this did not prevent them from achieving much better results.

5 Conclusions

This work started by questioning whether some subjects are difficult by nature. The existence of inherent difficulties in these subjects is generally accepted by teachers and students. This work also proposes the idea that problems in understanding higher-level concepts or abstractions may be due to misunderstanding at lower levels of knowledge.

In order to gather evidence to test these ideas, this work restricted its application range to programming. In this subject, machine code and assembler language programming are considered to be intrinsically difficult and are also the lowest-level computing abstraction. Therefore, they were selected as subjects to design experiments to test the hypothesis.

The first proposed experiment was an innovative course on machine code and Z80 assembler programming. The course was designed to teach both to students without prior programming experience. 100 undergraduate students took this course in person one week before the start of their first term at the university. Students attended from Monday to Friday, 16:00 to 20:00, investing a total of 20 h. They were given pre and post surveys to learn about their achievements and opinions. Results showed that students perceived Z80 programming assignments as easy or normal (between 2 and 3 on a Likert scale), in contrast to difficult or very difficult, which was the expectation. Moreover, students with previous experience in higher-level languages did not exhibit a significant edge over inexperienced students.

The course was also given to fourth year students in the first week of the term. Over the last five years, fourth year students had to develop an 8-bit computer game for the Amstrad CPC 464. In previous years, this was done in the C programming language. This latest year, after being given the Z80 assembler programming course, they were required to develop their games in assembler. All games were submitted to the #CPCRetroDev international game development contest and were compared to the previous year’s games by teachers and by an international jury. Although 2017’s students had harder starting conditions (they needed to learn Z80 assembler while they developed their games), their final performance was considerably higher than that of previous years. The international jury perceived a general quality improvement in the games presented by students.

Another experiment was also performed to test the impact of low-level teaching on higher-level programming abilities. Fourth year students were asked to solve a memory management exercise. This exercise required manual memory management, forcing students to visualize memory layout and understand management and pointers. 2016’s students were unable to perform this exercise. In contrast, 25% of 2017’s students completely solved the exercise and more than 50% correctly solved some of its parts. The only difference between the two years was the introduction of assembler programming in their first assignment (the 8-bit game).

All the evidence presented suggests that machine code and assembler programming are not difficult by nature, contrary to what is generally assumed. The results presented show that students were able to learn Z80 assembler programming and even performed better with it than with higher-level languages when creating 8-bit computer games. This evidence was consistent among first and fourth year students.

Moreover, an increase in the ability to understand memory layout and management was shown by 2017’s students. Evidence suggests that this increase may be due to gains in understanding after programming in assembler. This evidence also suggests that higher-level understanding problems may be caused by lower-level misunderstanding, as was hypothesized.

More evidence is required to support these conclusions and to test them in environments other than computer programming. However, given that it was unlikely a priori for the three experiments to provide strong evidence in the same direction, it seems logical to conclude that machine code/assembler may not be difficult by nature. It is plausible to think that a hostile environment, with many teachers and/or students prejudiced against machine code/assembler, may be generating bad mindsets. It also seems promising to continue gathering evidence about the effects of low-level programming on higher-level student abilities. More evidence supporting this idea could suggest that abandoning lower-level languages in teaching may be detrimental to future engineers. Also, if this idea were to generalize well to other subjects, it could potentially have a great impact by opening new ways to improve teaching and learning in general.