3.2 Data Collection
This study collected two datasets to understand students’ attitudes and learning experiences: (1) pre- and post-student survey responses; (2) the apps created by students who were enrolled in the project curriculum.
The pre- and post-student surveys collected data on three dimensions of student attitudes: confidence in coding and in tasks facilitated by the program (e.g., creating apps), interest in those same activities, and perceptions of CRC focused on serving the community and social good. All items were measured on a five-point Likert scale (\(1=\) low or negative, \(5=\) high or positive). The majority of the survey items were adapted from previously validated instruments. The items assessing confidence and interest in coding were adapted from the confidence and interest constructs of the Elementary Student Coding Attitudes Survey [32]. Three new items were added to refer specifically to app design, and one additional item was adapted from the STEM Career Interest Survey [22] to measure interest in a coding-related job. The CRC items assessed the extent to which students feel that (1) they understand and are interested in learning about their own culture and community; (2) they are interested in other students’ cultures and can collaborate with peers from different cultures; and (3) they can make apps that connect with their interests, life experiences, and cultures and that serve their community. These items were inspired by the culturally responsive teaching (CRT) Survey [29], which was designed to measure teachers’ CRT self-efficacy and associated student outcomes. Example items include “I feel comfortable describing my cultural background in this classroom,” “I can make apps to share my culture with others,” and “I can use my interests to make apps to help others.”
The survey items underwent content validation through review by the project’s external evaluator, teacher participants, and the full research team. Before this study, the survey was piloted during the 2020–2021 school year with 51 students who completed both the pre- and post-surveys. Teacher participants provided further feedback to the research team based on their students’ comments during survey administration. Although multiple qualitative validations were conducted, we recognize the need for more rigorous quantitative validation, particularly construct validation, to strengthen the overall validity of the student attitude survey. The reliability of the survey items was checked using Cronbach’s alpha. The results showed strong internal consistency for the three subscales: confidence (9 items, pre: \(\alpha=0.87\); post: \(\alpha=0.91\)), interest (8 items, pre: \(\alpha=0.94\); post: \(\alpha=0.95\)), and CRC perceptions (11 items, pre: \(\alpha=0.87\); post: \(\alpha=0.88\)).
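As an illustration, the reliability check can be reproduced with a minimal R sketch such as the one below; the data file name and item column names (e.g., conf_1 through conf_9) are hypothetical placeholders for how the survey responses might be stored.

```r
# Reliability check with Cronbach's alpha (psych package); column names are assumed
library(psych)

pre <- read.csv("pre_survey.csv")   # one row per student, one column per survey item

confidence_items <- pre[, paste0("conf_", 1:9)]   # 9 confidence items
interest_items   <- pre[, paste0("int_", 1:8)]    # 8 interest items
crc_items        <- pre[, paste0("crc_", 1:11)]   # 11 CRC perception items

# psych::alpha() reports raw_alpha along with other item statistics
alpha(confidence_items)$total$raw_alpha
alpha(interest_items)$total$raw_alpha
alpha(crc_items)$total$raw_alpha
```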
The post-survey also contained three open-ended questions to capture students’ learning experiences: “What was your favorite part of this class?”, “What was your least favorite part of this class?”, and “In this class, what did you learn about CS and making apps?” These open-ended questions gave students the opportunity to express their thoughts about the course, elaborate on their responses, and provide more detailed feedback.
Both pre- and post-student surveys were distributed through Qualtrics by the participating teachers during the first and last classes in which they taught the curriculum, respectively. Student apps were collected through the teachers. The apps were created either as final projects or as assignments completed during the course. In total, 92 apps were collected, including 24 apps from seventh grade and 68 from eighth grade.
3.3 Data Analysis
Student Attitude Analyses. To answer the first two research questions, we conducted three sets of analyses using the responses from the 294 students who completed both pre- and post-surveys. After removing responses with missing values, the sample sizes per construct were 245 (confidence), 247 (interest), and 250 (CRC).
First, descriptive statistics and paired-sample t-tests were used to understand students’ attitudes before and after they learned the curriculum (RQ1). These tests compared the mean differences in students’ attitudes (confidence, interest, and CRC) and provided an initial understanding of students’ attitude changes. Second, we compared attitudes among students of different gender, race/ethnicity, and grade level to answer the second research question (RQ2). One-way ANOVA tests were used to examine whether students’ confidence, interest, and CRC perceptions differed significantly by gender and by race/ethnicity. For grade comparison, unpaired-sample t-tests were applied to examine the differences between Grade 7 and Grade 8, excluding the single Grade 6 response (\(n=1\)). For significant ANOVA results, post hoc analyses with a Bonferroni adjustment were conducted for pairwise comparisons to identify the source of the differences. Third, based on the ANOVA and post hoc results at the construct level, responses to specific items in the selected construct(s) were analyzed to gain a further understanding of the differences. All quantitative analyses were conducted in RStudio.
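These tests can be expressed in base R roughly as follows; the data frame `d` and its columns (pre_score, post_score, gender, race_group, grade) are hypothetical composite scores and grouping variables, not the study’s actual variable names.

```r
# RQ1: paired-sample t-test comparing pre- and post-survey construct scores
t.test(d$post_score, d$pre_score, paired = TRUE)

# RQ2: one-way ANOVA by gender and by race/ethnicity
summary(aov(post_score ~ gender, data = d))
summary(aov(post_score ~ race_group, data = d))

# RQ2: unpaired-sample t-test comparing Grade 7 and Grade 8 (Grade 6 excluded)
t.test(post_score ~ grade, data = subset(d, grade %in% c(7, 8)))

# Post hoc pairwise comparisons with Bonferroni adjustment for significant ANOVAs
pairwise.t.test(d$post_score, d$race_group, p.adjust.method = "bonferroni")
```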
When analyzing race/ethnicity differences, we reduced the race/ethnicity categories from 10 (Table 2) to 6 because of the low number of responses in some categories. The categories used in the analysis were Asian all (\(n=99\)), Southeast Asian (\(n=82\)), Black/African American (\(n=20\)), Hispanic/Latino (\(n=50\)), White/Caucasian (\(n=69\)), and Multiracial (\(n=41\)). The American Indian and Alaska Native category was removed because of its low number of responses (\(n=3\)). The Native Hawaiian/Pacific Islander response (\(n=1\)) was combined into the Asian category, which aligns with the student demographic categories used in New York State [35]. Accordingly, East Asian, South Asian, Southeast Asian, and Native Hawaiian/Pacific Islander were combined into one group (Asian all). Meanwhile, Southeast Asian students were also analyzed as an individual group because they represent a significant population in one school district serving Southeast Asian refugees [46]. To maintain consistency, we re-categorized students in the multiracial group to align them with the other five categories. For instance, students who chose only East Asian and South Asian were reclassified as “Asian all” instead of “Multiracial.” The need to remove and combine responses because of low sample sizes represents a limitation of this study.
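A minimal R sketch of this recoding is shown below; the raw category labels and column names are assumptions about how the responses were coded, and the finer re-categorization of multiracial responses is omitted for brevity.

```r
library(dplyr)

# Collapse the original race/ethnicity categories into the six analysis groups;
# the string labels here are assumptions about how the raw responses were coded
d <- d %>%
  filter(race_ethnicity != "American Indian and Alaska Native") %>%  # n = 3, removed
  mutate(race_group = case_when(
    race_ethnicity %in% c("East Asian", "South Asian", "Southeast Asian",
                          "Native Hawaiian/Pacific Islander") ~ "Asian all",
    race_ethnicity == "Black/African American" ~ "Black/African American",
    race_ethnicity == "Hispanic/Latino"        ~ "Hispanic/Latino",
    race_ethnicity == "White/Caucasian"        ~ "White/Caucasian",
    TRUE                                       ~ "Multiracial"  # remaining multiracial responses
  ))

# Southeast Asian students are additionally analyzed as their own (overlapping) group
southeast_asian <- filter(d, race_ethnicity == "Southeast Asian")
```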
Analysis of Open-Ended Question Responses. To answer the third research question (RQ3), we conducted a thematic analysis of students’ responses to the three open-ended questions. The analysis of the qualitative data involved two steps. First, we performed thematic coding on the responses from the 294 students. Two researchers coded the responses separately and developed initial codes. The two researchers then worked together to refine the codes, merging similar codes and identifying new ones. For the questions about students’ favorite and least favorite aspects of the class, 17 independent codes were generated for the favorite part and another 16 codes for the least favorite part. These codes were further discussed with a third researcher to generate emerging themes inductively [11]. Eventually, we identified five major themes for both students’ favorite and least favorite aspects of the class. Tables 3 and 4 present the hierarchy of the codes and the emerging themes.
The theme “Coding Experience” encompasses student responses related to their general coding experience, while “App Creation” is specific to students’ experience in app development. “Classroom Instruction” summarizes responses pertaining to the learning activities, topics introduced, and tools used in the class. “Classroom Community” focuses on the interactions and collaboration among students and the teacher within the class. “Learning Opportunity” highlights students’ comments regarding the types of learning opportunities provided in the class. Additionally, “Lack of Confidence” captures students’ comments regarding their lack of confidence in coding or creating apps.
Second, we sorted the codes and themes by gender, grade, and race/ethnicity for cross-group comparisons. Although the American Indian and Alaska Native group (\(n=3\)) was excluded from the quantitative analyses, we acknowledge that data from groups with small sample sizes are important and valuable [47]. Therefore, open-ended question responses from this group were examined and will be discussed.
Analysis of Student Apps. To answer our research question regarding students’ capacity to make community-serving apps (RQ4), student apps were analyzed in terms of the topics addressed, app type, and complexity. The topics of student apps were categorized based on the purpose of the apps, using the creators’ descriptions of the apps’ purposes and their functionality. Apps were also examined in terms of whether they addressed issues or topics connected to students, their families, and their communities, and whether they positioned students as change agents for their communities or society.
Student apps were classified into four types: informational, utility (e.g., survey apps), game, or multiple. An informational app provides the user with information. A utility app serves as a tool, such as a quiz that tests the user’s knowledge with defined answers, a survey that gathers information, or a service app that provides a service for those in need. Game apps provide recreation. Apps combining more than one of these types were coded as multiple.
To measure the complexity of student apps, we developed a rubric adapted from Sherman et al.’s work [45] and aligned with the project curriculum’s learning objectives. The rubric focused on basic App Lab functionality and the CS concepts introduced in the curriculum. It examined to what extent the apps addressed five CS concepts (events, variables, conditionals, iteration, and data storage as an optional component) and user interface (UI) design (e.g., using images, buttons, and user inputs). The apps were scored on a 1–3 scale per element (1 = beginner, 2 = proficient, 3 = advanced). Apps were then classified as beginner (total score of 1–5 points), proficient (6–10 points), or advanced (11–15 points). Apps with data storage features gained an extra 3 points in the complexity score.
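A minimal R sketch of this scoring is shown below; the per-element score columns and the logical data_storage flag are hypothetical, and it assumes the level classification is applied to the base score before the data storage bonus.

```r
# Base score: sum of the five scored elements (events, variables, conditionals,
# iteration, UI design), each rated 1-3
apps$base_score <- with(apps, events + variables + conditionals + iteration + ui_design)

# Classify apps by total score: 1-5 beginner, 6-10 proficient, 11-15 advanced
apps$level <- cut(apps$base_score, breaks = c(0, 5, 10, 15),
                  labels = c("beginner", "proficient", "advanced"))

# Apps implementing the optional data storage component gain an extra 3 points
apps$complexity <- apps$base_score + ifelse(apps$data_storage, 3, 0)
```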