Background
This study is situated in the overlap between the 2022 administration of the National Assessment of Educational Progress (NAEP) and the COVID-19 pandemic. This overlap made NAEP 2022 useful as a nationally representative, large-scale dataset of students who coincidentally underwent mass remote online learning.
Purpose
The purpose of the present study was to utilize a temporally distinct dataset to investigate contextual factors related to remote learning and thereby inform remote learning more broadly in a post-pandemic world.
Methods
The study used a conceptual framework of physical and psychological resources and stressors to help select NAEP variables and present findings. A sample of 36,200 students who experienced remote learning was analyzed using hierarchical linear modeling, chosen due to the nested structure of NAEP data.
Main findings
One main finding was that computing hardware and physical school supplies were key physical resources (when available) or stressors (when lacking) for student achievement. Another finding was that self-confidence and self-awareness in mathematics knowledge and skills were major psychological resources. Additionally, the study provided evidence about the potential drawback of unfettered, unsupervised internet access in remote learning environments.
Conclusion
The study concluded that provisioning students with internet access and digital devices for remote learning, as well as building their confidence in remote learning technology, is educationally important.
Implications
The implications identify actionable factors to support remote learning, increase remote learning equity, and reduce score gaps for students from economically disadvantaged backgrounds.
Introduction
The National Assessment of Educational Progress (NAEP) is the largest continuing, nationally representative assessment of what students in the United States know and can do. Since 1969, NAEP has typically been administered every two years to a sample of students in grades 4, 8, and 12 (Beaton et al., 2011). Mathematics and reading are the two principal content areas assessed. Additionally, contextual information is collected from participating students, teachers, and schools (e.g., ethnicity, teacher practices, schools’ conditions). NAEP results are released as aggregated estimates of student performance in each subject for various groups (e.g., gender, socioeconomic status, regional location) in a report named “The Nation’s Report Card” (Beaton et al., 2011). NAEP is administered by the National Center for Education Statistics, within the Institute of Education Sciences of the U.S. Department of Education.
There was a general decline in performance on NAEP 2022 compared to NAEP 2019. Grade 4 mathematics scores decreased in 43 states, with an average national decline of 5 points, and most states declined in other domains as well. Only 10 states avoided a significant change in any scoring domain (e.g., Illinois grade 4 mathematics) (NCES, 2022a). Importantly, these declines overlapped with the COVID-19 pandemic, a time when learners were thrust into obligatory remote online education. This timing made NAEP 2022 useful as a nationally representative, large-scale dataset of students who coincidentally underwent mass remote online learning.
This study’s novel contribution relative to existing literature
The timing of NAEP 2022 underpins the necessity of this study. The temporally distinct NAEP 2022 provides affordances over prior NAEP administrations through the inclusion of new survey items focused on remote teaching and learning. Importantly, these remote learning items will carry forward into future NAEP years. Therefore, the present study may serve as an initial bookend for longer-term research on the topic.
It is important to acknowledge that NAEP 2022 is not the only large-scale assessment to provide insights on remote learning. For example, the Program for International Student Assessment (PISA) 2022 reported that 44% of students in the United States (U.S.) had problems understanding school assignments while remote learning (OECD, 2023). Moreover, 28% of U.S. students experienced difficulty finding someone to help with schoolwork (OECD, 2023). For the purposes of this study, NAEP 2022 offers several affordances over PISA. For example, NAEP surveys both primary and secondary levels while PISA is limited to 15-year-old students. This is relevant because remote learning may have longer-lasting effects on younger learners than on those nearing the end of compulsory education. Furthermore, NAEP allows the authors to revisit the topic when the grade 4 cohort reaches grade 8 and is assessed again in NAEP 2026, though that grade 8 cohort will be a nationally representative sample consisting of different individuals. In addition, NAEP has a larger sample of students from the U.S. (~ 100,000) compared to PISA (~ 4,800), which allows for research methods that investigate multiple survey items or constructs in unison (Burg et al., 2024).
A different large-scale assessment is the Northwest Evaluation Association (NWEA) MAP Growth. However, NWEA MAP is an interim assessment designed to measure growth, set goals, and plan near-term instruction (Meyer & Dahlin, 2022). In addition, NWEA MAP is an opt-in assessment, with schools and districts choosing to partner with NWEA, which introduces some selection bias (Goldhaber et al., 2022). One more example is the Responses to Educational Disruptions Survey (REDS), a one-time data collection on the impact of the COVID-19 pandemic on grade 8 students (Meinck et al., 2022). Unfortunately, REDS did not assess primary-level students, nor did it include the U.S. among the 12 nations surveyed. For these reasons, REDS did not meet the needs of this study.
To this end, the purpose of the present study was to utilize the temporally distinct NAEP 2022 dataset (i.e., mass remote learning during the pandemic) to investigate contextual factors related to remote learning and thereby inform remote learning more broadly in a post-pandemic world. The study aims to provide educators with actionable factors to support remote learning, increase remote learning equity, and reduce score gaps for students from economically disadvantaged backgrounds. Such actionable factors help make remote education a net positive experience rather than a contributor to further inequity (Fahle et al., 2023; Kuhfeld et al., 2023).
Theoretical framework & related research
As remote learning became more common, scholars took an interest in applying existing theories of teaching and learning, largely developed for in-person modalities, to newer online modalities (Greenhow et al., 2022). As this scholarly development took place, scholars learned that some theories still fit well when mediated through online, remote modalities. For example, theories on communities of practice (Wenger, 1998) and social presence (Kreijns et al., 2013; Rourke et al., 1999) found an audience with educators of both in-person and online modalities. Other theories, such as self-regulated learning (Zimmerman & Schunk, 2011) and learner engagement (e.g., Chi & Wylie, 2014), took on new meaning in remote contexts where asynchronous instruction reigned and teachers lacked a physical presence.
With the unexpected COVID-19 pandemic, educators and learners were suddenly placed into remote education environments that varied in readiness to support teaching and learning. Physical resources were stressed, such as schools struggling to maintain students’ educational technology, or students losing a dedicated educational space to a guardian working from home. Likewise, psychological resources were stressed, such as social disconnection from learning communities or digital distraction in the home taxing self-regulation.
The rationale of this study is to investigate how the resources and stressors of pandemic-induced remote learning contexts were associated with academic achievement. This is important to study because the results may help make education systems more robust to future mass disruptions. Furthermore, this study can serve as an initial step towards longer-term research on the topic across multiple NAEP years.
What large-scale assessments say about contextual factors
Research has consistently found relationships between learning context and academic achievement. Examples include socioeconomic status (Caponera & Losito, 2016; Wang et al., 2023), sociocultural background (e.g., motivation in collectivist versus individualist cultures) (Wang et al., 2020), classroom-level factors (e.g., instruction, feedback, behavior climate) (Teodorović, 2011), school-level factors (e.g., interpersonal relationships, sense of belonging) (Zysberg & Schwabsky, 2021), and community-level factors such as social capital (Israel et al., 2001). Evidence from large-scale assessments has continued to support the importance of contextual factors in education. For example, an economically disadvantaged background is consistently associated with lower academic achievement (Wang et al., 2023). Multiple studies using the Stanford Education Data Archive (SEDA) investigated associations between academic achievement and measures of racial, ethnic, and economic disparities from 2009 to 2019. Reardon et al. (2022) reported that higher racial segregation was associated with larger achievement gaps, because economic segregation concentrates minority students into high-poverty schools, which tend to be less effective. A study by Matheny et al. (2023) reported that academic achievement was associated with within-district segregation by race and socioeconomic status, and with access to certified teachers. Furthermore, achievement gaps grew between wealth levels and varied by race (e.g., the Black to White gap grew while the Hispanic to White gap decreased) (Matheny et al., 2023). Schult et al. (2022) investigated pandemic disruptions by analyzing reading and mathematics scores on annual large-scale assessments for grade 5 students in Germany. Scores declined by 0.03 to 0.09 standard deviations in certain skills, and lower socio-cultural capital was associated with larger learning loss (Schult et al., 2022).
How pandemic-related contextual factors carry over into remote learning
The association between academic achievement and contextual characteristics is well studied within traditional, in-person education. However, the association is also found in pandemic-induced remote learning contexts. Kuhfeld et al. (2023) investigated how the COVID-19 pandemic impacted academic development, using NWEA MAP reading scores to measure changes in the reading development of 5 million U.S. students in grades 3 to 8. The analysis found that 2021 reading scores declined an average of 0.09 to 0.17 standard deviations compared to 2019. The researchers concluded that pandemic disruption to academic development most impacted students of color attending high-poverty schools in grades 3 to 5 (Kuhfeld et al., 2023). A study by Fahle et al. (2023) investigated the mechanisms that led to performance declines and tried to predict learning recovery based on historical data. The primary finding was that, among school districts, high-minority and high-poverty districts experienced greater mathematics learning loss per week of remote/online instruction (Fahle et al., 2023). However, within districts, the degree of learning loss was similar regardless of economic background or race (Fahle et al., 2023). The authors attributed learning loss to district or community characteristics such as broadband internet access, social and economic disruptions, and level of trust in government institutions. Fahle et al. (2023) did not find any significant student-level factors. A study by Kennedy et al. (2022) reported variation in remote learning quality attributed to access to, and confidence in, remote learning technology across 7 countries from REDS (i.e., Denmark, Ethiopia, Kenya, Russian Federation, Slovenia, United Arab Emirates, and Uzbekistan). These studies provide evidence of the importance of context and demographics for understanding academic performance related to the pandemic. The present study extends this work with an analysis of the temporally unique NAEP 2022 dataset.
Conceptual framework
The conceptual framework for this study ties together literature relevant to remote learning contexts by operationalizing the theories through a framework of pandemic-related resources and stressors. Resources and stressors were used to select NAEP variables suspected to influence NAEP participants’ remote education. At the broadest level, resources were factors expected to have a positive influence on academic achievement while stressors were expected to have a negative influence.
This lens of resources and stressors has precedent in the broader psychological literature. Votruba-Drzal et al. (2021) conducted a study of 17,600 early primary students, linking Early Childhood Longitudinal Study (ECLS) data with geospatial data. The researchers used community and family resources (e.g., libraries, museums, parks) and stressors (e.g., violent crime, air pollution) to explain how family income gaps are associated with academic achievement. A study by Perez et al. (2009) investigated the impact of protective resources and risk factors on academic outcomes of undocumented immigrant students in the United States. In that study, protective resources ranged from personal (e.g., multilingualism) to environmental (e.g., multi-parent household). Risk factors also ranged from personal (e.g., a diminished sense of belonging due to immigration status) to environmental (e.g., employment status during high school). A study by Miller et al. (2019) investigated poverty and academic achievement across the urban-rural continuum using ECLS-K data from approximately 3,000 kindergarten students. The researchers selected covariates for the analysis based on measures of community resources and stressors, amongst other characteristics.
For the present study, the resources and stressors were classified as either psychological, physical, or both. The physical classification is grounded in research suggesting a physically well-resourced remote learning environment can support equitable access to educational technology (Greenhow et al., 2022). This can help reduce effects attributed to digital divide theory, which suggests schools unable to provide physical resources contribute to academic inequality. For instance, providing assistive technologies (e.g., adaptive keyboard, screen reader) can improve digital accessibility to academic content (Crossland et al., 2018). Furthermore, a well-resourced environment may provide tools of creation (e.g., video editing software, graphics software) to allow learners to be both producers and consumers of educational resources, promoting a more participatory form of remote learning (Dolan, 2016).
The psychological classification is grounded in research suggesting that there is more to achievement than what a school can physically provide, as several psychological constructs are associated with academic outcomes. Schools can assist students in developing psychological resources that may influence their remote learning experience. For instance, social learning theory (Bandura, 1977) asserts that greater self-efficacy and self-regulation/discipline are associated with greater learning. Likewise, higher mathematics enjoyment is associated with higher mathematics achievement (Pinxten et al., 2014). In addition, students who take on more complex mathematics problems achieve better on future complex (i.e., multi-step) tasks (Choi & Hannafin, 1997). The psychological classification is further grounded in research suggesting that a well-resourced remote learning environment fosters digital communities of practice (Wenger, 1998) and collaborative learning (O’Donnell & O’Kelly, 1994), and supports social learning mediated through educational technology (Greenhow et al., 2022). These environmental factors support student belonging, socio-emotional well-being, and ultimately academic achievement (Greenhow et al., 2022).
Table 1 shows the conceptual alignment between the resource/stressor classifications and relevant literature that guided initial variable selection. Studies are cited as evidence of the type of measures that capture the association between mathematics achievement and what is found in the hands and minds of learners.
Table 1. Conceptual alignment guiding variable selection
| Domain | Resource/Stressor | Examples |
|---|---|---|
| Physical | Personal | Assistive and adaptive technology, digital accessibility (Crossland et al., 2018). Educational technology home access, skills to utilize it (Dolan, 2016). Equitable educational technology access (Greenhow et al., 2022). |
| Physical | Environmental | Equitable provisioning of educational technologies (Greenhow et al., 2022). School-provided participatory/interactive tools (e.g., video editing software, graphics software) (Dolan, 2016). |
| Psychological | Personal | Academic self-discipline (Bandura, 1977). Complex problem enjoyment (Choi & Hannafin, 1997). Math confidence, mathematics interest/enjoyment (Pinxten et al., 2014). Sense of belonging (Perez et al., 2009). |
| Psychological | Environmental | Collaborative learning (O’Donnell & O’Kelly, 1994). Communities of practice (Wenger, 1998). Psycho-social distance (Greenhow et al., 2022). Social learning mediated through edtech (Greenhow et al., 2022). |
| Physical/Psychological | Personal | Multi-parent household (Perez et al., 2009). People around to ask for help (Göl et al., 2023). |
| Physical/Psychological | Environmental | Cultural resources (e.g., museums, libraries, parks, performing arts) and social service resources (e.g., food banks, shelters, family services) (Miller et al., 2019). Violent crime, pollution, poverty, unemployment (Miller et al., 2019; Votruba-Drzal et al., 2021). |
A representative NAEP item classified as a physical item asked students about the availability of certain items in their remote schooling environment (e.g., high-speed internet; computer; paper & pencil; quiet place to work, etc.). A representative NAEP item classified as a psychological item asked students the degree to which they look forward to math class (e.g., Not at all like me, Somewhat like me, Exactly like me, etc.). The exact question prompts are shown in Appendix C.
With this classification framework in place, initial variable selection was conducted using the flowchart shown in Fig. 1. First, NAEP variable descriptions were reviewed for fit with the framework. If a variable fit, it was checked against the list of new remote teaching and learning items added for NAEP 2022. Because these new items did not exist in prior administrations of NAEP, they were included in this study by default. If a candidate variable was not a new remote item, its Cohen’s d effect size from NAEP 2019 was checked; if the effect size was ≥ 0.20, the variable was included in the initial stage of the study. Note that new items on NAEP 2022 did not have an effect size value from NAEP 2019.
Fig. 1. Flowchart for candidate variable selection
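The selection logic of Fig. 1 can be summarized as a simple predicate. The sketch below is a minimal R rendering of that logic; the three inputs (`fits_framework`, `is_new_remote`, `cohens_d_2019`) are hypothetical stand-ins for metadata the authors compiled while reviewing variable descriptions, not fields in the NAEP data itself.

```r
# Minimal sketch of the Fig. 1 selection logic (inputs are hypothetical):
#   fits_framework - does the variable description fit the framework?
#   is_new_remote  - is it a new NAEP 2022 remote teaching/learning item?
#   cohens_d_2019  - Cohen's d from NAEP 2019 (NA for new 2022 items)
select_candidate <- function(fits_framework, is_new_remote, cohens_d_2019) {
  if (!fits_framework) return(FALSE)   # step 1: must fit the framework
  if (is_new_remote)   return(TRUE)    # step 2: new remote items kept by default
  !is.na(cohens_d_2019) && cohens_d_2019 >= 0.20  # step 3: effect-size screen
}

select_candidate(TRUE, FALSE, 0.25)  # TRUE: carried-over item with d >= 0.20
select_candidate(TRUE, TRUE,  NA)    # TRUE: new NAEP 2022 remote item
```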
Research questions
Two research questions were asked in this study, differing only in the domain of focus.
R.Q. #1–What physical resources and stressors related to remote learning are associated with the variation in grade 4 mathematics performance?
R.Q. #2–What psychological resources and stressors related to remote learning are associated with the variation in grade 4 mathematics performance?
Methods
Data preparation & special considerations for NAEP analyses
The initial sample comprised all NAEP grade 4 mathematics participants (n = 116,200) and was then subset by two criteria. First, students were excluded (n = 11,000) if their teachers were new to the profession or worked at a different school in the prior academic year (i.e., 2020–2021). This mitigates confounds related to teachers’ experiences in other schools or being new to the profession. Second, because the study focused on remote learning, additional students were excluded if they reported no participation in remote schooling in 2020–2021 (n = 20,500) or did not remember (n = 23,800). Additional cases were excluded for missing data, resulting in the sample reported in the descriptive statistics in Appendix A (n = 48,700). Finally, the statistical modeling software further reduced the sample to the final analytic sample (n = 36,200) through listwise deletion of cases with missingness (n = 13,600) (see Table 14). Only public school data were used because the NAEP 2022 private school results did not meet minimum participation rate guidelines for reporting (NCES, 2025). Grade 4 was chosen to potentially set up future research when this age group reaches grade 8 during NAEP 2026. Mathematics was chosen as the outcome measure because recent studies on post-pandemic learning recovery suggest learning loss was highest in mathematics (Fahle et al., 2024). Furthermore, mathematics provides a more equitable assessment measure compared to the specialized, technical language of science (Andon et al., 2014) or English language-based reading, which is typically more challenging for English language learners (Schnepf, 2007).
One special consideration for NAEP analyses is sampling weights. NAEP uses a stratified cluster sampling procedure, and weights account for students’ unequal probabilities of selection into the sample. The use of student and school weights prevents the over- or underrepresentation of certain subgroups, which would otherwise lead to incorrect estimates (Beaton et al., 2011). There is debate on the optimal use of sampling weights for multilevel modeling with large-scale assessments. Mang et al. (2021) recommended using level-2 weights alone (e.g., school weights) because it is the simplest approach to implement among those producing the least biased parameter estimates. Atasever et al. (2025) also recommended using level-2 weights for the most precise estimates. However, Atasever et al. (2025) noted that using both level-1 and level-2 weights together is acceptable as well, though this may slightly underestimate variance components. In cases where level-1 weights are used, recommendations for scaling level-1 weights range from inconclusive (Mang et al., 2021) to not recommended (Atasever et al., 2025). Using a NAEP-specific example, Nguyen and Kelley (2018) note, “For surveys such as NAEP, there have not been any widely used scaling methods in the literature, so the EdSurvey package does not do any weight transformation” (p. 3). Therefore, our analysis’s use of level-1 and level-2 weights, without level-1 rescaling, aligns with Nguyen and Kelley (2018) and Atasever et al. (2025).
Another special consideration for NAEP analyses is plausible values. NAEP composite scores are derived from plausible values (Jewsbury et al., 2024). While it is advantageous to administer many questions to each student to improve the accuracy of the measurement (Beaton et al., 2011), NAEP’s balanced incomplete block (BIB) design spreads out many items across many respondents, reducing the assessment load on any given individual while still presenting a large range of items across the entire testing population (Beaton et al., 2011; Jewsbury et al., 2024). Additional technical details are found in Appendix B.
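In general terms, an analysis with plausible values is run once per plausible value and the results are pooled. The following is a sketch of the standard combining rules (our summary of the general approach; the study’s specifics are in Appendix B), where $\hat{\theta}_m$ is the estimate computed with the $m$-th of $M$ plausible values and $U_m$ its sampling variance:

```latex
% Pooling a statistic over M plausible values (standard combining rules):
\bar{\theta} = \frac{1}{M}\sum_{m=1}^{M}\hat{\theta}_m,
\qquad
V = \underbrace{\frac{1}{M}\sum_{m=1}^{M} U_m}_{\text{within-imputation}}
  + \left(1+\frac{1}{M}\right)
    \underbrace{\frac{1}{M-1}\sum_{m=1}^{M}
      \bigl(\hat{\theta}_m-\bar{\theta}\bigr)^2}_{\text{between-imputation}}
```

The total variance $V$ thus reflects both ordinary sampling variance and the measurement uncertainty captured by disagreement among the plausible values.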
Software used
The R programming language and the R package “EdSurvey” (v. 3.1.0) were used for data processing and analysis (Bailey et al., 2023). EdSurvey works natively with many large-scale assessments in education, including NAEP, as it handles the complex sample designs, weights, and plausible values of these datasets (Liao et al., 2024; Zhang et al., 2024). EdSurvey estimates the models using maximum likelihood estimation (Bailey & Cohen, 2019). The EdSurvey function “mixed.sdf” was used to estimate hierarchical models with random intercepts for a grouping variable at level 2 (i.e., school IDs). This function uses cluster-robust standard errors (i.e., CR0) (Liang & Zeger, 1986) based on NAEP sampling strata and primary sampling units (PSU). The EdSurvey multilevel function uses the NAEP-provided sampling weights for students and schools (Liao et al., 2024; Zhang et al., 2024), accounting for the unequal probability of participant selection (Zhang et al., 2024). No weight transformations were applied (Nguyen & Kelley, 2018, p. 3). The EdSurvey multilevel function allowed us to specify an initial two-level null model to see how variance is partitioned at the student and school levels, with EdSurvey handling the variance partition (Leyland & Groenewegen, 2020; Nguyen & Kelley, 2018). Predictors were later added to the null model to explain variance in math scores.
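As a concrete illustration, here is a minimal sketch of the null-model call, assuming a licensed NAEP file has been read locally. The file path and the school-weight name (`school_wt`) are placeholders to be checked against the restricted-use codebook; `composite`, `scrpsu`, and `origwt` are the composite score, school/cluster identifier, and student weight names used in EdSurvey's NAEP examples, and the argument names reflect our reading of the EdSurvey 3.x documentation.

```r
library(EdSurvey)

# Read a NAEP data file (the path is a placeholder for the licensed file).
sdf <- readNAEP("path/to/naep2022_math_g4.dat")

# Null (intercept-only) two-level model: 'composite' is the mathematics
# composite (plausible values); (1 | scrpsu) requests random school
# intercepts. 'origwt' is the NAEP student weight; replace 'school_wt'
# with the school-level weight named in the codebook.
m0 <- mixed.sdf(composite ~ 1 + (1 | scrpsu),
                data = sdf,
                weightVars = c("origwt", "school_wt"))
summary(m0)  # reports variance components at the student and school levels
```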
The methodological approach: two-stage modeling
This study used a two-stage methodology. Stage 1 was a stepwise sequence of exploratory linear regression models used to investigate which candidate variables to include in a later hierarchical linear model (HLM) (Braun et al., 2009). Stage 2 used an HLM to investigate the association between a set of explanatory variables and the outcome variable.
Analysis - Stage 1
The conceptual framework of resources and stressors was used to select an initial list of 41 NAEP variables that represented relevant resources and stressors in the lives of NAEP 2022 participants (see Appendix D). This list of 41 candidate variables was reduced to a final list of explanatory variables used in Stage 2. This was done with a stepwise sequence of exploratory linear regression models, a method used in Braun et al. (2009). The stepwise sequence starts by entering blocks of explanatory variables into a general linear model with mathematics performance as the outcome variable. Blocks of variables were added in the following order: (1) control variables; (2) remote learning; (3) index measures; and (4) teacher and school factors. With each new block of variables, the statistically significant variables (p < 0.05) were retained for the subsequent steps. This procedure yielded the final set of covariates used in the Stage 2 statistical modeling (see Table 2). Multiple level-1 variables (i.e., student/teacher-level) were promoted into Stage 2, but none of the level-2 variables (i.e., school-level) were promoted.
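To make the procedure concrete, below is a schematic sketch of the blockwise screening, reusing the `sdf` object from the software section. All predictor names and block contents are illustrative stand-ins for the Appendix D items, and the rule for retaining categorical variables is deliberately simplified relative to the authors' procedure.

```r
# Schematic of the Stage 1 blockwise screening (illustrative variable names).
blocks <- list(
  controls = c("dsex", "sdracem", "slunch"),
  remote   = c("remote_internet", "remote_device", "remote_quiet"),
  indices  = c("math_confidence_idx", "self_discipline_idx"),
  school   = c("building_repair", "teacher_supplies")
)

retained <- character(0)
for (blk in blocks) {
  f     <- reformulate(c(retained, blk), response = "composite")
  fit   <- lm.sdf(f, data = sdf)    # EdSurvey handles weights/plausible values
  coefs <- summary(fit)$coefmat     # coefficient table; last column = p-values
  sig   <- rownames(coefs)[coefs[, ncol(coefs)] < 0.05]
  # Simplified retention: keep a variable if any of its terms is significant.
  retained <- union(retained,
                    blk[vapply(blk, function(v) any(startsWith(sig, v)),
                               logical(1))])
}
retained  # covariates promoted to Stage 2 (cf. Table 2)
```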
Despite not having a school-level explanatory variable, a 2-level analysis was still used rather than a simpler 1-level analysis. The rationale for this decision is twofold. First, our primary focus was on school-level covariates explicitly centered on pandemic impact (e.g., survey items on the COVID questionnaire component of NAEP); being too inclusive of school-level indicators risked widening the intended scope of the study. Second, the modeling software uses the school grouping IDs at level 2 to allow student-level estimates to vary at the school level, which allows us to report the variance found at the school level (see Table 3). While other NAEP school-level variables might help explain this variance, tracking them down would have expanded the scope beyond the study’s framework. One explanation for why the school-level indicators dropped off may be the presence of related student-level indicators offering more granular measurement. Take, for instance, a school-reported measure of digital device access versus a student-reported one. While the school may report providing digital devices or an internet hotspot to students, the student may report a different level of actual access and usage due to technical support issues around installation, maintenance, or operation of educational technologies (Nawaz & Khan, 2012). Even for technically competent students, general wear and tear on digital devices may result in breakdown rates that lower the effective rate of student access compared to initial school offerings (Rashed et al., 2021).
Table 2. Final list of explanatory variables
| # | Description | Domain | Level |
|---|---|---|---|
| 1 | Gender | Demographic | Student |
| 2 | Student’s race at time of sampling | Demographic | Student |
| 3 | National School Lunch Program (NSLP) eligibility | Demographic | Student |
| 4 | Remote: Access to high-speed internet | Physical | Student |
| 5 | Remote: Desktop, laptop or tablet | Physical | Student |
| 6 | Remote: A quiet place to work | Physical | Student |
| 7 | Remote math: Recognize when don’t understand | Psychological | Student |
| 8 | Remote math: Ask for help when you need it | Both | Student |
| 9 | Remote math: Find resource online if don’t understand | Both | Student |
| 10 | Available while remote: School supplies (e.g., paper and pencil) | Physical | Student |
| 11 | While remote, how often did someone help you with your schoolwork? | Both | Student |
| 12 | Academic Self-Discipline Index | Psychological | Student |
| 13 | Enjoyment of Complex Problems Index | Psychological | Student |
| 14 | Confidence in Math Knowledge & Skills Index | Psychological | Student |
| 15 | Mathematics Interest/Enjoyment Index | Psychological | Student |
| 16 | How severe a problem: School building needs significant repair | Physical | Teacher |
| 17 | How severe a problem: Teachers don’t have adequate supplies | Physical | Teacher |
| 18 | This year how often provided parents w/home activities | Physical | Teacher |
| 19 | This year how often participated in PD regarding distance learning | Physical | Teacher |
| 20 | I am satisfied with being a teacher at this school | Psychological | Teacher |
| 21 | My work inspires me | Psychological | Teacher |
A critique of stepwise regression is that the resulting model is assumed to be the single best fit, leaving other combinations of variables untested (Whittingham et al., 2006). However, our study mitigates this concern with a theory-driven, a priori selection of initial variables in consultation with the aforementioned framework; only then was a statistical test (i.e., stepwise regression) used to reduce dimensionality. In addition, our Stage 1 analysis paired the stepwise process with an additional testing round entering all Stage 1 variables at once in a single block, which yielded the same subset of variables to promote to Stage 2.
Analysis - Stage 2
The purpose of Stage 2 was to model the association between student-level NAEP mathematics scores and the selected explanatory variables. The hierarchical structure of NAEP, with students nested within schools, is a methodological match for multilevel modeling methods such as HLM (Braun et al., 2009; Raudenbush & Bryk, 2002). HLM allows modeling both student and school characteristics, providing covariate and variance estimates at each respective level (Beaton et al., 2011).
Statistical models: random-intercepts HLM
A random-intercepts variation of HLM was used in this study. A random-intercepts HLM allows the intercept to vary across groups (i.e., schools) rather than forcing all students to share a single average intercept. This accounts for school-level grouping effects and increases the usefulness of the model for predicting scores for a given student (Anderson, 2012). For example, the digital resources for remote learning at one school may differ greatly from those at another.
The analysis began with a null (unconditional) model (Model 0). The null model is an intercept-only model, with no explanatory variables and one grouping variable. The null model decomposes total variance into level 1 (i.e., students) and level 2 (i.e., schools), which helps assess how much variance depends on group membership (Anderson, 2012). Additionally, the null model establishes a baseline against which to compare subsequent models (Anderson, 2012). The subsequent model (Model 1) builds on the null model, entering all explanatory variables at once. Model 1 notation is shown in Eqs. 1 and 2.
Level 1:

$$Y_{ij} = \beta_{0j} + \sum_{k=1}^{K} \beta_{k} X_{kij} + r_{ij}, \qquad r_{ij} \sim N(0, \sigma^{2}) \tag{1}$$

Level 2:

$$\beta_{0j} = \gamma_{00} + u_{0j}, \qquad u_{0j} \sim N(0, \tau_{00}) \tag{2}$$

where $Y_{ij}$ is the mathematics composite score of student $i$ in school $j$, the $X_{kij}$ are the $K$ student- and teacher-level explanatory variables with fixed slopes $\beta_{k}$, $r_{ij}$ is the student-level residual, and $u_{0j}$ is the random school intercept.
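In EdSurvey terms, Model 1 is the null-model call from the software section plus the Table 2 covariates entered as fixed effects. The sketch below reuses the earlier assumptions (placeholder weight name, illustrative predictor names standing in for the Table 2 variables, with only a subset shown for brevity):

```r
# Model 1: random school intercepts plus student/teacher-level fixed effects.
# Predictor names are illustrative stand-ins for the Table 2 variables.
m1 <- mixed.sdf(composite ~ dsex + sdracem + slunch +
                  remote_internet + remote_device + remote_quiet +
                  math_confidence_idx + (1 | scrpsu),
                data = sdf,
                weightVars = c("origwt", "school_wt"))
summary(m1)  # fixed effects (cf. Tables 5-8) and variance terms (cf. Table 3)
```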
Model 1 variance terms are reported in Table 3. In proportional terms, 77% of the variance in grade 4 math scores is found at the student level and 23% at the school level. While this model did not use a school-level explanatory variable (due to the variable selection process), with EdSurvey the variances are still partitioned and interpretable at both the student and school levels (Leyland & Groenewegen, 2020; Nguyen & Kelley, 2018).
Table 3. Variance terms for model 1
| Level | Group | Variance | Variance decomposition | Std. error | Std. dev. |
|---|---|---|---|---|---|
| 2 | School | 127.1 | 23% | 7.129 | 11.27 |
| 1 | Student | 426.3 | 77% | 5.848 | 20.65 |
School sample size: n = 4,640. Student sample size: n = 36,200. ICC = 0.230.
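The decomposition and ICC in Table 3 follow directly from the two variance components, as the short check below shows:

```r
# Reproducing Table 3's variance decomposition and ICC from its components.
v_school  <- 127.1
v_student <- 426.3
round(v_school / (v_school + v_student), 3)   # 0.23: ICC / school-level share
round(v_student / (v_school + v_student), 2)  # 0.77: student-level share
```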
Model 1 fixed effects, the primary results of the analysis, are reported in the Results section by domain (i.e., physical, psychological). Extended descriptions and results of the analysis are reported in Appendix B.
Results
The analysis sample reflected the greater NAEP sample
First, Table 4 reports that the analytic sample was indeed reflective of the greater NAEP sample, with subgroup percentages differing by 0 to 4 percentage points, aside from the characteristics used to filter the sample (i.e., remote schooling, teacher worked at the same school the prior year).
Table 4. Demographics comparison of full dataset versus sample
| Demographic | Response category | % full dataset (n = 117,000) | % in sample (n = 48,700) |
|---|---|---|---|
| Student attended remotely last year | Yes | 89% | 100% |
| | No | 11% | Excluded |
| Student’s teacher worked same school last year | Yes | 58% | 100% |
| | No | 20% | Excluded |
| | Don’t remember | 22% | Excluded |
| Gender | Female | 51% | 52% |
| | Male | 49% | 48% |
| Race/Ethnicity | White, not Hispanic | 47% | 49% |
| | African American, not Hispanic | 14% | 13% |
| | Hispanic of any race | 27% | 25% |
| | Asian, not Hispanic | 6% | 6% |
| | American Indian/Alaska Native | 1% | 1% |
| | Native Hawaiian/Pacific Islander | < 1% | < 1% |
| | > 1 race, not Hispanic | 5% | 5% |
| National School Lunch Program (NSLP) eligibility | Eligible | 48% | 45% |
| | Not eligible | 44% | 48% |
| | Info not available | 8% | 5% |
| Mathematics achievement | N/Aᵃ | 96% | 100% |
ᵃ Missingness of mathematics composite scores was 0% for the analytic sample because cases with valid mathematics outcomes were necessary for statistical modeling.
Understanding NAEP score variation
Some context is needed to interpret the practical significance of the statistically significant results reported in the next sections. First, NAEP grade 4 math scores range from 0 to 500. The overall NAEP 2022 grade 4 math average was 236, with an overall standard deviation of 33 points. The score at the 50th percentile of NAEP 2022 participants was 238. NAEP 2022 categorized scores into three proficiency levels by a minimum cut score: NAEP Basic (208), NAEP Proficient (238), and NAEP Advanced (268). Comparing NAEP 2022 to NAEP 2019, average grade 4 math scores decreased by 5 points, or 0.15 standard deviations.
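The point-to-standard-deviation conversions reported throughout the Results follow from this 33-point overall standard deviation, for example:

```r
# Converting reported score-point estimates into standard-deviation units,
# using the overall NAEP 2022 grade 4 mathematics SD of 33 points.
sd_naep <- 33
round(c(nslp = 8.79, supplies = -11.60, confidence = 20.70) / sd_naep, 2)
#>       nslp   supplies confidence
#>       0.27      -0.35       0.63
```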
Demographic results were typical
Table 5. Demographic variables: statistically significant coefficients from model 1
| Variable | Response level | Estimate | Std. error | 95% CI | Significance |
|---|---|---|---|---|---|
| Intercept | NA | 212.20 | – | – | |
| NSLP eligibility | Eligible | Reference | – | – | |
| | Not eligible | 8.79 | 0.53 | 7.76, 9.82 | * |
| | Info not available | 8.57 | 1.60 | 5.43, 11.7 | * |
| Gender | Male | Reference | – | – | |
| | Female | −6.57 | 0.41 | −7.38, −5.76 | * |
| Race | White, not Hispanic | Reference | – | – | |
| | African American, not Hispanic | −12.50 | 0.78 | −14.00, −11.00 | * |
| | Hispanic of any race | −8.20 | 0.72 | −9.60, −6.80 | * |
| | Asian, not Hispanic | 7.45 | 1.17 | 5.15, 9.74 | * |
| | American Indian/Alaska Native | −7.84 | 2.46 | −12.70, −3.01 | * |
| | Native Hawaiian/Pacific Islander | −11.90 | 3.50 | −18.80, −5.08 | * |
| | > 1 race, not Hispanic | −2.83 | 1.12 | −5.02, −0.63 | * |
| | Not collected | 15.30 | 4.11 | 7.21, 23.3 | * |
| | Unavailable | −19.90 | 8.35 | −36.20, −3.51 | * |
Grade 4 students’ mathematics scores varied by multiple demographic characteristics (see Table 5). In this study, economic disadvantage was defined by eligibility for the National School Lunch Program (NSLP). The NSLP is a federally assisted meal program that provides low-cost or free lunches to children each school day (U.S. Department of Agriculture, 2023). Eligibility depends on either participation in federal assistance programs (e.g., the Supplemental Nutrition Assistance Program) or on having a recognized status: migrant, foster, homeless, or runaway (U.S. Department of Agriculture, 2023). Holding the other model covariates constant, NSLP-eligible students were associated with scores 8.79 points (0.27 standard deviations) lower than students who were ineligible for free or reduced-price lunches. Additional score differences were found with gender and race as well.
Physical domain: access to school supplies, computer hardware, and internet
In the physical domain, there were both expected and counterintuitive results (see Table 6). One expected result was that students who reported no access (i.e., No, not available) to physical school supplies (e.g., paper, pencils) were associated with mathematics composite scores 11.60 points (0.35 standard deviations) lower than students who had access all the time. Similarly, students without access (i.e., No, not available) to computing hardware (e.g., desktop, laptop, tablet) were associated with scores 6.87 points (0.21 standard deviations) lower than students who had access all the time.
Two results were counterintuitive. First, students who reported partially restricted access to the internet (i.e., Yes, some of the time) were associated with math scores 4.48 points (0.14 standard deviations) higher than those who always had access. Second, students who had partial access to a quiet place to work (i.e., Yes, some of the time) were associated with scores 7.01 points (0.21 standard deviations) higher than those who always had access.
Table 6. Physical variables: statistically significant coefficients from model 1
| Variable | Response level | Estimate | Std. error | 95% CI | Significance |
|---|---|---|---|---|---|
| Intercept | NA | 212.20 | – | – | |
| Available while remote: school supplies (e.g., paper, pencils, etc.) | Yes, all the time | Reference | – | – | |
| | Yes, some of the time | −3.10 | 0.75 | −4.58, −1.62 | * |
| | No, not available | −11.60 | 1.37 | −14.30, −8.90 | * |
| | I do not remember | −8.90 | 1.44 | −11.70, −6.08 | * |
| Available while remote: desktop, laptop, or tablet | Yes, all the time | Reference | – | – | |
| | Yes, some of the time | −5.38 | 0.59 | −6.54, −4.21 | * |
| | No, not available | −6.87 | 1.04 | −8.91, −4.84 | * |
| | I do not remember | −7.67 | 1.08 | −9.80, −5.54 | * |
| Available while remote: access to high-speed internet | Yes, all the time | Reference | – | – | |
| | Yes, some of the time | 4.48 | 0.48 | 3.54, 5.41 | * |
| Available while remote: a quiet place to work | Yes, all the time | Reference | – | – | |
| | Yes, some of the time | 7.01 | 0.49 | 6.05, 7.96 | * |
| | No, not available | 3.12 | 0.78 | 1.59, 4.64 | * |
| | I do not remember | −2.93 | 1.28 | −5.44, −0.42 | * |
| How severe a problem: Teachers don’t have adequate supplies | Not a problem | Reference | – | – | |
| | Serious problem | −4.68 | 1.52 | −7.66, −1.69 | * |
| This year how often provided parents with home activities | Never | Reference | – | – | |
| | Once or twice/week | −1.82 | 0.86 | −3.50, −0.13 | * |
Psychological domain: self-confidence and recognizing one's own understanding
In the psychological domain, the results aligned with prior research on the positive correlations between student beliefs and student academic performance (see Table 7). One result was students who reported higher levels of confidence in their math knowledge and skills (e.g., High) were associated with mathematics scores 20.70 points (0.63 standard deviations) higher than students with low confidence. Another result was that students who were able to recognize breakdowns in their own understanding (e.g., I probably can) were associated with scores 12.80 points (0.39 standard deviations) higher than those who could not (e.g., I definitely can’t).
Table 7. Psychological variables: statistically significant coefficients for model 1
| Variable | Response level | Estimate | Std. error | 95% CI | Significance |
|---|---|---|---|---|---|
| Intercept | NA | 212.20 | – | – | |
| Students’ confidence in math knowledge/skills index | Low | Reference | – | – | |
| | Moderate | 9.48 | 0.73 | 8.05, 10.90 | * |
| | High | 20.70 | 0.77 | 19.10, 22.20 | * |
| For remote class: can recognize when you don’t understand | I definitely can’t | Reference | – | – | |
| | I probably can’t | 7.03 | 1.31 | 4.46, 9.62 | * |
| | Maybe | 7.76 | 1.03 | 5.75, 9.78 | * |
| | I probably can | 12.80 | 1.08 | 10.70, 14.90 | * |
| | I definitely can | 11.50 | 1.02 | 9.52, 13.50 | * |
| Students’ academic self-discipline index | Low | Reference | – | – | |
| | Moderate | 3.43 | 0.75 | 1.96, 4.90 | * |
| | High | 6.20 | 0.80 | 4.64, 7.77 | * |
| Students’ enjoyment of complex problems index | Low | Reference | – | – | |
| | High | 2.27 | 0.63 | 1.05, 3.50 | * |
| Students’ interest/enjoyment in mathematics index | Low | Reference | – | – | |
| | High | 2.12 | 0.78 | 0.59, 3.65 | * |
| Teacher: I am satisfied with being a teacher at this school | Not at all like me | Reference | – | – | |
| | A little bit like me | 5.67 | 2.68 | 0.41, 10.90 | * |
Dual domain: access to help
Regarding the variables that fit both the physical and psychological domains, results were mostly as expected (see Table 8). One result was that students who could ask for help when needed while remote learning (e.g., I probably can) were associated with math composite scores 8.17 points (0.25 standard deviations) higher than students who could not ask for help. Another result was that students who reported a higher frequency of help while remote learning (e.g., Every day or almost) were associated with scores 6.70 points (0.20 standard deviations) lower than students who never needed to ask for help. Yet another result was that students who could find resources online while remote (e.g., I probably can) were associated with scores 3.80 points (0.12 standard deviations) higher than students who could not.
Table 8. Physical & psychological variables: statistically significant coefficients for Model 1
| Variable | Response level | Estimate | Std. error | 95% CI | Significance |
|---|---|---|---|---|---|
| Intercept | NA | 212.20 | – | – | |
| For remote class: can ask for help when needed | I definitely can’t | Reference | – | – | |
| | Maybe | 4.99 | 1.33 | 2.39, 7.59 | * |
| | I probably can | 8.17 | 1.31 | 5.62, 10.70 | * |
| | I definitely can | 4.51 | 1.23 | 2.10, 6.92 | * |
| While remote: how often did someone help you with your schoolwork | Never | Reference | – | – | |
| | Once/twice a month | −1.52 | 0.74 | −2.97, −0.07 | * |
| | Once/twice a week | −2.80 | 0.70 | −4.17, −1.44 | * |
| | Every day or almost | −6.70 | 0.67 | −8.02, −5.39 | * |
| | I do not remember | −3.50 | 0.79 | −5.06, −1.95 | * |
| Remote class: find resource online when don’t understand | I definitely can’t | Reference | – | – | |
| | I probably can’t | 2.54 | 1.18 | 0.22, 4.85 | * |
| | Maybe | 2.60 | 1.01 | 0.62, 4.58 | * |
| | I probably can | 3.80 | 1.07 | 1.71, 5.90 | * |
Discussion
The discussion begins by answering the research questions, then elaborates on and contextualizes the findings within the current literature. The implications and limitations are discussed thereafter.
Answering the research questions
Research question #1 asked: What physical resources and stressors related to remote learning are associated with the variation in grade 4 mathematics performance? One answer was that key stressors included restricted access to physical school supplies (−11.60 points, effect size 0.35) or computing hardware (−6.87 points, effect size 0.21). A second answer was that partially limited access to high-speed internet (+4.48 points, effect size 0.14) acted as a resource. A third answer was that partially limited access to a quiet workspace (+7.01 points, effect size 0.21) also acted as a resource.
Research question #2 asked: What psychological resources and stressors related to remote learning are associated with the variation in grade 4 mathematics performance? One answer was that students’ self-confidence in their mathematics knowledge and skills was an impactful psychological resource (+20.70 points, effect size 0.63). A second answer was that the ability to recognize a breakdown in one’s own understanding was a psychological resource (+12.80 points, effect size 0.39).
Contextualizing the findings
The findings related to the first research question were both expected and counterintuitive. The expected findings provided evidence of the importance of physical school supplies and computing hardware. Indeed, prior research has shown that access to the internet and digital devices, plus confidence in using them, positively impacts the remote educational environment (Diaz Lema et al., 2023; Kennedy et al., 2022).
While the results regarding physical resources were intuitive, the results about unlimited internet access and a quiet workspace were harder to interpret. The findings regarding partial restrictions on internet access and quiet workspaces are evidence suggesting that unrestricted internet access may come with unrestricted distractions as well (Göl et al., 2023). Prior research has found that maintaining focus in remote learning environments can be a challenge (Means et al., 2021). For example, unlimited access to social media or video content (e.g., YouTube) on a device used for both personal and academic purposes can blur the line between learning time and non-learning time (Gikas & Grant, 2013). Regarding a quiet place to work, a completely quiet place may be the result of an empty home environment with insufficient parental presence to help structure learning time and maintain focus.
Further, regarding parents in the remote learning context, technical skills development is rarely available to guardians in the home. This is a problem because remote learning may burden guardians with serving as de facto teachers. Parents may not have the freedom to structure their own workday to support student learning (Molnar et al., 2019), so they should not be expected to take the lead in balancing supervision versus independence with students’ digital resources (Molnar et al., 2019). Instead, schools must provide guidance to parents (e.g., how to monitor internet use during learning time). Guidance is especially important for younger learners, who are not yet independent enough for remote learning without both teacher and parent support (Yan et al., 2021).
The findings for the second research question were expected. The findings regarding students’ self-confidence in mathematics and their ability to recognize breakdowns in their own understanding are evidence of the importance of self-awareness and self-confidence for academic achievement. Prior research consistently finds similar results regarding self-confidence and academic achievement (Stankov et al., 2017). A study of NAEP 2019 reported mathematics self-efficacy as a predictor of math achievement (Yang et al., 2024). An analysis of the High School Longitudinal Study of 2009 reported that self-efficacy was significantly related to mathematics achievement (Liu et al., 2024). Not only does self-confidence impact academic outcomes in general, but so does self-confidence with the remote learning modality itself (Kennedy et al., 2022; Landrum, 2020). However, a caveat is that self-confidence may be a lagging indicator of mathematics knowledge and skill, rather than a driver of it. Indeed, longitudinal evidence suggests that prior academic performance predicts self-efficacy (Liu et al., 2024).
Interpreting counterintuitive findings
The counterintuitive results warrant specific attention. While we expected more internet access and more quiet workspace access to be associated with higher math scores, the relationship held true only to a point. While a simplistic (and incomplete) interpretation of this result may suggest it is good to restrict certain resources from students, more complex factors are at play. At the unrestricted level of access, digital distraction may come into play. The literature defines digital distraction as the use of digital devices for non-academic purposes during teaching or learning time (Flanigan et al., 2022; Park et al., 2025). There is evidence that digital distraction affects students differently: students who are younger, female, and lack training in distance learning are more affected (Göl et al., 2023). There is evidence that increasing student awareness of digital distraction may mitigate the distraction, though the results are mixed (Park et al., 2025; Santos et al., 2018). A step further is teaching digital distraction prevention techniques, such as how to prepare a clean digital environment prior to remote learning. Example tasks include closing nonessential applications, bookmarking non-academic web browser tabs for later, or asking family members in the home to refrain from resource-intensive internet usage (e.g., downloading, gaming, video streaming) (Wu, 2017).
On the topic of digital distraction in the home learning context, student self-regulation is important. Kärki (2024) delineates top-down attention regulation from bottom-up regulation: top-down regulation begins with one’s executive functioning, while bottom-up regulation is driven by environmental stimuli (e.g., a loud noise in the room draws one’s attention). Typically, online learning demands higher levels of self-regulation than traditional learning (Azevedo, 2005; Greenhow et al., 2022). Home guardians play a role in helping regulate the remote learning environment, especially for younger learners such as the ones in this study. Connecting back to this study’s counterintuitive findings, perhaps a student who always has a quiet space for remote learning is afforded that quiet space at the expense of a vigilant guardian presence to help mitigate digital distraction. This may lead to unrestricted internet access along with unrestricted distractions (Göl et al., 2023). While the counterintuitive results are interpreted in light of relevant literature, further research is needed to investigate potential effects of confounding home-environment variables that NAEP does not collect (e.g., measures of parental involvement).
Implications of the findings in context of existing research
The study’s findings provided evidence of the continued importance of immutable demographic characteristics (e.g., race), as well as economic disadvantage (NSLP eligibility), for student academic achievement in remote education. These effects are present in both traditional, pre-pandemic education (Matheny et al., 2023; Reardon et al., 2022; Schult et al., 2022) and remote, pandemic-era education (Fahle et al., 2023; Kuhfeld et al., 2023). This suggests that remote education is not itself inherently a more equitable mode of education. In addition, the findings of this study substantiated prior findings showing the educational importance of provisioning students with internet access and digital devices (Diaz Lema et al., 2023; Fahle et al., 2023), as well as building confidence in remote learning technology (Kennedy et al., 2022). Additionally, the study provides new evidence about the potential drawback of unfettered, unsupervised internet access (Göl et al., 2023).
Finally, the findings also complement larger and farther-reaching efforts to support educational equity by pinpointing actionable factors to support equity in the near term. The study provided a list of resources and stressors in two domains. In the physical domain, findings aligned with prior international-level research using educational disruption data (e.g., REDS), which attributed variation in remote learning quality partially to access to remote learning technology (Kennedy et al., 2022). In the psychological domain, findings aligned with prior research generally linking self-confidence to academic achievement (Stankov et al., 2017), and specifically linking self-confidence with the online/remote learning modality both domestically (Landrum, 2020) and internationally (Kennedy et al., 2022). These findings also extended prior research with REDS data by adding evidence from the U.S. context using the U.S.-specific NAEP data.
Implications for practice and future research
An analysis of a novel dataset created by circumstance (i.e., NAEP during the pandemic) provided lessons for remote online learning. One implication of this study is for education policy makers, for whom the results support the importance of computing hardware and physical school supplies as key resources that, when lacking, become influential stressors on student achievement. Efforts should be made to provide baseline infrastructure for remote learners.
A second implication of this study is for classroom teachers, for whom measures of student self-confidence may be useful for informal, formative assessment of mathematics achievement. This may be especially useful for assessing students with math anxiety, as self-confidence is an indirect measure that side-steps a direct math assessment. Mitigation of math anxiety is important because higher math anxiety has been shown to be a predictor of lower mathematics achievement on PISA (Wang et al., 2023).
A third implication of this study is for researchers, for whom this study provides an opportunity for additional research regarding the unexpected results of higher mathematics scores being associated with only partial access to high-speed internet and a quiet place to work. Specifically, this analysis implies that further work is needed to understand connections between performance, engagement with others in a remote/distributed learning context, and access to the internet and/or other sources of digital distraction. One more research implication is that this study serves as a starting point for ongoing research on post-pandemic remote learning using NAEP. The results provided insights into some of the ways that remote learning context was associated with mathematics performance on NAEP 2022. The scope of this study was limited to students who experienced remote learning immediately following the 2020 pandemic event. As time moves forward, however, remote learning may look different: the remote learning that subsequent NAEP participants experience will likely be just one component of a traditional, in-person education rather than an emergency substitute for the in-person modality. NAEP 2026, and onward, will allow researchers to revisit the topic when the cohort of grade 4 students reaches grade 8 and is assessed once more by NAEP, though that grade 8 cohort will be a nationally representative sample consisting of different individuals.
Limitations of the study
The NAEP design presents limitations. First, NAEP survey questionnaires collect self-reported data, which requires respondents to interpret questions themselves and to respond accurately (Braun et al., 2009). Second, NAEP is not an experimental design; it is an observational, cross-sectional study (NCES, 2022b). There is no random assignment of students to groups (e.g., high confidence vs. low confidence) nor any intervention on teaching or learning practices and behaviors. Therefore, the relationships in this study are not causal and are not interpreted causally. Third, it is important to emphasize that this particular study is designed to generalize to participants who experienced the remote learning modality. One reason is that the remote learning items on NAEP 2022 are phrased under the assumption that remote learning did take place (e.g., “While remote, how often did someone help you with your schoolwork?”), so students who experienced only in-person learning would have missing data for items that remote learners all have. Therefore, we assert that a direct comparison between students who did and did not experience remote learning is better left to a separate study with a specific focus on that topic. Finally, the analysis reported that 23% of the variance was found at the school level and left unexplained (see Table 3). While other NAEP school-level variables might help explain this variance, tracking them down would have expanded the scope beyond the study’s framework; using a single-level analysis, however, would have meant losing the insights about school-level variance. Leaving some variance unexplained, so to speak, is a limitation of this study, but it also presents a jumping-off point for future research, especially with remote-learning indicators becoming more common in large-scale assessments post-pandemic.
Conclusion
The purpose of the present study was to utilize the temporally distinct NAEP 2022 dataset (i.e., mass-remote learning during the pandemic) to investigate contextual factors related to remote learning that may inform remote learning more broadly in a post-pandemic world. A framework of physical and psychological resources and stressors was used to help select NAEP variables and present findings. Hierarchical linear modeling was used on a sample of 36,200 grade 4 students to answer two research questions regarding mathematics scores and remote learning. One finding was that computing hardware and physical school supplies were key physical resources (when available) or stressors (when lacking) on student achievement. Another finding was that self-confidence and self-awareness in mathematics knowledge and skills were major psychological resources. Additionally, the study provided evidence about the potential drawback of unfettered, unsupervised internet access while in remote learning environments. The implications of these findings were actionable factors to support remote learning, increase remote learning equity, and reduce score gaps of students from economically disadvantaged backgrounds.
Author contributions
WNBR conducted all data and analysis work and co-wrote the manuscript. RF and HHP co-wrote the manuscript.
Data availability
The dataset analyzed during the current study is available on request from the National Center for Education Statistics: https://nces.ed.gov/nationsreportcard/researchcenter/datatools.aspx.
Declarations
Competing interests
All three authors were employed at ETS while conducting the study. ETS is under contract to administer the NAEP assessment.
Abbreviations
BIB: Balanced incomplete block
ECLS: Early Childhood Longitudinal Study
HLM: Hierarchical linear model
ICC: Intraclass correlation coefficient
NAEP: National Assessment of Educational Progress
NSLP: National School Lunch Program
NWEA: Northwest Evaluation Association
PISA: Program for International Student Assessment
REDS: Responses to Educational Disruption Survey
SEDA: Stanford Education Data Archive
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
Anderson, D. (2012). Hierarchical linear modeling (HLM): An introduction to key concepts within cross-sectional and growth modeling frameworks. Behavioral Research & Teaching.
Andon, A., Thompson, C. G., & Becker, B. J. (2014). A quantitative synthesis of the immigrant achievement gap across OECD countries. Large-Scale Assessments in Education, 2.
Atasever, U., Huang, F. L., & Rutkowski, L. (2025). Reassessing weights in large-scale assessments and multilevel models. Large-scale Assessments in Education, 13(1). https://doi.org/10.1186/s40536-025-00245-y
Azevedo, R. (2005). Using hypermedia as a metacognitive tool for enhancing student learning? The role of self-regulated learning. Educational Psychologist, 40.
Bailey, P., Emad, A., Huo, H., Lee, M., Liao, Y., Lishinski, A., Nguyen, T., Xie, Q., Yu, J., Zhang, T., Buehler, E., Lee, S., Webb, B., Fink, T., Sikali, E., Kelley, C., Bundsgaard, J., C’deBaca, R., & Christensen, A. A. (2023). EdSurvey (Version 3.0.2). American Institutes for Research. https://doi.org/10.32614/CRAN.package.EdSurvey
Bailey, P., & Cohen, M. (2019, March 29). Statistical methods used in EdSurvey. NCES Data R Project – EdSurvey. https://www.air.org/sites/default/files/EdSurvey-Statistics.pdf
Bandura, A. (1977). Self-efficacy: Toward a unifying theory of behavioral change. Psychological Review, 84.
Beaton, A. E., Rogers, A. M., Gonzalez, E., Hanly, M. B., Kolstad, A., Rust, K. F., Sikali, E., Stokes, L., & Jia, Y. (2011). The NAEP primer. U.S. Department of Education, National Center for Education Statistics. NCES 2011 – 463.
Bohnert, M., & Gracia, P. (2023). Digital use and socioeconomic inequalities in adolescent well-being: Longitudinal evidence on socioemotional and educational outcomes. Journal of Adolescence, 95.
Braun, H., Coley, R., Jia, Y., & Trapani, C. (2009). Exploring what works in science instruction: A look at the eighth-grade science classroom. Policy information report. Educational Testing Service.
Burg, S., Stephens, M., & Tsokodayi, Y. (2024). Comparison of the PISA 2022 mathematics, reading, and science assessments with NAEP. https://nces.ed.gov/surveys/pisa/pdf/PISA2022-NAEP-Comparison-Report_508.pdf
Caponera, E., & Losito, B. (2016). Context factors and student achievement in the IEA studies: Evidence from TIMSS. Large-Scale Assessments in Education, 4, 1–22. https://doi.org/10.1186/s40536-016-0030-6
Chhetri, C. (2020, October). “I lost track of things”: Student experiences of remote learning in the Covid-19 pandemic. In Proceedings of the 21st Annual Conference on Information Technology Education (pp. 314–319). https://doi.org/10.1145/3368308.3415413
Chi, M. T., & Wylie, R. (2014). The ICAP framework: Linking cognitive engagement to active learning outcomes. Educational Psychologist, 49.
Choi, J. I., & Hannafin, M. (1997). The effects of instructional context and reasoning complexity on mathematics problem-solving. Educational Technology Research and Development, 45.
Crossland, A., Gray, T., & Reynolds, J. (2018). ESSA and digital learning: Closing the digital accessibility gap. American Institutes for Research. https://www.air.org/sites/default/files/2021-06/ESSA-Digital-Lrng-508.pdf
U.S. Department of Agriculture Food and Nutrition Service. (2023, December). National School Lunch Program. https://www.fns.usda.gov/nslp
Diaz Lema, M. L., Rossi, L., & Soncin, M. (2023). Teaching away from school: Do school digital support influence teachers’ well-being during Covid-19 emergency? Large-Scale Assessments in Education, 11.
Dolan, J. E. (2016). Splicing the divide: A review of research on the evolving digital divide among K–12 students. Journal of Research on Technology in Education, 48.
Fahle, E. M., Kane, T. J., Patterson, T., Reardon, S. F., Staiger, D. O., & Stuart, E. A. (2023). School district and community factors associated with learning loss during the COVID-19 pandemic. Center for Education Policy Research at Harvard University: Cambridge, MA, USA. https://cepr.harvard.edu/sites/hwpi.harvard.edu/files/cepr/files/explaining_covid_losses_5.23.pdf
Fahle, E., Kane, T. J., Reardon, S. F., & Staiger, D. O. (2024). The first year of pandemic recovery: A district-level analysis. Education Recovery Scorecard. https://educationrecoveryscorecard.org/wp-content/uploads/2024/01/ERS-Report-Final-1.31.pdf
Flanigan, A. E., Babchuk, W. A., & Kim, J. H. (2022). Understanding and reacting to the digital distraction phenomenon in college classrooms. In Digital Distraction in the College Classroom (pp. 1–21). IGI Global Scientific Publishing. https://doi.org/10.4018/978-1-7998-9243-4.ch001
Frank, K. A., Maroulis, S. J., Duong, M. Q., & Kelcey, B. M. (2013). What would it take to change an inference? Using Rubin’s causal model to interpret the robustness of causal inferences. Educational Evaluation and Policy Analysis, 35.
Frank, K. A., Lin, Q., Maroulis, S., Mueller, A. S., Xu, R., Rosenberg, J. M., Hayter, C. S., Mahmoud, R. A., Kolak, M., Dietz, T., & Zhang, L. (2021). Hypothetical case replacement can be used to quantify the robustness of trial results. Journal of Clinical Epidemiology, 134, 150–159. https://doi.org/10.1016/j.jclinepi.2021.01.025
Frank, K. A., Lin, Q., Xu, R., Maroulis, S. J., & Mueller, A. (2023). Quantifying the robustness of causal inferences: Sensitivity analysis for pragmatic social science. Social Science Research, 110, 102815. https://doi.org/10.1016/j.ssresearch.2022.102815
Garson, G. D. (2013). Hierarchical linear modeling: Guide and applications. Sage.
Gikas, J., & Grant, M. M. (2013). Mobile computing devices in higher education: Student perspectives on learning with cellphones, smartphones & social media. The Internet and Higher Education, 19, 18–26. https://doi.org/10.1016/j.iheduc.2013.06.002
Göl, B., Özbek, U., & Horzum, M. B. (2023). Digital distraction levels of university students in emergency remote teaching. Education and Information Technologies, 28.
Goldhaber, D., Kane, T. J., McEachin, A., & Morton, E. (2022). A Comprehensive Picture of Achievement across the COVID-19 Pandemic Years: Examining Variation in Test Levels and Growth across Districts, Schools, Grades, and Students. Working Paper No. 266–0522. National Center for Analysis of Longitudinal Data in Education Research (CALDER). https://files.eric.ed.gov/fulltext/ED620384.pdf
Greenhow, C., Graham, C. R., & Koehler, M. J. (2022). Foundations of online learning: Challenges and opportunities. Educational Psychologist, 57.
Israel, G. D., Beaulieu, L. J., & Hartless, G. (2001). The influence of family and community social capital on educational achievement. Rural Sociology, 66.
Jewsbury, P. A., Jia, Y., & Gonzalez, E. J. (2024). Considerations for the use of plausible values in large-scale assessments. Large-scale Assessments in Education, 12.
Kärki, K. (2024). Digital distraction, attention regulation, and inequality. Philosophy & Technology, 37.
Kennedy, A. I., Mejía-Rodríguez, A. M., & Strello, A. (2022). Inequality in remote learning quality during COVID-19: Student perspectives and mitigating factors. Large-scale Assessments in Education, 10.
Kostaki, D., & Karayianni, I. (2022). Houston, we have a pandemic: Technical difficulties, distractions and online student engagement. Student Engagement in Higher Education Journal, 4(2), 105–127. https://sehej.raise-network.com/raise/article/view/1063
Kreijns, K., Kirschner, P., & Vermeulen, P. (2013). Social aspects of CSCL environments: A research framework. Educational Psychologist, 48.
Kuhfeld, M., Lewis, K., & Peltier, T. (2023). Reading achievement declines during the COVID-19 pandemic: Evidence from 5 million US students in grades 3–8. Reading and Writing, 36.
Landrum, B. (2020). Examining students’ confidence to learn online, self-regulation skills and perceptions of satisfaction and usefulness of online classes. Online Learning, 24.
Lee, V. E. (2000). Using hierarchical linear modeling to study social contexts: The case of school effects. Educational Psychologist, 35.
Leyland, A. H., & Groenewegen, P. P. (2020). Apportioning variation in multilevel models. In: A.H. Leyland, & P.P. Groenewegen (Eds.), Multilevel modelling for public health and health services research (pp 89–104). Springer Nature. https://doi.org/10.1007/978-3-030-34801-4_6
Liang, K. Y., & Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73.
Liao, Y., Bailey, P., & Yavuz, S. (2024). Models. In P. Bailey & T. Zhang (Eds.), Analyzing NCES data using EdSurvey: A user’s guide. https://naep-research.airprojects.org/Portals/0/EdSurvey_A_Users_Guide/_book/models.html#mixed-models-with-mixed.sdf
Liu, R., Jong, C., & Fan, M. (2024). Reciprocal relationship between self-efficacy and achievement in mathematics among high school students. Large-scale Assessments in Education, 12.
Livingstone, S., Mascheroni, G., & Staksrud, E. (2018). European research on children’s internet use: Assessing the past and anticipating the future. New Media & Society, 20.
Mang, J., Küchenhoff, H., Meinck, S., & Prenzel, M. (2021). Sampling weights in multilevel modelling: An investigation using PISA sampling structures. Large-scale Assessments in Education, 9(1). https://doi.org/10.1186/s40536-021-00099-0
Matheny, K. T., Thompson, M. E., Townley-Flores, C., & Reardon, S. F. (2023). Uneven progress: Recent trends in academic performance among U.S. school districts. American Educational Research Journal, 60.
Means, B., Peters, V., Neisler, J., Wiley, K., & Griffiths, R. (2021). Lessons from remote learning during COVID-19. Digital Promise. https://doi.org/10.51388/20.500.12265/116
Meinck, S., Fraillon, J., & Strietholt, R. (2022). The impact of the COVID-19 pandemic on education: International evidence from the responses to educational disruption survey (REDS). International Association for the Evaluation of Educational Achievement. https://www.iea.nl/studies/iea/REDS
Meyer, J. P., & Dahlin, M. (2022). MAP Growth theory of action. NWEA. https://files.eric.ed.gov/fulltext/ED623609.pdf
Miller, P., Votruba-Drzal, E., & Coley, R. L. (2019). Poverty and academic achievement across the urban to rural landscape: Associations with community resources and stressors. RSF: The Russell Sage Foundation Journal of the Social Sciences, 5.
Molnar, A., Miron, G., Elgeberi, N., Barbour, M. K., Huerta, L., Shafer, S. R., & Rice, J. K. (2019). Virtual schools in the US 2019. National Education Policy Center. https://nepc.colorado.edu/publication/virtual-schools-annual-2019
National Center for Education Statistics (NCES) (2023, September 19). Technical Documentation. https://nces.ed.gov/nationsreportcard/tdw/
National Center for Education Statistics (NCES) (2025, January 16). Interpreting NAEP mathematics results. https://nces.ed.gov/nationsreportcard/mathematics/interpret_results.aspx#repgroups
National Center for Education Statistics (NCES) (2022b). Largest score declines in NAEP mathematics at grades 4 and 8 since initial assessments in 1990. https://www.nationsreportcard.gov/highlights/mathematics/2022/
National Center for Education Statistics (NCES) (2022a, October 24). Mathematics and reading scores of fourth- and eighth-graders declined in most states during pandemic, nation’s report card shows. https://www.nationsreportcard.gov/
Nawaz, A., & Khan, M. Z. (2012). Issues of technical support for e-learning systems in higher education institutions. International Journal of Modern Education and Computer Science, 4.
Nguyen, T., & Kelley, C. (2018). Methods used for estimating mixed-effects models in EdSurvey. https://www.air.org/sites/default/files/EdSurvey-Mixed_Models.pdf
OECD (2023). PISA 2022 Results (Volume I and II) - Country Notes: United States. PISA, OECD Publishing, Paris. https://www.oecd.org/en/publications/pisa-2022-results-volume-i-and-ii-country-notes_ed6fbcc5-en/united-states_a78ba65a-en.html
O’Donnell, A. M., & O’Kelly, J. (1994). Learning from peers: Beyond the rhetoric of positive results. Educational Psychology Review, 6(4), 321-349. https://doi.org/10.1007/BF02213419
Park, J., Paxtle-Granjeno, J., Ok, M. W., Shin, M., & Wilson, E. (2025). Preventing digital distraction in secondary classrooms: A quasi-experimental study. Computers & Education, 227, 105223. https://doi.org/10.1016/j.compedu.2024.105223
Perez, W., Espinoza, R., Ramos, K., Coronado, H. M., & Cortes, R. (2009). Academic resilience among undocumented Latino students. Hispanic Journal of Behavioral Sciences, 31.
Pinxten, M., Marsh, H. W., De Fraine, B., Van Den Noortgate, W., & Van Damme, J. (2014). Enjoying mathematics or feeling competent in mathematics? Reciprocal effects on mathematics achievement and perceived math effort expenditure. British Journal of Educational Psychology, 84.
Rashed, R., Rifat, M. R., & Ahmed, S. I. (2021, June). Pandemic, repair, and resilience: Coping with technology breakdown during COVID-19. In Proceedings of the 4th ACM SIGCAS Conference on Computing and Sustainable Societies (pp. 312–328). https://doi.org/10.1145/3460112.3471965
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (Vol. 1). Sage.
Reardon, S. F., Weathers, E. S., Fahle, E. M., Jang, H., & Kalogrides, D. (2022). Is Separate Still Unequal? New Evidence on School Segregation and Racial Academic Achievement Gaps. (CEPA Working Paper No.19 – 06). Retrieved from Stanford Center for Education Policy Analysis: http://cepa.stanford.edu/wp19-06
Rosenberg, J. M., Narvaiz, S., Xu, R., Lin, Q., Maroulis, S., Frank, K. A., Saw, G., & Staudt Willet, K. B. (2024). Konfound-It! Quantify the robustness of causal inferences [R Shiny app built on konfound R package version 1.0.3]. https://konfound-project.shinyapps.io/konfound-it/
Rourke, L., Anderson, T., Garrison, D. R., & Archer, W. (1999). Assessing social presence in asynchronous text-based computer conferencing. The Journal of Distance Education/Revue de l’éducation à distance, 14.
Santos, I. M., Bocheco, O., & Habak, C. (2018). A survey of student and instructor perceptions of personal mobile technology usage and policies for the classroom. Education and Information Technologies, 23, 617–632. https://doi.org/10.1007/s10639-017-9625-y
Schnepf, S. V. (2007). Immigrants’ educational disadvantage: An examination across ten countries and three surveys. Journal of Population Economics, 20, 527–545. https://doi.org/10.1007/s00148-006-0102-y
Schult, J., Mahler, N., Fauth, B., & Lindner, M. A. (2022). Did students learn less during the COVID-19 pandemic? Reading and mathematics competencies before and after the first pandemic wave. School Effectiveness and School Improvement, 33.
Stankov, L., Morony, S., & Lee, Y. P. (2017). Confidence: the best non-cognitive predictor of academic achievement? In Noncognitive psychological processes and academic achievement (pp. 19–38). Routledge. https://doi.org/10.1080/01443410.2013.814194
Teodorović, J. (2011). Classroom and school factors related to student achievement: What works for students? School Effectiveness and School Improvement, 22.
Tulaskar, R., & Turunen, M. (2022). What students want? Experiences, challenges, and engagement during emergency remote learning amidst COVID-19 crisis. Education and Information Technologies, 27.
Votruba-Drzal, E., Miller, P., Betancur, L., Spielvogel, B., Kruzik, C., & Coley, R. L. (2021). Family and community resource and stress processes related to income disparities in school-aged children’s development. Journal of Educational Psychology, 113.
Wang, M. T., Guo, J., & Degol, J. L. (2020). The role of sociocultural factors in student achievement motivation: A cross-cultural review. Adolescent Research Review, 5.
Wang, X. S., Perry, L. B., Malpique, A., & Ide, T. (2023). Factors predicting mathematics achievement in PISA: A systematic review. Large-Scale Assessments in Education, 11.
Wenger, E. (1998). Communities of practice: Learning as a social system. Systems Thinker, 9.
Whittingham, M. J., Stephens, P. A., Bradbury, R. B., & Freckleton, R. P. (2006). Why do we still use stepwise modelling in ecology and behaviour? Journal of Animal Ecology, 75(5), 1182–1189. https://doi.org/10.1111/j.1365-2656.2006.01141.x
Wu, J. Y. (2017). The indirect relationship of media multitasking self-efficacy on learning performance within the personal learning environment: Implications from the mechanism of perceived attention problems and self-regulation strategies. Computers & Education, 106, 56–72. https://doi.org/10.1016/j.compedu.2016.10.010
Yan, L., Whitelock-Wainwright, A., Guan, Q., Wen, G., Gašević, D., & Chen, G. (2021). Students’ experience of online learning during the COVID-19 pandemic: A province-wide survey study. British Journal of Educational Technology, 52.
Yang, Y., Maeda, Y., & Gentry, M. (2024). The relationship between mathematics self-efficacy and mathematics achievement: Multilevel analysis with NAEP 2019. Large-scale Assessments in Education, 12.
Zhang, T., Bailey, P., Liao, Y., & Sikali, E. (2024). EdSurvey: An R package to analyze large-scale educational assessments data from NCES. Large-scale Assessments in Education, 12.
Zimmerman, B. J., & Schunk, D. H. (Eds.). (2011). Handbook of self-regulation of learning and performance. Routledge/Taylor & Francis.
Zysberg, L., & Schwabsky, N. (2021). School climate, academic self-efficacy and student achievement. Educational Psychology, 41.
© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”).