Taking PISA Seriously: How Accurate Are Low-Stakes Exams?

Working Paper: NBER ID: w24930

Authors: Pelin Akyol; Kala Krishna; Jinwen Wang

Abstract: PISA is seen as the gold standard for evaluating educational outcomes worldwide. Yet, because it is a low-stakes exam, students may not take it seriously, resulting in downward-biased scores and inaccurate rankings. This paper provides a method to identify and account for non-serious behavior in low-stakes exams by leveraging information in computer-based assessments in PISA 2015. We compare the scores and rankings with no corrections to those generated using the PISA approach as well as our method, which fully corrects for the bias. We show that the total bias is large and that the PISA approach corrects for only about half of it.

Keywords: PISA; low-stakes exams; educational outcomes; student performance; non-serious behavior

JEL Codes: C53; I20; I21

Causal Claims Network Graph

Edges that are evidenced by causal inference methods are in orange, and the rest are in light blue.

Causal Claims

Cause	Effect
non-serious behavior (K40)	downward-biased scores (C46)
non-serious behavior (K40)	inaccurate scores (Y10)
non-serious behavior (K40)	rankings (A14)
students' ability (D29)	engagement during the exam (Y20)
socioeconomic status (P36)	engagement during the exam (Y20)
non-serious behavior (K40)	performance outcomes (L25)
correcting for non-serious behavior (K40)	rankings (A14)
