There are 4 sets of questions to be answered. You can get up to 100 points + bonus questions. Points are indicated next to each question.
Remember to:
Research question: Does summer school improve low-achieving students’ grades?
Summer school programs are designed to help students improve their reading and math ability. They are generally aimed at students who have not yet achieved the skills required by the next level. There is, however, mixed evidence on whether summer school works.
For this lab, we are interested in testing the following:
Hypothesis: Students who attend summer school will achieve higher grades the following year.
For this lab you will use the regression-discontinuity-lab.csv dataset.
URL <- "https://raw.githubusercontent.com/DS4PS/pe4ps-textbook/master/labs/DATA/regression-discontinuity-lab.csv"
data <- read.csv( URL, stringsAsFactors=FALSE )
math_7 | math_8 | gpa_8 | gpa_9 |
---|---|---|---|
28 | 34 | 0.65 | 1.12 |
86 | 84 | 3.32 | 3.39 |
71 | 69 | 2.57 | 2.46 |
97 | 92 | 3.57 | 3.32 |
47 | 45 | 1.58 | 1.71 |
79 | 76 | 3 | 2.83 |
The synthetic data contains observations on 1,000 high school students from a US public school system. Students who failed to demonstrate what the district deemed to be the mathematical foundations necessary to succeed in high school were required to participate in a summer school program after 8th grade, before their first year in high school (9th grade). A score of 60 or above on the 8th grade standardized math exam was deemed competent; a score below 60 resulted in summer school.
We want to understand whether attending the district’s summer school improved student performance in math, measured by students’ math GPAs at the end of 9th grade.
The dataset that the school principal provides you contains four variables:
Variable name | Description |
---|---|
math_7 | Score on standardized math exam at the end of 7th-grade, from 0 to 100 |
gpa_8 | Math GPA during the 8th-grade academic year (prior to summer school), from 0 to 4 |
math_8 | Score on standardized math exam at the end of 8th-grade, from 0 to 100 |
gpa_9 | Math GPA during the 9th-grade academic year (following summer school), from 0 to 4 |
You propose to use a regression discontinuity method to analyze the impact of summer school on students’ grades.
Let’s start by discussing why regression discontinuity is the right approach here.
Now prepare your data for the analysis:
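As a minimal sketch of this step, the snippet below applies the cutoff rule described earlier (a score below 60 on the 8th-grade exam leads to summer school) to the six preview rows shown in the table above. The column names `summer` and `math_8_c` are hypothetical and not part of the dataset, and centering the score at the cutoff is a common convention rather than something the lab prescribes.

```r
# Six preview rows copied from the table above; the full lab data has 1,000 rows.
d <- data.frame(
  math_7 = c(28, 86, 71, 97, 47, 79),
  math_8 = c(34, 84, 69, 92, 45, 76),
  gpa_8  = c(0.65, 3.32, 2.57, 3.57, 1.58, 3.00),
  gpa_9  = c(1.12, 3.39, 2.46, 3.32, 1.71, 2.83)
)

cutoff <- 60  # passing score on the 8th-grade exam, as stated in the text

# Hypothetical variable names: `summer` and `math_8_c` are illustrative only.
d$summer   <- as.numeric(d$math_8 < cutoff)  # 1 = required to attend summer school
d$math_8_c <- d$math_8 - cutoff              # rating variable centered at the cutoff

d[, c("math_8", "math_8_c", "summer")]
```

With the score centered, the intercept of a later regression can be read as the expected outcome exactly at the cutoff.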
Once the data are ready, you can estimate the regression discontinuity model. Take care to include the correct variables.
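The following is a hedged sketch of one common regression discontinuity specification, run on simulated data rather than the lab dataset. All names (`sim`, `score_c`, `treat`, `gpa`) and the simulated effect size of 0.4 are assumptions made for illustration. The interaction term lets the slope differ on each side of the cutoff, which is one possible modeling choice, not the only one.

```r
set.seed(42)
n <- 1000

# Simulated stand-in for the lab data (assumed values, not the real dataset).
sim <- data.frame(score = round(runif(n, 20, 100)))
sim$treat   <- as.numeric(sim$score < 60)   # below the cutoff -> treated
sim$score_c <- sim$score - 60               # rating variable centered at the cutoff
sim$gpa     <- 2.0 + 0.03 * sim$score_c + 0.4 * sim$treat + rnorm(n, sd = 0.3)

# The coefficient on `treat` estimates the jump in the outcome at the cutoff.
rd <- lm(gpa ~ treat + score_c + treat:score_c, data = sim)
summary(rd)$coefficients

# Predicted outcome at the cutoff without (treat = 0) and with (treat = 1) treatment.
at_cutoff <- data.frame(treat = c(0, 1), score_c = 0)
pred <- predict(rd, newdata = at_cutoff)
pred
```

The first prediction plays the role of the counterfactual: what a student scoring exactly at the cutoff would be expected to achieve without the program.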
Examine the counterfactual.
What is the purpose of including the rating / score variable in the model? (3 points)
The regression discontinuity model is powerful in instances where program participation is determined by a qualification criterion that is measured on a numeric scale (e.g. a test score, or an income-based means test for social services), and we only have performance data from the post-treatment period. In these instances the post-test-only estimate would be extremely biased, since the treatment and control groups already differed in performance before the treatment. The RDD uses the selection criterion and the eligibility cut-off score to mitigate the bias. Thus it might be the ONLY way to get a clean estimate of program impact in these circumstances.
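To illustrate the bias described above, here is a small simulation, offered as a sketch under stated assumptions rather than as an analysis of the lab data. It assumes a true program effect of +0.4 and an outcome that rises with the rating score; the variable names are hypothetical.

```r
set.seed(1)
n <- 1000

score <- runif(n, 20, 100)            # simulated rating variable
treat <- as.numeric(score < 60)       # low scorers receive the program
y     <- 2.5 + 0.03 * (score - 60) + 0.4 * treat + rnorm(n, sd = 0.3)

# Post-test-only comparison: ignores that treated units started out lower.
naive <- mean(y[treat == 1]) - mean(y[treat == 0])

# RDD: controls for the centered rating variable, so `treat` captures the jump at the cutoff.
rdd <- unname(coef(lm(y ~ treat + I(score - 60)))["treat"])

c(naive = naive, rdd = rdd)
```

In this setup the naive difference in means comes out negative even though the simulated program helps, while the estimate that conditions on the rating variable is close to the assumed effect.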
In this lab we have GPA data from before and after the treatment period (8th grade and 9th grade math GPAs). As a result, we can also run a difference-in-difference model to estimate the impact of summer school on 9th grade math performance.
Run a difference-in-difference model and compare results with the RDD models above. (7 points)
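One way such a model could be set up is sketched below on simulated data. The long-format layout, the names `long`, `post`, and `treat`, and the assumed effect (0.4) and common time trend (0.1) are all illustrative assumptions; the lab data would need to be reshaped in a similar way, stacking `gpa_8` and `gpa_9`.

```r
set.seed(7)
n <- 1000

# Simulated stand-ins for math_8, gpa_8 and gpa_9 (assumed values, not clipped to 0-4).
math_8  <- runif(n, 20, 100)
treat   <- as.numeric(math_8 < 60)
ability <- 0.03 * (math_8 - 60)
gpa_8   <- 2.2 + ability + rnorm(n, sd = 0.2)
gpa_9   <- 2.2 + ability + 0.1 + 0.4 * treat + rnorm(n, sd = 0.2)

# Stack the two periods: one row per student per year.
long <- data.frame(
  gpa   = c(gpa_8, gpa_9),
  treat = rep(treat, 2),
  post  = rep(c(0, 1), each = n)   # 0 = 8th grade, 1 = 9th grade
)

# The interaction coefficient is the difference-in-difference estimate.
did <- lm(gpa ~ treat + post + treat:post, data = long)
coef(did)["treat:post"]
```

Because each student appears twice, the default standard errors from `lm` do not account for the repeated observations, so this sketch is best read for the point estimate only.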