For this lab you need to load the FixEff_Lab data available in the class R package.
There are 4 sets of questions to be answered. You can get up to 100 points + bonus questions. Points are indicated next to each question.
Research question: Do beer taxes affect car accidents at night time?
For this lab, we want to consider the effect of beer taxes on mortality rates due to car accidents during night time. There have been several studies examining how state policies aimed at controlling alcohol consumption affect car accidents. Beer taxes are one way for states to control alcohol consumption: higher beer taxes increase prices, which in turn decreases consumption. Lower alcohol consumption is expected to reduce drunk driving and therefore accidents, especially at night time.
Hypothesis: Beer taxes will be negatively correlated with car accidents at night time.
We are going to use a set of simulated data that look at 7 southern US states. For each state we have observations across 7 years.
library( dplyr )    # provides the %>% pipe
library( pander )   # formats the table output
URL <- "https://raw.githubusercontent.com/DS4PS/pe4ps-textbook/master/labs/DATA/beer-tax-fixed-effects.csv"
data <- read.csv( URL, stringsAsFactors=FALSE )
head( data ) %>% pander()
X | state | S1 | S2 | S3 | S4 | S5 | S6 | S7 | year | Y1 | Y2 | Y3 | Y4 | Y5 | Y6 | Y7 | taxes | accidents |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 2010 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.4955 | 2065 |
2 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 2011 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0.5988 | 2064 |
3 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 2012 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0.5461 | 2069 |
4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 2013 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0.758 | 2070 |
5 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 2014 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0.4728 | 2079 |
6 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 2015 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0.4402 | 2088 |
The data are structured as a panel dataset. Variables are the following:
Variable name | Description |
---|---|
state | Each state is indicated with a number, from 1 to 7 |
taxes | Beer taxes as a proportion of cost, from 0 to 1 |
year | Year in which observations were collected |
accidents | Number of car accidents |
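A quick way to confirm the panel structure is to cross-tabulate state by year. The sketch below uses a hypothetical stand-in data frame (`panel`, not the lab data) with the same layout of 7 states by 7 years; the year range is an assumption. With the real data you would pass the `state` and `year` columns of `data` instead.

```r
# Hypothetical stand-in for the lab data: 7 states observed over 7 years
panel <- expand.grid( year = 2010:2016, state = 1:7 )

# Cross-tabulate state by year; a balanced panel has exactly one row per cell
cells <- table( panel$state, panel$year )
is_balanced <- all( cells == 1 )

dim( cells )     # number of states, number of years
nrow( panel )    # total state-year observations
is_balanced
```

If any cell held 0 or more than 1 observation, the panel would be unbalanced or contain duplicates, which is worth knowing before fitting any model.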
Q1: Your colleague starts analyzing this data using a pooled OLS model.
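As a rough illustration of what a pooled OLS model looks like in R, the sketch below regresses accidents on taxes while ignoring which state each row belongs to. The data frame `sim` and every number in it are made up for this example (they are not the lab data); it is built so that states with higher taxes also have higher baseline accident counts, while the within-state effect of taxes is negative.

```r
set.seed( 42 )

# Made-up panel: 7 states x 7 years (hypothetical values, not the lab data)
sim <- expand.grid( year = 2010:2016, state = 1:7 )
sim$taxes <- 0.1 * sim$state + runif( nrow( sim ), 0, 0.1 )

# Assumed data-generating process: state baseline plus a negative tax effect
sim$accidents <- 2000 + 100 * sim$state - 200 * sim$taxes +
  rnorm( nrow( sim ), sd = 2 )

# Pooled OLS: all state-year rows are treated as one undifferentiated sample
pooled <- lm( accidents ~ taxes, data = sim )
coef( pooled )[ "taxes" ]
```

Because the pooled model cannot separate differences between states from changes within a state, its slope on `taxes` mixes the two sources of variation.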
Q2: However, you know that grouped data might lead to biased results because of Simpson’s paradox (a trend can differ depending on whether the data are examined within groups or in aggregate). You propose a fixed effect model.
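A minimal sketch of the idea, using made-up data rather than the lab data: adding one dummy per state through `factor( state )` lets each state have its own intercept, so the slope on taxes is estimated from within-state variation only. The data frame `sim` and its data-generating process are assumptions chosen to make the paradox visible.

```r
set.seed( 42 )

# Made-up panel: 7 states x 7 years (hypothetical values, not the lab data)
sim <- expand.grid( year = 2010:2016, state = 1:7 )
sim$taxes <- 0.1 * sim$state + runif( nrow( sim ), 0, 0.1 )
sim$accidents <- 2000 + 100 * sim$state - 200 * sim$taxes +
  rnorm( nrow( sim ), sd = 2 )

# Pooled OLS ignores the grouping by state
pooled <- lm( accidents ~ taxes, data = sim )

# Fixed effect model: state dummies absorb time-invariant state differences
fe <- lm( accidents ~ taxes + factor( state ), data = sim )

# Compare the two slopes on taxes
c( pooled = coef( pooled )[[ "taxes" ]],
   fixed_effects = coef( fe )[[ "taxes" ]] )
```

In this simulated example the pooled slope is positive while the fixed effect slope is negative, which is the sign reversal that Simpson's paradox describes. Whether the lab data behave the same way is something to check, not assume.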
Q3: We now have a look at state-level differences.
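One possible way to inspect state-level differences, sketched here on made-up data: dropping the global intercept with `- 1` makes R report one intercept per state, and an F-test against the pooled model indicates whether those intercepts differ jointly. The data frame `sim` and its values are hypothetical, not the lab data.

```r
set.seed( 42 )

# Made-up panel: 7 states x 7 years (hypothetical values, not the lab data)
sim <- expand.grid( year = 2010:2016, state = 1:7 )
sim$taxes <- 0.1 * sim$state + runif( nrow( sim ), 0, 0.1 )
sim$accidents <- 2000 + 100 * sim$state - 200 * sim$taxes +
  rnorm( nrow( sim ), sd = 2 )

pooled <- lm( accidents ~ taxes, data = sim )

# "- 1" removes the common intercept, so each state gets its own
fe <- lm( accidents ~ taxes + factor( state ) - 1, data = sim )

# Keep only the seven state-specific intercepts
state_intercepts <- coef( fe )[ grepl( "state", names( coef( fe ) ) ) ]
state_intercepts

# F-test: do the state intercepts jointly improve on the pooled model?
f_test <- anova( pooled, fe )
f_test
```

Each state intercept can be read as that state's expected number of accidents when taxes are zero, which is an extrapolation if no state-year in the data has taxes near zero.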
Q4: Considering what you know about fixed effect models and the current study, which of these variables would you suggest that your colleague add in the model? Specify why. (10 points)
BONUS QUESTION: We can run a fixed effect model by de-meaning our data and then estimating the model with OLS.
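The de-meaning approach can be sketched as follows, again on made-up data rather than the lab data. Each variable has its state mean subtracted (here with base R's `ave()`, which returns group means by default), and the slope from OLS on the de-meaned variables is compared with the slope from the state-dummy model. The names `taxes_dm` and `accidents_dm` are introduced only for this example.

```r
set.seed( 42 )

# Made-up panel: 7 states x 7 years (hypothetical values, not the lab data)
sim <- expand.grid( year = 2010:2016, state = 1:7 )
sim$taxes <- 0.1 * sim$state + runif( nrow( sim ), 0, 0.1 )
sim$accidents <- 2000 + 100 * sim$state - 200 * sim$taxes +
  rnorm( nrow( sim ), sd = 2 )

# Subtract each state's mean from its own observations
sim$taxes_dm     <- sim$taxes     - ave( sim$taxes,     sim$state )
sim$accidents_dm <- sim$accidents - ave( sim$accidents, sim$state )

# OLS on the de-meaned variables
demeaned <- lm( accidents_dm ~ taxes_dm, data = sim )

# State-dummy version of the fixed effect model, for comparison
dummies <- lm( accidents ~ taxes + factor( state ), data = sim )

# The two slopes agree up to numerical precision
same_slope <- all.equal( coef( demeaned )[[ "taxes_dm" ]],
                         coef( dummies )[[ "taxes" ]] )
same_slope
```

The slopes match, but the standard errors reported by `summary( demeaned )` are somewhat too small, because `lm()` does not know that seven state means were estimated before the regression was run.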