CPP 523
Dependent Variable: Test Scores

| | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 |
|---|---|---|---|---|---|
| | (1) | (2) | (3) | (4) | (5) |
| Classroom Size | -4.22*** | -3.91*** | | -2.67 | -2.22*** |
| | (0.18) | (0.03) | | (1.63) | (0.23) |
| Teacher Quality | | 55.01*** | 55.03*** | | 55.01*** |
| | | (0.25) | (0.26) | | (0.25) |
| Socio-Economic Status | | | 40.94*** | 16.34 | 17.77*** |
| | | | (0.27) | (17.10) | (2.40) |
| Intercept | 738.34*** | 456.70*** | 272.91*** | 665.29*** | 377.26*** |
| | (4.88) | (1.48) | (1.39) | (76.57) | (10.82) |
| Observations | 1,000 | 1,000 | 1,000 | 1,000 | 1,000 |
| Adjusted \(R^2\) | 0.36 | 0.99 | 0.99 | 0.36 | 0.99 |

Standard errors in parentheses. *p<0.1; **p<0.05; ***p<0.01

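Consider what happens to the class size coefficient when socio-economic status (SES) is omitted from the full model, which regresses test scores on class size (CS) and SES: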
EQUATION | MODEL |
---|---|
\(TestScore = B_0 + B_1 \cdot ClassSize + B_2 \cdot SES + e_1\) | \((Full \ Model)\) |
\(TestScore = b_0 + b_1 \cdot ClassSize + e_2\) | \((Naive \ Model)\) |
\(SES = a_0 + a_1 \cdot ClassSize + e_3\) | \((Auxiliary \ Regression)\) |
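
Substituting the auxiliary regression into the full model shows where the bias in the naive slope comes from. The derivation below is a sketch that uses only the symbols defined in the three equations above:

\[
\begin{aligned}
TestScore &= B_0 + B_1 \cdot ClassSize + B_2 \cdot (a_0 + a_1 \cdot ClassSize + e_3) + e_1 \\
          &= (B_0 + a_0 \cdot B_2) + (B_1 + a_1 \cdot B_2) \cdot ClassSize + (e_1 + B_2 \cdot e_3)
\end{aligned}
\]

Matching terms with the naive model gives \(b_0 = B_0 + a_0 \cdot B_2\), \(b_1 = B_1 + a_1 \cdot B_2\), and \(e_2 = e_1 + B_2 \cdot e_3\). The bias in the naive slope is therefore \(b_1 - B_1 = a_1 \cdot B_2\): the slope from the auxiliary regression times the effect of the omitted variable.
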
Dependent Variables

| | TS | TS | SES |
|---|---|---|---|
| | (1) | (2) | (3) |
| Classroom Size | -2.671 | -4.222*** | -0.095*** |
| | (1.632) | (0.176) | (0.0003) |
| Socio-Economic Status | 16.344 | | |
| | (17.098) | | |
| Intercept | 665.289*** | 738.337*** | 4.469*** |
| | (76.574) | (4.879) | (0.009) |
| Observations | 1,000 | 1,000 | 1,000 |
| Adjusted \(R^2\) | 0.365 | 0.365 | 0.988 |

Standard errors in parentheses. *p<0.1; **p<0.05; ***p<0.01

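
Substituting the auxiliary regression into the full model also ties the intercepts together: the naive intercept should equal the full-model intercept plus the auxiliary intercept times the SES slope (b0 = B0 + a0*B2). The sketch below plugs in the rounded estimates from the table above as a quick check. The variable names are illustrative, and a small gap is expected from rounding.

```r
# Rounded estimates copied from the regression table above
B0 <- 665.289   # intercept, full model (TS ~ CS + SES)
B2 <- 16.344    # SES slope, full model
b0 <- 738.337   # intercept, naive model (TS ~ CS)
a0 <- 4.469     # intercept, auxiliary regression (SES ~ CS)

# Naive intercept implied by the full model plus the auxiliary regression
implied.b0 <- B0 + a0 * B2

# Gap between the implied value and the estimated naive intercept
gap <- implied.b0 - b0
round( c( implied=implied.b0, naive=b0, gap=gap ), 3 )
```

The implied and estimated intercepts agree to within about 0.01, which is consistent with rounding in the reported coefficients.
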
```r
URL <- "https://raw.githubusercontent.com/DS4PS/cpp-523-fall-2019/master/labs/class-size-seed-1234.csv"
dat <- read.csv( URL )

m1 <- lm( test ~ csize, data=dat )
m2 <- lm( test ~ csize + tqual, data=dat )
m3 <- lm( test ~ tqual + ses, data=dat )
m4 <- lm( test ~ csize + ses, data=dat )
m5 <- lm( test ~ csize + tqual + ses, data=dat )
```
```r
# FULL MODEL

summary( m4 )

# lm(formula = test ~ csize + ses, data = dat)
# Coefficients:
# ----------------------------------
#              Estimate Std. Error t value Pr(>|t|)
# (Intercept)   665.289     76.574   8.688   <2e-16 ***
# csize          -2.671      1.632  -1.637    0.102
# ses            16.344     17.098   0.956    0.339
# ----------------------------------


# NAIVE MODEL

summary( m1 )

# lm(formula = test ~ csize, data = dat)
# Coefficients:
# ----------------------------------
#              Estimate Std. Error t value Pr(>|t|)
# (Intercept)  738.3366     4.8788  151.34   <2e-16 ***
# csize         -4.2221     0.1761  -23.98   <2e-16 ***
# ----------------------------------


# AUXILIARY REGRESSION

m.auxiliary <- lm( ses ~ csize, data=dat )
summary( m.auxiliary )

# lm(formula = ses ~ csize, data = dat)
# Coefficients:
# ----------------------------------
#               Estimate Std. Error t value Pr(>|t|)
# (Intercept)   4.469458   0.009033   494.8   <2e-16 ***
# csize        -0.094876   0.000326  -291.0   <2e-16 ***
# ----------------------------------
```
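```r
# b1 = B1 + bias
# b1 - B1 = bias

b1 <- -4.22     # class size slope, naive model
B1 <- -2.67     # class size slope, full model
b1 - B1         # -1.55

# bias = a1*B2

a1 <- -0.0949   # class size slope, auxiliary regression
B2 <- 16.34     # SES slope, full model
a1*B2           # -1.550666, equal to b1 - B1 up to rounding
```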
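
The hand-typed values above are rounded, so the two sides match only to two decimals. Pulling the coefficients directly from fitted models shows that the relationship is exact in-sample. The sketch below is an illustration on simulated data, not the course dataset: the data-generating values and the names `sim`, `full`, `naive`, and `aux` are assumptions made for this example.

```r
# Simulated data (hypothetical parameter values, not the course dataset)
set.seed( 1234 )
n     <- 1000
csize <- rnorm( n, mean=27, sd=5 )
ses   <- 4.5 - 0.1*csize + rnorm( n, sd=0.5 )
test  <- 650 - 2.5*csize + 16*ses + rnorm( n, sd=30 )
sim   <- data.frame( test, csize, ses )

# Same three regressions as above, fit to the simulated data
full  <- lm( test ~ csize + ses, data=sim )
naive <- lm( test ~ csize, data=sim )
aux   <- lm( ses ~ csize, data=sim )

# Extract unrounded coefficients by name
B1 <- coef( full )[[ "csize" ]]
B2 <- coef( full )[[ "ses" ]]
b1 <- coef( naive )[[ "csize" ]]
a1 <- coef( aux )[[ "csize" ]]

# Compare the observed bias to the bias formula
bias.observed <- b1 - B1
bias.formula  <- a1 * B2
all.equal( bias.observed, bias.formula )
```

Extracting coefficients with `coef()` avoids transcription and rounding error, so the comparison holds up to floating-point tolerance rather than to two decimals.
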