economics analysis

Post-Regression Analysis
March 16, 2017

Time Series Data

• Primary concern = serial correlation (autocorrelation)
  – Correlation between a variable and its lagged value
    • Appears as correlation in the error terms if lagged values are not included in the regression
  – Tests:
    • Visual: plot the error terms from your regression and look for trends
    • Durbin-Watson test (recommend estat durbinalt in STATA)
  – Solution:
    • Include lagged dependent variables in the regression (and re-test)
• Other concerns = non-stationarity (Dickey-Fuller test)
  – Coefficient estimate on the lagged dependent variable is 1
  – dfuller in STATA

Autocorrelation Tests

    reg lnrealgdp tbillrate ;
    predict resid, residuals ;
    scatter resid timeid ;
    estat durbinalt, lags(1/2) ;
    drop resid ;

    Durbin's alternative test for autocorrelation
    ---------------------------------------------------------------------------
        lags(p)  |          chi2               df                 Prob > chi2
    -------------+-------------------------------------------------------------
           1     |      18580.586               1                   0.0000
           2     |      18591.544               2                   0.0000
    ---------------------------------------------------------------------------
                            H0: no serial correlation

Time Series

    tsset timeid ;
    reg lnrealgdp tbillrate L.lnrealgdp L2.lnrealgdp ;
    predict resid, residuals ;
    scatter resid timeid ;
    estat durbinalt, lags(1/3) ;

    Durbin's alternative test for autocorrelation
    ---------------------------------------------------------------------------
        lags(p)  |          chi2               df                 Prob > chi2
    -------------+-------------------------------------------------------------
           1     |          2.654               1                   0.1033
           2     |          3.952               2                   0.1386
           3     |          4.553               3                   0.2076
    ---------------------------------------------------------------------------
                            H0: no serial correlation

Cross-Sectional Data

• Primary concern = heteroskedasticity
  – Regression error term variance is not constant, conditional on one or more regressors (e.g., error terms display more variance as an independent variable increases)
  – Tests:
    • Visual: plot the error terms from the regression against the suspected independent variable and look for a "fan shape"
    • Breusch-Pagan/Cook-Weisberg test (estat hettest in STATA)
  – Solution:
    • reg yvar xvar1 xvar2, vce(robust) to account for heteroskedasticity in computing standard errors

Heteroskedasticity Tests

    reg course_eval female age minority onecredit beauty intro nnenglish ;
    predict resid, residuals ;
    scatter resid age ;
    estat hettest ;

    Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
             Ho: Constant variance
             Variables: fitted values of course_eval

             chi2(1)      =     0.95
             Prob > chi2  =   0.3288

Even though we didn't find evidence of heteroskedasticity above, we can still run:

    reg course_eval female age minority onecredit beauty intro nnenglish, vce(robust) ;

Panel Data

• Heteroskedasticity and autocorrelation are less of a concern to test with panel data, although vce(robust) is still recommended in the final regression!
• Primary concern: is fixed effects the appropriate model?
  – Tests:
    • Hausman test for fixed effects vs. random effects (hausman fe re in STATA)
      – Reject null hypothesis → fixed effects is appropriate
    • F-test for pooled OLS vs. FE (reported in the xtreg output, or use test if you ran the regression with i.groupid)
      – Reject null hypothesis → fixed effects is appropriate
    • Fixed effects vs. first differences (xtserial yvar xvar1 xvar2, output in STATA)
      – Reject null hypothesis → first differences may be better than fixed effects
      – Irrelevant with only 2 time periods (and generally less critical than the FE vs. RE vs. OLS tests)

Pooled OLS:

    reg fatalityrate sb_useage speed65 speed70 drinkage21 ba08 income age ;

Fixed Effects:

    xtset stateid ;
    xtreg fatalityrate sb_useage speed65 speed70 drinkage21 ba08 income age, fe ;

    F test that all u_i=0:     F(50, 498) =    30.82             Prob > F = 0.0000

    xtset stateid ;
    xtreg fatalityrate sb_useage speed65 speed70 drinkage21 ba08 income age, fe ;
    estimates store fe ;
    xtreg fatalityrate sb_useage speed65 speed70 drinkage21 ba08 income age, re ;
    estimates store re ;
    hausman fe re ;

Reject the null hypothesis, indicating FE is better than RE.

    tsset stateid year ;
    xtserial fatalityrate sb_useage speed65 speed70 drinkage21 ba08 income age, output ;

Reject the null hypothesis of no serial correlation.

This suggests we may want to consider a first-differenced model.
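The slides stop at the suggestion of a first-differenced model without showing one. A minimal sketch of what that regression might look like is given below. It is an illustration, not something taken from the slides, and it rests on several assumptions: commands are terminated with semicolons via #delimit ; as the examples above imply, the panel is declared with stateid and year as in the xtserial example, and every regressor is simply differenced with Stata's D. time-series operator.

```stata
#delimit ;

* Assumption: panel and time variables are stateid and year, as in the xtserial example ;
xtset stateid year ;

* Illustrative first-differenced regression: D. takes within-state differences ;
* between consecutive years, which removes the state fixed effect ;
reg D.fatalityrate D.sb_useage D.speed65 D.speed70 D.drinkage21 D.ba08
    D.income D.age, vce(robust) ;
```

The vce(robust) option follows the recommendation made earlier for the final regression; vce(cluster stateid) is a common alternative when errors may be correlated within a state. Differencing drops the first observation for each state, so the estimation sample will be smaller than in the fixed-effects regression.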