Hello, this is an econometrics homework that uses Stata to solve all equations.

Economics 452

Assignment 8 Interactions


Due: Wednesday, April 27, 2018 at 4pm on Blackboard. Make sure to submit your do and log files to Blackboard!

Part 1: Theoretical Questions

  1. What is heteroskedasticity?

  2. Does heteroskedasticity affect coefficient estimates, standard error estimates, or both?

  3. How do we correct for heteroskedasticity in Stata?




Imagine that we have the following estimated and population regressions:

[Image 1 not recovered: regression equation(s) relating average math scores to AvgInc]

Where Average Income (AvgInc) is measured in 10,000 USD.

[Image 2 not recovered]

  4. By how much does the regression above predict average math scores to change if average income increases from $25,000 to $26,000? Show your work.

  5. Write the null and alternative hypotheses corresponding to the test that the true effect of income on math scores is linear.

  6. Write the null and alternative hypotheses corresponding to the test that income has no effect on math scores.
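
The regression equations referenced above are not reproduced in this copy, so the following is only a sketch of how the unit conversion enters the calculation. It assumes, purely for illustration, that the estimated regression is quadratic in AvgInc; the coefficient names $\hat\beta_0$, $\hat\beta_1$, $\hat\beta_2$ are hypothetical and should be replaced by whatever the actual equation contains. Because AvgInc is measured in 10,000 USD, $25,000 and $26,000 correspond to AvgInc = 2.5 and AvgInc = 2.6:

```latex
% Assumed (hypothetical) functional form, not taken from the assignment:
\[
\widehat{Math} = \hat\beta_0 + \hat\beta_1\,AvgInc + \hat\beta_2\,AvgInc^2
\]
% Predicted change when AvgInc moves from 2.5 to 2.6 (i.e. $25,000 to $26,000):
\[
\Delta\widehat{Math}
  = \hat\beta_1\,(2.6 - 2.5) + \hat\beta_2\,(2.6^2 - 2.5^2)
  = 0.1\,\hat\beta_1 + 0.51\,\hat\beta_2
\]
```

If the actual regression uses a different nonlinear form (for example a logarithm), the same approach applies: evaluate the fitted equation at 2.6 and at 2.5 and take the difference.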

Part 2: Empirical Questions

Setting up:

Download the file “LFP_State_ProblemSet8.dta” from Blackboard. The file contains state-level aggregates based on data from “Giving Mom a Break: The Impact of Higher EITC Payments on Maternal Health” by William N. Evans and Craig L. Garthwaite. The paper uses the OBRA93 expansion of the EITC to understand how it affected health-related outcomes of potentially eligible mothers. The OBRA93 expansion was phased in over the tax years 1994, 1995, and 1996. You should read this paper to fully understand the problem set and what you are doing.

  • The EITC is the Earned Income Tax Credit. It constitutes cash payments for poor families, particularly those with children. It is a federal program that the states can expand.

  • The paper also serves as a nice example of what your own paper should aim to be. Notice the way that the authors describe the dataset that they are using; notice that you could understand the tables without even reading the paper; notice how, in the estimating equation, the authors describe each of the variables that they are using.


The dataset I created includes the labor force participation rates and health outcomes for single women with a high school degree or less by number of children in the household. This dataset is at the state level.

  • In all specifications, use robust standard errors.


Variables and Brief Descriptions

Variable       Description

year           Year
fips           State FIPS code. These are two-digit codes referring to each state.
kidgroup       Indicates the subgroup that the observation refers to; 0 = no children, 1 = one child, 2 = two or more children

Dependent Variables

inlf           Labor force participation rate of the relevant subgroup
alcoholic      Rate of single women in the relevant subgroup who are classified as at-risk for chronic drinking
now_smoke      Rate of single women in the relevant subgroup who currently smoke
bad_phys_30    Rate of single women in the relevant subgroup who report experiencing poor physical health at least one day in the previous month

Characteristics of Single Women with Low Education

white_nh       Portion of the relevant subgroup that consists of non-Hispanic whites
black_nh       Portion of the relevant subgroup that consists of non-Hispanic African-Americans
hispanic       Portion of the relevant subgroup that consists of Hispanics
other          Portion of the relevant subgroup that consists of all other racial/ethnic groups
age21_25       Portion of the relevant subgroup aged between 21 and 25
age25_30       Portion of the relevant subgroup aged between 25 and 30
age30_35       Portion of the relevant subgroup aged between 30 and 35
age35_40       Portion of the relevant subgroup aged between 35 and 40


  1. Data Cleaning and Variable Creation:

  1. Create a dummy variable eitc_expand that equals 0 for all years prior to 1996 and 1 for years 1996 and after.

  2. Create a dummy variable kids that equals 0 if kidgroup equals 0 and a 1 if kidgroup equals 1 or 2.

  3. Create a dummy variable dd_kids by multiplying eitc_expand by kids.

  4. Create a dummy variable twoplus_kids that equals 0 if kidgroup equals 1 and a 1 if kidgroup equals 2.

  5. Create a dummy variable dd_twoplus by multiplying eitc_expand by twoplus_kids.
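
The logic of the five variable definitions above can be sketched as follows. This is an illustration in Python only, not Stata code, and the assignment itself should be completed in a Stata do-file. The function name `add_did_dummies` and the dictionary-per-record layout are hypothetical. One assumption is flagged in the comments: the instructions do not say what twoplus_kids should be when kidgroup equals 0, so the sketch leaves it undefined for that group.

```python
def add_did_dummies(record):
    """Return a copy of one state-year-subgroup record with the five dummies added.

    `record` is a hypothetical dict with at least the keys "year" and "kidgroup".
    """
    out = dict(record)

    # 1 for 1996 and later, 0 for all earlier years.
    out["eitc_expand"] = 1 if record["year"] >= 1996 else 0

    # 0 for women without children, 1 for one child or two-plus children.
    out["kids"] = 0 if record["kidgroup"] == 0 else 1

    # Interaction term used for the first difference-in-differences comparison.
    out["dd_kids"] = out["eitc_expand"] * out["kids"]

    if record["kidgroup"] == 0:
        # ASSUMPTION: the instructions only define twoplus_kids for
        # kidgroup 1 and 2, so it is left undefined (None) here.
        out["twoplus_kids"] = None
        out["dd_twoplus"] = None
    else:
        # 0 for one child, 1 for two or more children.
        out["twoplus_kids"] = 1 if record["kidgroup"] == 2 else 0
        out["dd_twoplus"] = out["eitc_expand"] * out["twoplus_kids"]

    return out


# Usage with a made-up record (not taken from the dataset).
example = add_did_dummies({"year": 1997, "fips": 6, "kidgroup": 2})
print(example)
```

Leaving twoplus_kids undefined for women without children means that comparisons based on it only contrast one-child and two-plus-child households; whether that is what the instructor intends should be confirmed.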

  1. Create the following Difference-in-Differences table by following these steps using real data and estimates:

Raw Difference-in-Differences Estimates, OBRA93 EITC Expansion on Labor Force Participation Rates of Less-Educated Single Women

  • This table should look like the “boxes” method that I have drawn on the board

                              Before OBRA93   After OBRA93   Difference
Single mothers                     1)              2)            3)
Single women, no children          4)              5)            6)
Difference                         7)              8)            9)

    1. In 1), indicate the mean of labor force participation for single mothers before the OBRA93 expansion. Round the mean and standard deviation to 3 decimal places.

    2. In 2), indicate the mean of labor force participation for single mothers after the OBRA93 expansion. Round the mean and standard deviation to 3 decimal places.

    3. In 3), subtract 1) from 2). Round the result to 3 decimal places.

    4. In 4), indicate the mean of labor force participation for single women without children before the OBRA93 expansion. Round the mean and standard deviation to 3 decimal places.

    5. In 5), indicate the mean of labor force participation for single women without children after the OBRA93 expansion. Round the mean and standard deviation to 3 decimal places.

    6. In 6), subtract 4) from 5). Round the result to 3 decimal places.

    7. In 7), subtract 4) from 1). Round the result to 3 decimal places.

    8. In 8), subtract 5) from 2). Round the result to 3 decimal places.

    9. In 9), subtract 6) from 3) OR subtract 7) from 8). Round the result to 3 decimal places. This is the difference-in-differences estimate.
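
The arithmetic in steps 3) and 6) through 9) can be sketched as follows. This is an illustration in Python only, and the four input means are made-up numbers, not values from the dataset; in the assignment they come from the Stata summary statistics for each group and period. The function name `did_boxes` is hypothetical.

```python
def did_boxes(mothers_before, mothers_after, childless_before, childless_after):
    """Fill cells 1) through 9) of the difference-in-differences table.

    The four arguments are group means of labor force participation.
    Differences are computed from the unrounded means and rounded at the end.
    """
    cells = {
        1: mothers_before,
        2: mothers_after,
        4: childless_before,
        5: childless_after,
    }
    cells[3] = cells[2] - cells[1]  # change over time, single mothers
    cells[6] = cells[5] - cells[4]  # change over time, women without children
    cells[7] = cells[1] - cells[4]  # group gap before the expansion
    cells[8] = cells[2] - cells[5]  # group gap after the expansion
    cells[9] = cells[3] - cells[6]  # difference-in-differences
    return {k: round(v, 3) for k, v in sorted(cells.items())}


# Hypothetical means, for illustration only.
table = did_boxes(0.700, 0.760, 0.800, 0.810)
for cell, value in table.items():
    print(f"{cell}) {value:.3f}")
```

Cell 9) is the same whether it is computed as 3) minus 6) or as 8) minus 7), since both equal (2 - 1) - (5 - 4); small discrepancies in the third decimal can appear only if the intermediate cells are rounded before subtracting.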

  1. The basic regression specification you should estimate is as follows:

where Treat_k indicates the treated group, PostOBRA93_t indicates the time period after OBRA93, X_kt contains all other characteristics of the subgroup of women in the dataset, and γ_s are state fixed effects. (Notice how we are writing each of these!)
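
The equation itself is not reproduced in this copy. A standard difference-in-differences specification that is consistent with the definitions just given might look like the sketch below. The outcome symbol $Y_{kst}$, the coefficient names $\beta_0$ through $\beta_3$ and $\delta$, and the error term $\varepsilon_{kst}$ are assumptions made here for illustration; only the treatment indicator, the post-OBRA93 indicator, the characteristics, and the state fixed effects come from the text.

```latex
% Sketch only: coefficient and outcome notation are assumed, not from the assignment.
\[
Y_{kst} = \beta_0
        + \beta_1\,Treat_k
        + \beta_2\,PostOBRA93_t
        + \beta_3\,(Treat_k \times PostOBRA93_t)
        + X_{kt}'\delta
        + \gamma_s
        + \varepsilon_{kst}
\]
```

Under this notation, $\beta_3$ on the interaction term is the difference-in-differences coefficient, which corresponds to dd_kids or dd_twoplus in the variables created earlier.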

  1. Create a table of difference-in-differences estimates (Table 1). Be sure to give your Table 1 a title and provide headers for each column. Table 1 should be constructed as follows:

  • The outcome in all regressions in this table should be ‘in labor force’.

  • In Column 1, use the variable kids to define the treatment group; do not include any characteristics or state fixed effects.

  • In Column 2, run the full specification above using kids to define the treatment group. By full specification, you should include all of the variables listed above under ‘Characteristics of Single Women with Low Education’, and also include a state and a year fixed effect.

  • In Column 3, use the variable twoplus_kids to define the treatment group; do not include any characteristics or state fixed effects.

  • In Column 4, run the full specification above using the variable twoplus_kids to define the treatment group. By full specification, you should include all of the variables listed above under ‘Characteristics of Single Women with Low Education’ and also include a state and a year fixed effect.



  1. Compare the Raw Difference-in-Differences table you created in part b) to the result in Column 1 of Table 1. Do you get the same estimate for the effect of being in the treated group? Do you get the same estimate for the effect of time? Do you get the same difference-in-differences estimate?


  1. Interpret your difference-in-differences estimate in Column 2 and Column 4.


  1. Compare the results of Column 1 to Column 2 and the results of Column 3 to Column 4. Does adding observable characteristics and state fixed effects have an important effect on changing the difference-in-differences estimate? Why?

  1. Now create a Table 2. Table 2 should use the same “X”s as in Table 1, but your outcome variable should be alcoholic.

  1. Interpret your difference-in-differences estimate in Column 2 and Column 4.

  1. Was the OBRA93 EITC expansion associated with changes in these health-related outcomes? Explain.

  1. One way to interpret your estimates for alcohol consumption is to consider what happens to consumption of alcohol when households get additional income. Under this interpretation, what kind of good is alcohol?