Statistics Question

Week 3 Lecture 9

Effect Size

When we reject the null hypothesis with an ANOVA test, two questions arise. The first, which pair of means differs significantly, we have dealt with already. The second, similar to what we asked after rejecting the null hypothesis with the t-test, is: what caused the rejection, the sample size or the variable interactions? This question is again answered using an effect size measure.

Recall that the effect size measure shows how likely it is that the variable interaction caused the null hypothesis rejection. Large values lead us to say the variables caused the outcome, while small values lead us to say the outcome has little to no practical significance, as the sample size was the most likely cause of the rejection of the null.

With the single factor ANOVA, the effect size measure is eta squared, which equals SS(between)/SS(total) (Tanner & Youssef-Morgan, 2013). For our salary example in Lecture 8, eta squared equals 17686.02 (SS(between)) / 18066 (SS(total)) = 0.979 (rounded). Eta squared effect size measures have different interpretation values than Cohen’s d (from the t-test). According to Nandy (2012), a small eta squared effect size has a value of 0.01, a medium of 0.06, and a large value of 0.14 or more. This means we have a large effect size, and the interaction of the salary and grade variables, rather than the sample size, is the most likely cause of our rejecting the null hypothesis.

Side note: Eta squared can also be interpreted as the percent of “differences between group scores that can be explained by the independent variable” (Tanner & Youssef-Morgan, 2013, p. 123). This is consistent with our saying the variable interactions caused the outcome.
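The eta squared arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, the cutoffs are the ones quoted from Nandy (2012), and the "negligible" label for values below 0.01 is an assumption added here, not a term from the lecture.

```python
def eta_squared(ss_effect: float, ss_total: float) -> float:
    """Share of the total sum of squares attributed to the effect."""
    return ss_effect / ss_total


def effect_label(eta_sq: float) -> str:
    """Label an eta squared value using the cutoffs quoted from Nandy (2012)."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"  # assumed label for values under the "small" cutoff


# SS(between) and SS(total) from the Lecture 8 salary example
eta_sq = eta_squared(17686.02, 18066.0)
print(round(eta_sq, 3), effect_label(eta_sq))  # prints: 0.979 large
```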

Different Forms of ANOVA

Just as the t-test has several forms, so does the ANOVA test. Excel has three versions available. While we will focus only on the single factor test, a brief description of the other two versions will be presented.

ANOVA: Two-factor without Replication

The two-factor ANOVA without replication tests mean differences from two different variables at the same time. If we are interested in knowing if the mean salary differs by grade and also by gender, we can perform one two-factor test rather than two separate tests. As mentioned in lecture two for this week, this is more efficient and maintains our desired alpha significance level.

Excel Example. To test the mean salaries by grade and gender at the same time, we would set up our hypothesis test as follows.

Step 1:
Ho1: All salary means are equal across grades.
Ha1: At least one mean differs.
Ho2: All gender (male and female) means are equal.
Ha2: At least one mean differs.
Note that in this test, we need to have a hypothesis statement pair for each variable being tested.

Step 2: ANOVA: Two-Factor Without Replication.

Step 3: Reject the null hypothesis if the p-value is < alpha = .05.

Step 4: Perform the test. While the input screen for this test is identical to that of the one factor test, the data table used is a bit different. As seen below, it has one value for each variable pair cell. Since we have multiple values for each variable pair, this table was set up with the mean values for each group.

          A      B      C      D      E      F
Male     24.3   27.7   43.3   48.0   61.7   75.3
Female   23.3   34.8   41.5   52.5   67.0   76.0

The data entry box would include the entire table, labels and all. The output for this test is:

[Excel output not shown]

Step 5: Conclusions and Interpretation. As with the single factor ANOVA, we start out with a summary table for each variable showing the sum, average, and variance for each variable label. The ANOVA table has an extra row, and one renamed row. The Error row is what we knew as the Within row in the single factor ANOVA. The two rows dedicated to the data are Rows and Columns; these refer to how the variables are presented in the data input table.

The Rows line refers to our gender variable, since that is the row variable in the input. The p-value is 0.16 (rounded), which is greater than our alpha of .05, so we do not reject the null hypothesis of equal means.

The Columns line refers to the grade variable, as that was listed in the column position. This p-value is 3.76E-05, or 0.0000376. This is less than (<) our alpha of .05, so we reject the null hypothesis of equal salary means in each grade. We can find which pair(s) of means differ using the same technique as with the single factor ANOVA discussed in Lecture 8.

The effect size measure for a two-factor ANOVA without replication is generally the same as with the single factor ANOVA. For each variable it would be eta squared = SS(for variable) divided by the SS(total) value (Tanner & Youssef-Morgan, 2013). The effect size for our rejected null hypothesis is 3865.341/3917.059 = .987 (rounded), a very large effect. This means the variable interaction caused the rejection of the null, and we have a practically significant outcome, one we can make decisions with.
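The sums of squares behind this output can be sketched in plain Python from the table of means shown in Step 4. The variable names are illustrative, and because the table values are rounded to one decimal place, the sums of squares come out slightly different from the Excel figures quoted above (about 3868.9 and 3920.7 rather than 3865.341 and 3917.059), although eta squared still rounds to .987. P-values are not computed here because they need the F distribution, which the standard library does not provide.

```python
# Cell means from the lecture's input table (rounded to one decimal place)
grades = ["A", "B", "C", "D", "E", "F"]
table = {
    "Male":   [24.3, 27.7, 43.3, 48.0, 61.7, 75.3],
    "Female": [23.3, 34.8, 41.5, 52.5, 67.0, 76.0],
}

n_rows = len(table)    # gender levels (the Rows variable)
n_cols = len(grades)   # grade levels (the Columns variable)
values = [v for row in table.values() for v in row]
grand_mean = sum(values) / len(values)

row_means = [sum(row) / n_cols for row in table.values()]
col_means = [sum(row[j] for row in table.values()) / n_rows for j in range(n_cols)]

# Partition the total variation into Rows, Columns, and Error
ss_rows = n_cols * sum((m - grand_mean) ** 2 for m in row_means)
ss_cols = n_rows * sum((m - grand_mean) ** 2 for m in col_means)
ss_total = sum((v - grand_mean) ** 2 for v in values)
ss_error = ss_total - ss_rows - ss_cols

df_rows, df_cols = n_rows - 1, n_cols - 1
df_error = df_rows * df_cols
ms_error = ss_error / df_error

# F statistics: each factor's mean square over the Error mean square
f_rows = (ss_rows / df_rows) / ms_error
f_cols = (ss_cols / df_cols) / ms_error

# Effect size: SS(for variable) / SS(total)
eta_sq_rows = ss_rows / ss_total
eta_sq_cols = ss_cols / ss_total

print(f"Rows (gender):   SS={ss_rows:.3f}  F={f_rows:.3f}  eta^2={eta_sq_rows:.3f}")
print(f"Columns (grade): SS={ss_cols:.3f}  F={f_cols:.3f}  eta^2={eta_sq_cols:.3f}")
print(f"Error: SS={ss_error:.3f} df={df_error}   Total: SS={ss_total:.3f}")
```

Note how small the gender eta squared is (under 0.01) next to the grade value, which matches the lecture's reading that grade, not gender, accounts for the variation in these means.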

But let’s go back to the other result, the failure to reject the null hypothesis claiming that the male and female average salaries are equal. What is going on with this outcome? We have clear evidence from the t-test done in Week 2 that the average salaries are not equal. This brings us to the other reason for using this test: to reduce one cause of error or variation in the measurement of a variable. For example, if we think that grade level may be a cause of differences in salary (a reasonable assumption), then we can remove its impact by using this approach. It will take the grade variation out of the overall analysis of salary and include it only in the grade results. What does this mean? We have been concerned that we have not been able to measure salary for “equal work”; this approach does this for us. The salary average difference examined in this test has the impact of grade level differences removed. In essence, the salary that is analyzed is the salary impact of gender if everyone did “equal work” (at least as far as job duties). There are still some questions around the impact of performance ratings, education, seniority, etc. But for now, we have a better view of “equal pay for equal work” salary differences. It appears that perhaps males and females are being paid equally for equal work, on average. Ah, the power of statistics to make things clearer.

ANOVA: Two-factor with Replication (AKA Factorial ANOVA)

This form of the ANOVA test is somewhat different from the previous two forms. While it can test for mean equality (or differences), this is not its primary purpose. The main purpose is to look at the impact of interaction between variables; that is, do the results show different patterns when graphed? Interaction means the variables react differently at different measurement levels (Lind, Marchel, & Wathen, 2008). An example is water and temperature: at cold temperatures water is a solid, at mid-range temperatures it is a liquid, and at high temperatures it is a gas; there is a clear interaction going on. As with the without replication test, an example will help demonstrate this test. We will continue with our gender and grade impact on salary. While our primary research question will be if an interaction between gender and grades impacts salary, we will also repeat our questions about mean salary differences by gender and grade.
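One informal way to see what "different patterns" could look like, before running any test, is to compare the female-minus-male gap in mean salary at each grade. The sketch below reuses the cell means from the without-replication table; the variable names are illustrative. A gap that stays roughly constant across grades would graph as parallel lines, while a gap that changes size or sign hints at a possible interaction. The gaps alone say nothing about significance, which is what the test is for.

```python
# Cell means from the without-replication input table earlier in the lecture
grades = ["A", "B", "C", "D", "E", "F"]
male = [24.3, 27.7, 43.3, 48.0, 61.7, 75.3]
female = [23.3, 34.8, 41.5, 52.5, 67.0, 76.0]

# Female-minus-male gap at each grade, rounded to match the table's precision
gaps = {g: round(f - m, 1) for g, m, f in zip(grades, male, female)}

for grade, gap in gaps.items():
    print(f"Grade {grade}: female - male = {gap:+.1f}")
```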

Excel Example. To test the mean salaries by grade and gender at the same time, we would set up our hypothesis test as follows.

Step 1:
Ho1: All salary means are equal across grades.
Ha1: At least one mean differs.
Ho2: All gender (male and female) means are equal.
Ha2: At least one mean differs.
Ho3: The interaction impact is not significant.
Ha3: The interaction is significant.

Note that in this test, we need to have a hypothesis statement pair for each variable being tested, as well as the interaction.

Step 2: ANOVA: Two-Factor With Replication.

Step 3: Reject the null hypothesis if the p-value is < alpha = .05.

Step 4: Perform the test. The input screen for this test is similar to that of the other ANOVA forms; it also asks for the number of rows for each variable, which, as seen below, would be two. The data table used is a bit different: as seen below, it has multiple values for each cell. Since several grades have only two males or females, we can only use two values in each cell in our table. If your data has more counts per cell, you can include more values. The data entry table was set up with the minimum and maximum salary values for each cell.

          A      B      C      D      E      F
Male     24.0   27.0   40.0   47.0   62.0   72.0
         25.0   28.0   47.0   49.0   66.0   77.0
Female   22.0   34.0   41.0   50.0   65.0   75.0
         24.0   36.0   42.0   55.0   69.0   77.0

The data entry box would include the entire table, labels and all. The output for this test is:

[Excel output not shown]

Step 5: Conclusions and Interpretation. As with the other ANOVA forms, we start out with a summary of the variables. In the ANOVA table itself, we have added another row, this one for Interaction. Where the no replication output table started with a Rows line, we now have a Sample line; both refer to the variable listed in the input table row, gender in our case. We again do not reject our gender null hypothesis, as the p-value is 0.055, greater than our alpha of .05. This test also found that gender average salaries did not significantly differ.

The column, or grade, null hypothesis was rejected with a p-value of 1.07E-11 or 0.0000000000107, which is less than (<) .05. So, our salary averages do differ by grade.

The interaction null hypothesis is not rejected, as the p-value of 0.135 (rounded) is greater than our alpha of .05. This means the salaries do not show a differing pattern by gender-grade groupings. Males and females are treated consistently through the grades, with average salaries growing at each grade jump.
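The with-replication results can be reproduced by hand from the input table in Step 4. The sketch below, with illustrative variable names, splits the total variation into Sample (gender), Columns (grade), Interaction, and Within, then forms each F statistic by dividing by the Within mean square. It stops at the F statistics because p-values need the F distribution; assuming SciPy is available, `scipy.stats.f.sf(F, df_effect, df_within)` would supply them.

```python
grades = ["A", "B", "C", "D", "E", "F"]
# Two salaries (minimum and maximum) per gender-grade cell, from the lecture's table
cells = {
    "Male":   [[24.0, 25.0], [27.0, 28.0], [40.0, 47.0], [47.0, 49.0], [62.0, 66.0], [72.0, 77.0]],
    "Female": [[22.0, 24.0], [34.0, 36.0], [41.0, 42.0], [50.0, 55.0], [65.0, 69.0], [75.0, 77.0]],
}


def mean(xs):
    return sum(xs) / len(xs)


a = len(cells)    # levels of the Sample (row) variable: gender
b = len(grades)   # levels of the Columns variable: grade
r = 2             # replications (rows per sample) in each cell

all_values = [x for row in cells.values() for cell in row for x in cell]
grand = mean(all_values)

row_means = [mean([x for cell in row for x in cell]) for row in cells.values()]
col_means = [mean([x for row in cells.values() for x in row[j]]) for j in range(b)]
cell_means = [mean(cell) for row in cells.values() for cell in row]

ss_sample = b * r * sum((m - grand) ** 2 for m in row_means)
ss_columns = a * r * sum((m - grand) ** 2 for m in col_means)
ss_cells = r * sum((m - grand) ** 2 for m in cell_means)
ss_interaction = ss_cells - ss_sample - ss_columns
ss_total = sum((x - grand) ** 2 for x in all_values)
ss_within = ss_total - ss_cells

df_sample, df_columns = a - 1, b - 1
df_interaction = df_sample * df_columns
df_within = a * b * (r - 1)
ms_within = ss_within / df_within

f_sample = (ss_sample / df_sample) / ms_within
f_columns = (ss_columns / df_columns) / ms_within
f_interaction = (ss_interaction / df_interaction) / ms_within

for name, ss, f_stat in [
    ("Sample (gender)", ss_sample, f_sample),
    ("Columns (grade)", ss_columns, f_columns),
    ("Interaction", ss_interaction, f_interaction),
]:
    # Eta squared is SS for the variable divided by SS(total)
    print(f"{name}: SS={ss:.3f}  F={f_stat:.3f}  eta^2={ss / ss_total:.4f}")
print(f"Within: SS={ss_within:.3f}  df={df_within}")
```

The interaction F statistic is small next to the grade F statistic, in line with the lecture's conclusion that the interaction null hypothesis is not rejected.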

The effect size, eta squared, is calculated the same way as before: the SS for the variable divided by SS(total). The calculation of differences is also done the same way, using SS(within) in the calculations.

The various ANOVA formats can provide us with a lot of information that is hidden in other tests. This is one reason why single variable statistical tests, which cannot separate out distinct sources of variation within our data measurements, often do not provide a complete understanding of the meaning hidden within the measures. More on this in the upcoming weeks.

References

Lakens, D. (Nov. 2013). Calculating and reporting effect sizes to facilitate cumulative science: A practical primer for t-tests and ANOVAs. Frontiers in Psychology. 4:863. doi: 10.3389/fpsyg.2013.00863. Retrieved from http://journal.frontiersin.org/article/10.3389/fpsyg.2013.00863/full

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Nandy, K. (2012). Understanding and Quantifying EFFECT SIZES. Retrieved from http://nursing.ucla.edu/workfiles/research/Effect%20Size%204-9-2012.pdf

Tanner, D. E. & Youssef-Morgan, C. M. (2013). Statistics for Managers. San Diego, CA: Bridgepoint Education.