Statistics Question

Week 4 Lecture 12

Significance

Earlier we discussed correlations without going into how we can identify statistically significant values. Our approach to this uses the t-test. Unfortunately, Excel does not automatically produce this form of the t-test, but setting it up within an Excel cell is fairly easy. And, with some slight algebra, we can determine the minimum value that is statistically significant for any table of correlations all of which have the same number of pairs (for example, a correlation table for our data set would use 50 pairs of values, since we have 50 members in our sample).

The t-test formula for a correlation (r) is t = r * sqrt(n - 2) / sqrt(1 - r^2); the associated degrees of freedom are n - 2 (number of pairs minus 2) (Lind, Marchel, & Wathen, 2008). For some this might look a bit off-putting, but remember that we can translate this into Excel cells and functions and have Excel do the arithmetic for us.

Excel Example

If we go back to our correlation table for Salary, Midpoint, Age, Perf Rat, Service, and Raise, we have:

[Correlation table not included in this text.]

Entering the formula in Excel, with cell references for our key values (r and n), allows us to quickly create a result. The T.DIST.2T function then gives us the p-value easily.
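The same arithmetic can be sketched outside Excel. The Python below is a minimal illustration, not the course's method: the correlation r = 0.45 is a hypothetical value (the correlation table is not reproduced here), and the two-tailed p-value is approximated by numerically integrating the t density with Simpson's rule as a stand-in for Excel's T.DIST.2T.

```python
import math

def correlation_t(r, n):
    """t statistic for a correlation r based on n pairs (requires |r| < 1)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log(1 + x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Approximate two-tailed p-value via Simpson's rule on [0, |t|].

    steps must be even; this is an assumed stand-in for T.DIST.2T.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3          # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central_area)

r, n = 0.45, 50                           # hypothetical correlation, 50 pairs
t = correlation_t(r, n)
p = two_tailed_p(t, n - 2)
print(round(t, 3))                        # 3.491
print(p)                                  # two-tailed p-value for this t
```

A p-value below the chosen alpha (0.05 in this lecture) would mark the correlation as statistically significant.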

The formula to use in finding the minimum correlation value that is statistically significant is r = sqrt(t^2 / (t^2 + n - 2)). We would find the appropriate t value by using T.INV.2T(alpha, df) with alpha = 0.05 and df = n - 2, or 48. Plugging these values into the function gives us a t-value of 2.0106, or 2.011 (rounded). Putting 2.011 and 48 (n - 2) into our formula gives us an r value of 0.279; therefore, in a correlation table based on 50 pairs, any correlation with an absolute value greater than or equal to 0.279 would be statistically significant.

Technical Point

If you are interested in how we obtained the formula for determining the minimum r value, the approach is shown below. If you are not interested in the math, you can safely skip this part.

Start with: t = r * sqrt(n - 2) / sqrt(1 - r^2)
Multiplying both sides by sqrt(1 - r^2) gives us: t * sqrt(1 - r^2) = r * sqrt(n - 2)
Squaring gives us: t^2 * (1 - r^2) = r^2 * (n - 2)
Multiplying out gives us: t^2 - t^2 * r^2 = n * r^2 - 2 * r^2
Adding t^2 * r^2 to both sides gives us: t^2 = n * r^2 - 2 * r^2 + t^2 * r^2
Factoring gives us: t^2 = r^2 * (n - 2 + t^2)
Dividing gives us: t^2 / (n - 2 + t^2) = r^2
Taking the square root gives us: r = sqrt(t^2 / (n - 2 + t^2))

Effect Size Measures

As we have discussed, there is a difference between statistical and practical significance. Virtually any statistic can become statistically significant if the sample is large enough. In practical terms, a correlation of .30 and below is generally considered too weak to be of any practical significance. Additionally, the effect size measure for Pearson's correlation is simply the absolute value of the correlation; the outcome has the same general interpretation as Cohen's d for the t-test (0.8 is strong, and 0.2 is quite weak, for example) (Tanner & Youssef-Morgan, 2013).

Spearman's Rank Correlation

Another type of correlation is Spearman's rank order correlation. This correlation, which is interpreted the same way as Pearson's correlation, can be performed on ordinal or any ranked data.
If the data used is ordinal (rankable), we use Spearman's rank order correlation, rho (Tanner & Youssef-Morgan, 2013). Using the same data, only assuming at least one variable is ordinal, would give us the following results. Note that in ranking from low to high, tied values are each given the average of the ranks they occupy. For example, in the example below the raise of 4.7 occurs twice (the 3rd and 4th places), so both occurrences get a rank of 3.5.
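The average-rank rule for ties can be sketched in a few lines of Python. This is an illustrative helper, not part of the lecture's Excel workflow; the function name average_ranks is made up for this sketch, and the raise values are the ones used in the example that follows.

```python
def average_ranks(values):
    """Rank values from low to high; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2    # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

raises = [3, 3.6, 4.7, 4.7, 4.8, 4.9, 5.6, 5.7, 5.8, 6]
print(average_ranks(raises))
# [1.0, 2.0, 3.5, 3.5, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
```

The two raises of 4.7 occupy positions 3 and 4, so each receives (3 + 4) / 2 = 3.5.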

PR - Rank   Performance Rating   Raise   Raise - Rank   Difference in rank   Difference squared
1           55                   3       1              0                    0
2           75                   3.6     2              0                    0
4           80                   4.7     3.5            0.5                  0.25
9           100                  4.7     3.5            5.5                  30.25
9           100                  4.8     5              4                    16
4           80                   4.9     6              -2                   4
4           80                   5.6     7              -3                   9
9           100                  5.7     8              1                    1
6.5         90                   5.8     9              -2.5                 6.25
6.5         90                   6       10             -3.5                 12.25
                                                                             Sum = 79

Here, the difference in rank is the PR rank minus the raise rank.

Spearman's rank order correlation = 1 - 6 * (sum of squared differences) / (n * (n^2 - 1)). For this data, the sum of squared differences = 79, and n = 10. This gives us a value of 1 - 6 * 79 / (10 * (10^2 - 1)) = 1 - 6 * (79 / (10 * 99)) = 1 - 6 * (79 / 990) = 1 - 6 * 0.08 = 0.52.
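The hand calculation above can be checked with a short Python sketch. The rank lists are copied from the table; the function name spearman_from_ranks is an illustrative choice, and the code applies only the shortcut formula given in the text.

```python
pr_ranks = [1, 2, 4, 9, 9, 4, 4, 9, 6.5, 6.5]
raise_ranks = [1, 2, 3.5, 3.5, 5, 6, 7, 8, 9, 10]

def spearman_from_ranks(x_ranks, y_ranks):
    """Return (sum of squared rank differences, rho) using the shortcut formula."""
    n = len(x_ranks)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(x_ranks, y_ranks))
    rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
    return sum_d2, rho

sum_d2, rho = spearman_from_ranks(pr_ranks, raise_ranks)
print(sum_d2, round(rho, 2))   # 79.0 0.52
```

The shortcut formula is exact only when there are no tied ranks. With ties, as in this data, it is an approximation; applying Pearson's formula to the two rank columns is the usual tie-adjusted alternative and gives a slightly different value.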

For comparison purposes, the Pearson correlation equals 0.686. Note that we have less information about the data when we use ranks, particularly with several ties in the data. This reduced information results in a lower correlation value with Spearman's. This correlation is tested and interpreted the same way as Pearson's coefficient is (Lind, Marchel, & Wathen, 2008).
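The Pearson value quoted above can be reproduced from the raw Performance Rating and Raise columns of the table. This is a minimal sketch using the standard sum-of-products form of Pearson's r; the function name pearson_r is illustrative.

```python
import math

performance = [55, 75, 80, 100, 100, 80, 80, 100, 90, 90]
raises = [3, 3.6, 4.7, 4.7, 4.8, 4.9, 5.6, 5.7, 5.8, 6]

def pearson_r(x, y):
    """Pearson correlation from deviations about the two means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(performance, raises)
print(round(r, 3))   # 0.686
```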

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance (13th ed.). Boston: McGraw-Hill Irwin.

Tanner, D. E., & Youssef-Morgan, C. M. (2013). Statistics for Managers. San Diego, CA: Bridgeport Education.