Statistics in Health Care Management
Week 4 Critical Reflection Paper: Chapter 7
Objective: To critically reflect on your understanding of the readings and your ability to apply them to your Health care

Chapter 7
Hypothesis Testing Procedures

Learning Objectives
• Define null and research hypothesis, test statistic, level of significance and decision rule
• Distinguish between Type I and Type II errors and discuss the implications of each
• Explain the difference between one- and two-sided tests of hypothesis
• Estimate and interpret p-values
• Explain the relationship between confidence interval estimates and p-values in drawing inferences
• Perform analysis of variance by hand
• Appropriately interpret the results of analysis of variance tests
• Distinguish between one and two factor analysis of variance tests
• Perform chi-square tests by hand
• Appropriately interpret the results of chi-square tests
• Identify the appropriate hypothesis testing procedures based on type of outcome variable and number of samples

Hypothesis Testing
• Research hypothesis is generated about unknown population parameter
• Sample data are analyzed and determined to support or refute the research hypothesis

Hypothesis Testing Procedures: Step 1
Null hypothesis (H0): No difference, no change
Research hypothesis (H1): What investigator believes to be true

Hypothesis Testing Procedures: Step 2
Collect sample data and determine whether sample data support research hypothesis or not.

For example, in a test for μ, evaluate X̄.

Hypothesis Testing Procedures: Step 3
• Set up decision rule to decide when to believe null versus research hypothesis
• Depends on level of significance, α = P(Reject H0 | H0 is true)

Hypothesis Testing Procedures: Steps 4 and 5
• Summarize sample information in test statistic (e.g., Z value)
• Draw conclusion by comparing test statistic to decision rule. Provide final assessment as to whether H1 is likely true given the observed data.

P-values
• P-values represent the exact significance of the data
• Estimate p-values when rejecting H0 to summarize significance of the data (can approximate with statistical tables, can get exact value with statistical computing package)
• P-value is the smallest α where we still reject H0

Hypothesis Testing Procedures
1. Set up null and research hypotheses, select α
2. Select test statistic
3. Set up decision rule
4. Compute test statistic
5. Draw conclusion & summarize significance

Errors in Hypothesis Tests

Hypothesis Testing for μ
• Continuous outcome
• 1 Sample
H0: μ = μ0
H1: μ > μ0, μ < μ0, μ ≠ μ0
Test Statistic
n > 30: $Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$ (find critical value in Table 1C)
n < 30: $t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$ (find critical value in Table 2, df = n - 1)

Example 7.2. Hypothesis Testing for μ
The National Center for Health Statistics (NCHS) reports the mean total cholesterol for adults is 203. Is the mean total cholesterol in Framingham Heart Study participants significantly different? In 3310 participants the mean is 200.3 with a standard deviation of 36.8.
1. H0: μ = 203   H1: μ ≠ 203   α = 0.05
2. Test statistic: $Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$
3. Decision rule: Reject H0 if Z > 1.96 or if Z < -1.96
4. Compute test statistic: $Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{200.3 - 203}{36.8/\sqrt{3310}} = -4.22$
5. Conclusion. Reject H0 because -4.22 < -1.96. We have statistically significant evidence at α = 0.05 to show that the mean total cholesterol is different in the Framingham Heart Study participants.

Significance of the findings. Z = -4.22.
Table 1C. Critical Values for Two-Sided Tests
α        Z
0.20     1.282
0.10     1.645
0.05     1.960
0.010    2.576
0.001    3.291
0.0001   3.891
Because |Z| = 4.22 exceeds the largest tabled critical value, p < 0.0001.

New Scenario
• Outcome is dichotomous (p = population proportion)
  – Result of surgery (success, failure)
  – Cancer remission (yes/no)
• One study sample
• Data
  – On each participant, measure outcome (yes/no)
  – n, x = # positive responses, $\hat{p} = x/n$

Hypothesis Testing for p
• Dichotomous outcome
• 1 Sample
H0: p = p0
H1: p > p0, p

Critical Value from Table 3

Example 7.6. χ² goodness-of-fit test
A university survey reveals that 60% of students get no regular exercise, 25% exercise sporadically and 15% exercise regularly. The university institutes a health promotion campaign and re-evaluates exercise one year later.

                     None   Sporadic   Regular
Number of students   255    125        90

1. H0: p1 = 0.60, p2 = 0.25, p3 = 0.15   H1: H0 is false   α = 0.05
2. Test statistic: $\chi^2 = \sum \frac{(O - E)^2}{E}$
3. Decision rule: df = k - 1 = 3 - 1 = 2. Reject H0 if χ² > 5.99
4. Compute test statistic

                    None   Sporadic   Regular   Total
No. students (O)    255    125        90        470
Expected (E)        282    117.5      70.5      470
(O - E)²/E          2.59   0.48       5.39

$\chi^2 = \sum \frac{(O - E)^2}{E} = 2.59 + 0.48 + 5.39 = 8.46$
5. Conclusion. Reject H0 because 8.46 > 5.99. We have statistically significant evidence at α = 0.05 to show that the distribution of exercise is not 60%, 25%, 15%. Using Table 3 with df = 2, 8.46 falls between the critical values for 0.025 (7.38) and 0.01 (9.21), so the p-value is p < 0.025.
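The arithmetic in Example 7.6 can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `chi_square_gof` is a hypothetical helper, not something from the textbook. The closed-form tail probability exp(-x/2) used for the p-value holds only for a chi-square distribution with exactly 2 degrees of freedom, which happens to be the case here.

```python
import math

def chi_square_gof(observed, null_props):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E over all categories."""
    n = sum(observed)
    expected = [n * p for p in null_props]  # E = n * hypothesized proportion
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected

# Observed counts and null proportions from Example 7.6
observed = [255, 125, 90]
null_props = [0.60, 0.25, 0.15]

stat, expected = chi_square_gof(observed, null_props)
df = len(observed) - 1      # k - 1 = 2
critical = 5.99             # decision rule from the example (alpha = 0.05, df = 2)

# Exact upper-tail probability; this shortcut is valid for df = 2 only
p_value = math.exp(-stat / 2)

print("expected:", [round(e, 1) for e in expected])
print("chi-square:", round(stat, 2), "df:", df)
print("reject H0:", stat > critical)
print("p-value:", round(p_value, 4))
```

The statistic matches the hand computation to two decimals, and the exact p-value lands between 0.01 and 0.025.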
New Scenario
• Outcome is continuous
  – SBP, Weight, cholesterol
• Two independent study samples
• Data
  – On each participant, identify group and measure outcome
  – $n_1, \bar{X}_1, s_1$ (or $s_1^2$), $n_2, \bar{X}_2, s_2$ (or $s_2^2$)

Two Independent Samples
[Diagram: RCT. Set of subjects who meet study eligibility criteria are randomized to Treatment 1 or Treatment 2; Mean Trt 1 is compared with Mean Trt 2.]

Two Independent Samples
[Diagram: Cohort study. Set of subjects who meet study inclusion criteria fall into Group 1 or Group 2; Mean Group 1 is compared with Mean Group 2.]

Hypothesis Testing for (μ1 - μ2)
• Continuous outcome
• 2 Independent Samples
H0: μ1 = μ2 (μ1 - μ2 = 0)
H1: μ1 > μ2, μ1 < μ2, μ1 ≠ μ2
Test Statistic
n1 > 30 and n2 > 30: $Z = \frac{\bar{X}_1 - \bar{X}_2}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$ (find critical value in Table 1C)
n1 < 30 or n2 < 30: $t = \frac{\bar{X}_1 - \bar{X}_2}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$ (find critical value in Table 2, df = n1 + n2 - 2)

Pooled Estimate of Common Standard Deviation, Sp
• Previous formulas assume equal variances (σ1² = σ2²)
• If 0.5 < s1²/s2² < 2, assumption is reasonable
$S_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

Example 7.9. Hypothesis Testing for (μ1 - μ2)
A clinical trial is run to assess the effectiveness of a new drug in lowering cholesterol. Patients are randomized to receive the new drug or placebo and total cholesterol is measured after 6 weeks on the assigned treatment.
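The pooled standard deviation and the two-sample statistic can be sketched in a few lines. This is an illustrative sketch using only the standard library; the function names are hypothetical. The inputs are the summary statistics reported for Example 7.9 (new drug: n = 15, mean 195.9, SD 28.7; placebo: n = 15, mean 227.4, SD 30.3).

```python
import math

def equal_variance_reasonable(s1, s2):
    """Rule of thumb from the slides: 0.5 < s1^2 / s2^2 < 2."""
    ratio = s1 ** 2 / s2 ** 2
    return 0.5 < ratio < 2

def pooled_sd(n1, s1, n2, s2):
    """Pooled estimate of the common standard deviation, Sp."""
    return math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))

def two_sample_statistic(n1, mean1, s1, n2, mean2, s2):
    """Test statistic for H0: mu1 = mu2 under the equal-variance assumption."""
    sp = pooled_sd(n1, s1, n2, s2)
    stat = (mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return stat, sp, n1 + n2 - 2

# Summary statistics from Example 7.9 (group 1 = new drug, group 2 = placebo)
variance_ok = equal_variance_reasonable(28.7, 30.3)
t_stat, sp, df = two_sample_statistic(15, 195.9, 28.7, 15, 227.4, 30.3)

print("equal-variance assumption reasonable:", variance_ok)
print("Sp:", round(sp, 1), "t:", round(t_stat, 2), "df:", df)
```

Because both samples have fewer than 30 participants, the statistic is referred to the t distribution (Table 2) with df = 28.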

Is there evidence of a statistically significant reduction in cholesterol for patients on the new drug?

Example 7.9. Hypothesis Testing for (μ1 - μ2)

            Sample Size   Mean    Std Dev
New Drug    15            195.9   28.7
Placebo     15            227.4   30.3

1. H0: μ1 = μ2   H1: μ1

(p<0.005)
$t = \frac{\bar{X}_1 - \bar{X}_2}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{195.9 - 227.4}{29.5\sqrt{\frac{1}{15} + \frac{1}{15}}} = -2.92$

New Scenario
• Outcome is continuous
  – SBP, Weight, cholesterol
• Two matched study samples
• Data
  – On each participant, measure outcome under each experimental condition
  – Compute differences (D = X1 - X2)
  – $n, \bar{X}_d, s_d$

Two Dependent/Matched Samples

Subject ID   Measure 1   Measure 2
1            55          70
2            42          60
.
.
Measures taken serially in time or under different experimental conditions

Crossover Trial
[Diagram: eligible participants, randomization (R), then treatment and placebo periods.]
Each participant measured on treatment and placebo

Hypothesis Testing for μd
• Continuous outcome
• 2 Matched/Paired Samples
H0: μd = 0
H1: μd > 0, μd < 0, μd ≠ 0
Test Statistic
n > 30: $Z = \frac{\bar{X}_d - \mu_d}{s_d/\sqrt{n}}$ (find critical value in Table 1C)
n < 30: $t = \frac{\bar{X}_d - \mu_d}{s_d/\sqrt{n}}$ (find critical value in Table 2, df = n - 1)

Example 7.10. Hypothesis Testing for μd
Is there a statistically significant difference in mean systolic blood pressures (SBPs) measured at exams 6 and 7 (approximately 4 years apart) in the Framingham Offspring Study?

Among n=15 randomly selected participants, the mean difference was -5.3 units and the standard deviation was 12.8 units. Differences were computed by subtracting the exam 6 value from the exam 7 value.
1. H0: μd = 0   H1: μd ≠ 0   α = 0.05
2. Test statistic: $t = \frac{\bar{X}_d - \mu_d}{s_d/\sqrt{n}}$
3. Decision rule, df = n - 1 = 14: Reject H0 if t > 2.145 or if t < -2.145
4. Compute test statistic: $t = \frac{\bar{X}_d - \mu_d}{s_d/\sqrt{n}} = \frac{-5.3 - 0}{12.8/\sqrt{15}} = -1.60$
5. Conclusion. Do not reject H0 because -2.145 < -1.60 < 2.145. We do not have statistically significant evidence at α = 0.05 to show that there is a difference in systolic blood pressures over time.

New Scenario
• Outcome is dichotomous
  – Result of surgery (success, failure)
  – Cancer remission (yes/no)
• Two independent study samples
• Data
  – On each participant, identify group and measure outcome (yes/no)
  – $n_1, \hat{p}_1, n_2, \hat{p}_2$

Hypothesis Testing for (p1 - p2)
• Dichotomous outcome
• 2 Independent Samples
H0: p1 = p2
H1: p1 > p2, p1

1.96
$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

Example 7.2. Hypothesis Testing for (p1 - p2)
4. Compute test statistic
$\hat{p}_1 = \frac{81}{744} = 0.1089, \quad \hat{p}_2 = \frac{298}{3055} = 0.0975$
$\hat{p} = \frac{81 + 298}{744 + 3055} = 0.0998$
$Z = \frac{0.1089 - 0.0975}{\sqrt{0.0998(1 - 0.0998)\left(\frac{1}{744} + \frac{1}{3055}\right)}} = \frac{0.0114}{0.0123} = 0.927$
5. Conclusion. Do not reject H0 because -1.96 < 0.927 < 1.96. We do not have statistically significant evidence at α = 0.05 to show that there is a difference in prevalent CVD between smokers and nonsmokers.

Hypothesis Testing for More than 2 Means*
• Continuous outcome
• k Independent Samples, k > 2
H0: μ1 = μ2 = μ3 = … = μk
H1: Means are not all equal
Test Statistic (find critical value in Table 4)
$F = \frac{\sum n_j(\bar{X}_j - \bar{X})^2/(k - 1)}{\sum\sum (X - \bar{X}_j)^2/(N - k)}$
*Analysis of Variance

Test Statistic - F Statistic
• Comparison of two estimates of variability in data
• Between treatment variation is based on the assumption that H0 is true (i.e., population means are equal)
• Within treatment, residual or error variation is independent of H0 (i.e., we do not assume that the population means are equal and we treat each sample separately)

F Statistic
• Numerator: difference BETWEEN each group mean and overall mean
• Denominator: difference between each observation and its group mean (WITHIN group variation - ERROR)

F Statistic
F = MSB/MSE, where MS = Mean Square
What values of F indicate that H0 is likely true?

Decision Rule
Reject H0 if F > Critical Value of F with df1 = k - 1 and df2 = N - k from Table 4
k = # comparison groups
N = Total sample size

ANOVA Table

Source of Variation    Sums of Squares   df      Mean Squares        F
Between Treatments     SSB               k - 1   MSB = SSB/(k - 1)   MSB/MSE
Error                  SSE               N - k   MSE = SSE/(N - k)
Total                  SST               N - 1

$SSB = \sum n_j(\bar{X}_j - \bar{X})^2$
$SSE = \sum\sum (X - \bar{X}_j)^2$
$SST = \sum\sum (X - \bar{X})^2$

Example 7.14. ANOVA
Is there a significant difference in mean weight loss among 4 different diet programs?
(Data are pounds lost over 8 weeks)

Low-Cal   Low-Fat   Low-Carb   Control
8         2         3          2
9         4         5          2
6         3         4          -1
7         5         2          0
3         1         3          3

1. H0: μ1 = μ2 = μ3 = μ4   H1: Means are not all equal   α = 0.05
2. Test statistic: $F = \frac{\sum n_j(\bar{X}_j - \bar{X})^2/(k - 1)}{\sum\sum (X - \bar{X}_j)^2/(N - k)}$
3. Decision rule: df1 = k - 1 = 4 - 1 = 3, df2 = N - k = 20 - 4 = 16. Reject H0 if F > 3.24

Summary Statistics on Weight Loss by Treatment

        Low-Cal   Low-Fat   Low-Carb   Control
N       5         5         5          5
Mean    6.6       3.0       3.4        1.2

Overall Mean = 3.6

$SSB = \sum n_j(\bar{X}_j - \bar{X})^2 = 5(6.6 - 3.6)^2 + 5(3.0 - 3.6)^2 + 5(3.4 - 3.6)^2 + 5(1.2 - 3.6)^2 = 75.8$

$SSE = \sum\sum (X - \bar{X}_j)^2$ is computed one group at a time:

Low-Cal   (X - 6.6)   (X - 6.6)²
8         1.4         2.0
9         2.4         5.8
6         -0.6        0.4
7         0.4         0.2
3         -3.6        13.0
Total     0           21.4

Low-Fat   (X - 3.0)   (X - 3.0)²
2         -1.0        1.0
4         1.0         1.0
3         0           0
5         2.0         4.0
1         -2.0        4.0
Total     0           10.0

Low-Carb  (X - 3.4)   (X - 3.4)²
3         -0.4        0.2
5         1.6         2.6
4         0.6         0.4
2         -1.4        2.0
3         -0.4        0.2
Total     0           5.4

Control   (X - 1.2)   (X - 1.2)²
2         0.8         0.6
2         0.8         0.6
-1        -2.2        4.8
0         -1.2        1.4
3         1.8         3.2
Total     0           10.6

$SSE = \sum\sum (X - \bar{X}_j)^2 = 21.4 + 10.0 + 5.4 + 10.6 = 47.4$

Source of Variation    Sums of Squares   df   Mean Squares   F
Between Treatments     75.8              3    25.3           8.43
Error                  47.4              16   3.0
Total                  123.2             19

4. Compute test statistic: F = 8.43
5. Conclusion. Reject H0 because 8.43 > 3.24. We have statistically significant evidence at α = 0.05 to show that there is a difference in mean weight loss among 4 different diet programs.

Two Factor ANOVA
• Compare means of a continuous outcome across two grouping variables or factors
  – Overall test: is there a difference in cell means?
  – Factor A: marginal means
  – Factor B: marginal means
  – Interaction: is there a difference in means across levels of Factor B for each level of Factor A?
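Returning to the one-way analysis in Example 7.14, the sums of squares can be recomputed from the raw data. This is a minimal sketch using only the standard library; `one_way_anova` is a hypothetical helper name. Note that the script carries full precision, so it gives SSB = 75.75, SSE = 47.2 and F of about 8.56, slightly different from the hand-computed 75.8, 47.4 and 8.43, which use a rounded overall mean and rounded squared deviations. The conclusion is the same either way.

```python
def one_way_anova(groups):
    """Return SSB, SSE, degrees of freedom and F for k independent samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-treatment variation: group means versus the overall mean
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Error variation: each observation versus its own group mean
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df1, df2 = k - 1, n_total - k
    f_stat = (ssb / df1) / (sse / df2)
    return {"means": means, "ssb": ssb, "sse": sse,
            "df1": df1, "df2": df2, "f": f_stat}

# Pounds lost over 8 weeks, from Example 7.14
low_cal = [8, 9, 6, 7, 3]
low_fat = [2, 4, 3, 5, 1]
low_carb = [3, 5, 4, 2, 3]
control = [2, 2, -1, 0, 3]

result = one_way_anova([low_cal, low_fat, low_carb, control])
critical = 3.24  # decision rule from the example (alpha = 0.05, df = 3 and 16)

print("group means:", [round(m, 1) for m in result["means"]])
print("SSB:", round(result["ssb"], 2), "SSE:", round(result["sse"], 2))
print("F:", round(result["f"], 2), "reject H0:", result["f"] > critical)
```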
Interaction
Cell Means

              Factor B
              1     2     3
Factor A  1   45    58    70
          2   65    55    38

[Plot: cell means (vertical axis, 35 to 75) by level of Factor B (1, 2, 3), with separate lines for A1 and A2.]

No Interaction
Cell Means

              Factor B
              1     2     3
Factor A  1   45    58    70
          2   38    55    65

[Plot: cell means (vertical axis, 35 to 75) by level of Factor B (1, 2, 3), with separate lines for A1 and A2.]

EXAMPLE 7.16 Two Factor ANOVA
• Clinical trial to compare time to pain relief of three competing drugs for joint pain. Investigators hypothesize that there may be a differential effect in men versus women.
• Design
  – N=30 participants (15 men and 15 women) are assigned to 3 treatments (A, B, C)
• Mean times to pain relief by treatment and gender
• Is there a difference in mean times to pain relief? Are differences due to treatment? Gender? Or both?

      Men    Women
A     14.8   21.4
B     17.4   23.2
C     25.4   32.4

EXAMPLE 7.16 Two Factor ANOVA Table

Source of Variation   Sums of Squares   df   Mean Squares   F      p-value
Model                 967.0             5    193.4          20.7   0.0001
Treatment             651.5             2    325.7          34.8   0.0001
Gender                313.6             1    313.6          33.5   0.0001
Treatment*Gender      1.9               2    0.9            0.1    0.9054
Error                 224.4             24   9.4
Total                 1191.4            29

Hypothesis Testing for Categorical or Ordinal Outcomes*
• Categorical or ordinal outcome
• 2 or More Samples
H0: The distribution of the outcome is independent of the groups
H1: H0 is false
Test Statistic (find critical value in Table 3, df = (r - 1)(c - 1))
$\chi^2 = \sum \frac{(O - E)^2}{E}$
* χ² test of independence

Chi-Square Test of Independence
Outcome is categorical or ordinal (2+ levels) and there are two or more independent comparison groups (e.g., treatments).
H0: Treatment and Outcome are Independent (distributions of outcome are the same across treatments)

Example 7.17. χ² Test of Independence
Is there a relationship between students’ living arrangement and exercise status?

                  Exercise Status
                  None   Sporadic   Regular   Total
Dormitory         32     30         28        90
On-campus Apt     74     64         42        180
Off-campus Apt    110    25         15        150
At Home           39     6          5         50
Total             255    125        90        470

1. H0: Living arrangement and exercise status are independent   H1: H0 is false   α = 0.05
2. Test statistic: $\chi^2 = \sum \frac{(O - E)^2}{E}$
3. Decision rule: df = (r - 1)(c - 1) = 3(2) = 6. Reject H0 if χ² > 12.59
4. Compute test statistic
O = Observed frequency
E = Expected frequency
E = (row total)*(column total)/N

Table entries are Observed (Expected) frequencies; for example, the first expected count is 90*255/470 = 48.8.

                  Exercise Status
                  None         Sporadic    Regular     Total
Dormitory         32 (48.8)    30 (23.9)   28 (17.2)   90
On-campus Apt     74 (97.7)    64 (47.9)   42 (34.5)   180
Off-campus Apt    110 (81.4)   25 (39.9)   15 (28.7)   150
At Home           39 (27.1)    6 (13.3)    5 (9.6)     50
Total             255          125         90          470

$\chi^2 = \frac{(32 - 48.8)^2}{48.8} + \frac{(30 - 23.9)^2}{23.9} + \frac{(28 - 17.2)^2}{17.2} + \dots + \frac{(5 - 9.6)^2}{9.6} = 60.5$
5. Conclusion. Reject H0 because 60.5 > 12.59. We have statistically significant evidence at α = 0.05 to show that living arrangement and exercise status are not independent. (P<0.005)
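The expected frequencies and the test statistic in Example 7.17 can be reproduced with a short script. This is a minimal sketch using only the standard library; `chi_square_independence` is a hypothetical helper name. Because the script keeps unrounded expected counts, it returns about 60.4 rather than the hand-computed 60.5; the difference is rounding only.

```python
def chi_square_independence(table):
    """Chi-square test of independence for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # E = (row total) * (column total) / N for every cell
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected

# Rows: Dormitory, On-campus Apt, Off-campus Apt, At Home
# Columns: None, Sporadic, Regular (exercise status)
observed = [
    [32, 30, 28],
    [74, 64, 42],
    [110, 25, 15],
    [39, 6, 5],
]

stat, df, expected = chi_square_independence(observed)
critical = 12.59  # decision rule from the example (alpha = 0.05, df = 6)

for row in expected:
    print([round(e, 1) for e in row])
print("chi-square:", round(stat, 1), "df:", df, "reject H0:", stat > critical)
```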