
Running head: SUN COAST CORRELATION AND REGRESSION ANALYSIS

Sun Coast Correlation and Regression Analysis

Columbia Southern University

Data Analysis: Hypothesis Testing

Correlation and regression tests are two parametric approaches that produce statistics describing the strength and direction of the relationship between variables measured in groups, populations, or samples. Both tests have been applied to the Sun Coast Safety Project data (Creswell & Creswell, 2018).

Correlation: Hypothesis Testing

Ho1: There is no statistically significant relationship between micron and the mean annual sick days per employee.

Ha1: There is a statistically significant relationship between micron and the mean annual sick days per employee.

Correlation data

 

            Column 1    Column 2
Column 1
Column 2    -0.71598

A Pearson correlation coefficient of r = -0.716 shows that there is a moderately strong negative correlation: as one variable increases, the other tends to decrease. It equates to an r² of .51, which means that about 51% of the variance is shared between the independent and dependent variables.

Using an alpha of .05, the test result indicates a p-value of 1.6E-17, which is less than .05. Therefore, the null hypothesis is rejected, and the alternative hypothesis is accepted: there is a statistically significant relationship between micron and the mean annual sick days per employee.

The p-value and multiple R were obtained by running a simple regression on the data in sheet one.
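As a cross-check on the correlation result, the t statistic for a correlation can be recovered from r and the sample size alone. The sketch below uses the Multiple R and Observations values reported in the sheet-one regression output; the negative sign on r is an assumption taken from the negative slope, because Excel reports Multiple R without a sign. The function name is illustrative only.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for testing H0: rho = 0, given Pearson r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Values from the sheet-one regression output. The sign is assumed negative
# because the fitted slope is negative (Multiple R itself is unsigned).
r = -0.719543
n = 102

r_squared = r * r          # share of variance explained
t_stat = t_from_r(r, n)    # has n - 2 = 100 degrees of freedom

print(f"r squared = {r_squared:.4f}")
print(f"t statistic = {t_stat:.3f}")
```

The resulting t statistic should land close to the t Stat reported for the slope (-10.3614), which is what ties the correlation test and the simple regression together.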

Simple regression for sheet one

Regression Statistics
Multiple R            0.719543
R Square              0.517742
Adjusted R Square     0.512919
Standard Error        1.299576
Observations          102

ANOVA
              df      SS          MS          F           Significance F
Regression    1       181.3162    181.3162    107.3578    1.59919E-17
Residual      100     168.8897    1.688897
Total         101     350.2059

              Coefficients    Standard Error  t Stat          P-value         Lower 95%       Upper 95%       Lower 95.0%     Upper 95.0%
Intercept     10.01017        0.309974        32.29362        1.05E-54        9.395194602     10.62515        9.395195        10.62515
              -0.51501        0.049705        -10.3614        1.6E-17         -0.613625981    -0.4164         -0.61363        -0.4164
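The figures in the sheet-one output are linked by simple arithmetic, and recomputing them is a quick way to confirm that the right p-value is being read. The sketch below assumes nothing beyond the values printed in the output; the variable names are illustrative.

```python
# ANOVA values copied from the sheet-one regression output.
ss_regression = 181.3162
ss_residual = 168.8897
ss_total = 350.2059
df_total = 101
df_residual = 100
df_regression = df_total - df_residual  # one predictor

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# R Square is the explained share of the total sum of squares.
r_squared = ss_regression / ss_total

# F is the ratio of the two mean squares.
f_stat = ms_regression / ms_residual

# t Stat for the slope is the coefficient divided by its standard error.
slope, slope_se = -0.51501, 0.049705
t_stat = slope / slope_se

print(f"R Square = {r_squared:.4f}")
print(f"F = {f_stat:.2f}")
print(f"t = {t_stat:.2f}, t squared = {t_stat ** 2:.2f}")
```

With a single predictor, F equals the square of the slope's t statistic, so Significance F and the slope's P-value (both about 1.6E-17) describe the same test. The 1.05E-54 value belongs to the intercept.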

Simple Regression: Hypothesis Testing

Ho2: There is no statistically significant relationship between safety training hours and lost time hours as the predicted outcome.

Ha2: There is a statistically significant relationship between safety training hours and lost time hours as the predicted outcome.

SUMMARY OUTPUT

Regression Statistics
Multiple R            0.939559
R Square              0.882772
Adjusted R Square     0.882241
Standard Error        24.61329
Observations          223

ANOVA
              df      SS          MS          F           Significance F
Regression    1       1008202     1008202     1664.211    7.6586E-105
Residual      221     133884.9    605.814
Total         222     1142087

              Coefficients    Standard Error  t Stat          P-value         Lower 95%       Upper 95%       Lower 95.0%     Upper 95.0%
Intercept     273.4494        2.665262        102.5976        2.1E-188        268.1968373     278.702         268.1968        278.702
X Variable 1  -0.14337        0.003514        -40.7947        7.7E-105        -0.150293705    -0.13644        -0.15029        -0.13644

The multiple R coefficient of R = 0.939 shows that there is a very strong correlation between safety training hours and lost time hours. This equates to an R² of .88, explaining 88% of the variance in lost time hours. The negative coefficient for safety training hours (-0.14337) shows that lost time hours fall as training hours rise. The large ANOVA F value of 1664.211, with a Significance F of 7.66E-105, indicates that the relationship between the independent and dependent variables is statistically significant.

Using an alpha of .05, the results indicate a p-value of 7.7E-105 for the x variable coefficient (safety training hours), which is less than .05. A p-value larger than .05 would indicate that the values do not fit the regression line well (Zou, Tuncali, & Silverman, 2003), which is not the case here. Therefore, the null hypothesis is rejected, and the alternative hypothesis is accepted: there is a statistically significant relationship between safety training hours and lost time hours. The intercept, with a p-value of 2.1E-188, is also statistically significant in the regression model.

Lost time hours = 273.4494 - 0.14337(safety training hours), which indicates that the model is predictive.
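To show how the fitted line is used, the sketch below evaluates the intercept and slope from the output above for a few training-hour values. The input values (0, 100, and 500 hours) are hypothetical and chosen only for illustration, and the function name is not from the original analysis.

```python
# Coefficients taken from the simple regression output above.
INTERCEPT = 273.4494
SLOPE = -0.14337  # change in lost time hours per safety training hour


def predict_lost_time_hours(training_hours: float) -> float:
    """Predicted lost time hours for a given number of safety training hours."""
    return INTERCEPT + SLOPE * training_hours


# Hypothetical training-hour values, for illustration only.
for hours in (0, 100, 500):
    print(hours, round(predict_lost_time_hours(hours), 2))
```

Predictions are only meaningful within the range of training hours actually observed in the data set.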

Multiple Regressions: Hypothesis Testing

Ho3: There is no statistical relationship between decibel as the predicted outcome and frequency, angle in degree, chord length and velocity.

Ha3: There is a statistical relationship between decibel as the predicted outcome and frequency, angle in degree, chord length and velocity.

SUMMARY OUTPUT

Regression Statistics
Multiple R            0.602018
R Square              0.362425
Adjusted R Square     0.360294
Standard Error        5.519422
Observations          1502

ANOVA
              df      SS          MS          F              Significance F
Regression    5       25906.34    5181.267    170.0782477    2.0796E-143
Residual      1496    45574.18    30.46402
Total         1501    71480.51

              Coefficients    Standard Error  t Stat          P-value         Lower 95%       Upper 95%       Lower 95.0%     Upper 95.0%
Intercept     126.8097        0.624161        203.1683                        125.5853697     128.034         125.5854        128.034
800           -0.00112        4.76E-05        -23.4962        3.6392E-104     -0.001211363    -0.00102        -0.00121        -0.00102
              0.046383        0.037337        1.242292        0.214323474     -0.02685487     0.119621        -0.02685        0.119621
0.1809        -5.41565        2.930439        -1.84807        0.064789628     -11.16386013    0.332552        -11.1639        0.332552
71.3          0.083527        0.00931         8.971829        8.51468E-19     0.065265095     0.101789        0.065265        0.101789
0.002663      -240.385        16.52241        -14.549         5.9646E-45      -272.7947344    -207.976        -272.795        -207.976

The multiple R coefficient of R = .60 indicates a moderately strong correlation between decibel as the predicted outcome and the predictors: frequency, angle in degree, chord length and velocity. This equates to an R² of .36, explaining 36% of the variance in decibel.
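Because adding predictors can only raise R Square, the output also reports an Adjusted R Square that penalizes the number of predictors. The sketch below applies the standard adjustment formula to the values in the outputs above. The predictor count of five is an assumption taken from the five coefficient rows under the intercept, and the function name is illustrative.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R Square: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Multiple regression: R Square and Observations from the output above;
# five predictors assumed from the five coefficient rows.
multiple = adjusted_r_squared(0.362425, 1502, 5)

# Simple regression on safety training hours: one predictor.
simple = adjusted_r_squared(0.882772, 223, 1)

print(f"multiple regression: {multiple:.6f}")
print(f"simple regression:   {simple:.6f}")
```

Both results should agree with the Adjusted R Square values that Excel reports (0.360294 and 0.882241), apart from rounding.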

Using an alpha of .05, the results indicate a Significance F of 2.08E-143 for the overall model, which is less than .05. Therefore, the null hypothesis is rejected, and the alternative hypothesis is accepted that there is a statistically significant relationship. The large ANOVA F statistic of 170.08 likewise indicates that at least some variables are statistically significant in the prediction model. Among the x variable coefficients, angle in degree and chord length have p-values of 0.21 and 0.06 respectively, both greater than .05, which shows that they are not statistically significant in the regression model. The other three variables of frequency, velocity, and distribution have p-values far below the alpha value and hence are statistically significant in the regression model (Levine, Berenson, Stephan, & Lysell, 1999).
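The decision rule for each coefficient is a direct comparison of its p-value with alpha. The sketch below applies that rule to the p-values in the coefficient table. The predictor names are an assumption: the output labels its rows with data values rather than variable names, so the names here follow the row order and the variables named in the text.

```python
ALPHA = 0.05

# p-values copied from the coefficient table, in row order.
# The names are assumed from the order of the rows, not read from the output.
p_values = {
    "frequency": 3.6392e-104,
    "angle in degree": 0.214323474,
    "chord length": 0.064789628,
    "velocity": 8.51468e-19,
    "distribution": 5.9646e-45,
}

# A coefficient is statistically significant when its p-value is below alpha.
significant = [name for name, p in p_values.items() if p < ALPHA]
not_significant = [name for name, p in p_values.items() if p >= ALPHA]

print("significant:", significant)
print("not significant:", not_significant)
```

Note that a value such as 3.6392E-104 is in scientific notation and is far smaller than .05; reading only the leading digits (3.6) would reverse the conclusion.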

Therefore, the derived predictive model is:

Decibel = 126.8097 - 0.00112(frequency) + 0.083527(velocity) - 240.385(distribution), which indicates that the model is predictive. Angle in degree and chord length are excluded because they are not statistically significant at the .05 level.

References

Creswell, J. W., & Creswell, J. D. (2018). Research design: Qualitative, quantitative, and mixed methods approaches (5th ed.). Thousand Oaks, CA: Sage.

Levine, D. M., Berenson, M. L., Stephan, D., & Lysell, D. (1999). Statistics for managers using Microsoft Excel (Vol. 660). Upper Saddle River, NJ: Prentice-Hall.

Zou, K. H., Tuncali, K., & Silverman, S. G. (2003). Correlation and simple linear regression. Radiology, 227(3), 617-628.