Running head: SUN COAST CORRELATION AND REGRESSION ANALYSIS
Sun Coast Correlation and Regression Analysis
Columbia Southern University
Data Analysis: Hypothesis Testing
Correlation and regression are two parametric approaches that show and describe the relationship between variables measured on groups, populations, or samples (Creswell & Creswell, 2018). Both tests have been applied to the Sun Coast safety project data.
Correlation: Hypothesis Testing
Ho1: There is no statistically significant relationship between micron and the mean annual sick days per employee.
Ha1: There is a statistically significant relationship between micron and the mean annual sick days per employee.
Correlation data
| | Column 1 | Column 2 |
| Column 1 | 1 | |
| Column 2 | -0.71598 | 1 |
A Pearson correlation coefficient of r = -.716 shows that there is a moderately strong negative correlation: as micron increases, the mean annual sick days per employee decrease. It equates to an r² of .51, which means that micron explains about 51% of the variance in mean annual sick days per employee.
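Using an alpha of .05, the test result indicates a p-value of 1.6E-17 for the micron coefficient, which is far below .05 (the value 1.05E-54 in the output is the p-value of the intercept, written in scientific notation). Therefore, the null hypothesis is rejected, and the alternative hypothesis that there is a statistically significant relationship between micron and the mean annual sick days per employee is accepted.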
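Excel prints very small p-values in scientific notation, so 1.05E-54 stands for 1.05 x 10^-54 rather than 1.05. The following is a minimal Python sketch of the decision rule at an alpha of .05, using the two p-values from the sheet one regression output. The function name `decide` is hypothetical and chosen only for illustration.

```python
ALPHA = 0.05  # significance level used throughout the analysis


def decide(p_value: float, alpha: float = ALPHA) -> str:
    """Return the hypothesis-test decision for a given p-value."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Values copied from the regression output for sheet one.
p_intercept = float("1.05E-54")  # Excel notation for 1.05 x 10^-54
p_slope = float("1.6E-17")       # p-value of the micron coefficient

print("slope:", decide(p_slope))
print("intercept:", decide(p_intercept))
# For contrast: the decision if a p-value really were 1.05 (not possible,
# since p-values lie between 0 and 1).
print("misread value:", decide(1.05))
```

Reading the exponent correctly matters because it changes the decision: a value read as 1.05 would never fall below alpha, while 1.05E-54 always does.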
The P-value and multiple R were obtained by running the data on sheet one using simple regression.
Simple regression for sheet one
Regression Statistics
| Multiple R | 0.719543 |
| R Square | 0.517742 |
| Adjusted R Square | 0.512919 |
| Standard Error | 1.299576 |
| Observations | 102 |

ANOVA
| | df | SS | MS | F | Significance F |
| Regression | 1 | 181.3162 | 181.3162 | 107.3578 | 1.59919E-17 |
| Residual | 100 | 168.8897 | 1.688897 | | |
| Total | 101 | 350.2059 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
| Intercept | 10.01017 | 0.309974 | 32.29362 | 1.05E-54 | 9.395194602 | 10.62515 | 9.395195 | 10.62515 |
| X variable (micron) | -0.51501 | 0.049705 | -10.3614 | 1.6E-17 | -0.613625981 | -0.4164 | -0.61363 | -0.4164 |
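The quantities in the ANOVA table are tied together by simple arithmetic, which gives a quick way to check that the output was copied correctly. The sketch below recomputes the mean squares, F, R Square, and Multiple R from the sums of squares. It assumes the regression degrees of freedom equal the total minus the residual degrees of freedom, which is 1 for a single predictor.

```python
import math

# Values copied from the ANOVA table for sheet one.
ss_regression, ss_residual, ss_total = 181.3162, 168.8897, 350.2059
df_residual, df_total = 100, 101
df_regression = df_total - df_residual  # assumption: one predictor, so df = 1

ms_regression = ss_regression / df_regression  # mean square = SS / df
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual           # F = MS regression / MS residual
r_square = ss_regression / ss_total            # share of variance explained
multiple_r = math.sqrt(r_square)

print("MS residual:", round(ms_residual, 6))
print("F:", round(f_stat, 4))
print("R Square:", round(r_square, 6))
print("Multiple R:", round(multiple_r, 6))
```

The recomputed values should agree with the reported F, R Square, and Multiple R up to rounding in the last digit.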
Simple Regression: Hypothesis Testing
Ho2: There is no statistically significant relationship between safety training hours and lost time hours as the predicted outcome.
Ha2: There is a statistically significant relationship between safety training hours and lost time hours as the predicted outcome.
SUMMARY OUTPUT

Regression Statistics
| Multiple R | 0.939559 |
| R Square | 0.882772 |
| Adjusted R Square | 0.882241 |
| Standard Error | 24.61329 |
| Observations | 223 |

ANOVA
| | df | SS | MS | F | Significance F |
| Regression | 1 | 1008202 | 1008202 | 1664.211 | 7.6586E-105 |
| Residual | 221 | 133884.9 | 605.814 | | |
| Total | 222 | 1142087 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
| Intercept | 273.4494 | 2.665262 | 102.5976 | 2.1E-188 | 268.1968373 | 278.702 | 268.1968 | 278.702 |
| X Variable 1 | -0.14337 | 0.003514 | -40.7947 | 7.7E-105 | -0.150293705 | -0.13644 | -0.15029 | -0.13644 |
The multiple R coefficient of R = .94 shows that there is a very strong correlation between safety training hours and lost time hours; the negative slope indicates that lost time hours fall as safety training hours rise. This equates to an R² of .88, meaning that safety training hours explain 88% of the variance in lost time hours. The large ANOVA F value of 1664.211, with a Significance F of 7.66E-105, indicates that the regression model as a whole is statistically significant.
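In a simple regression with one predictor, the ANOVA F value equals the square of the slope's t Stat, and R Square equals the square of Multiple R. The sketch below checks both identities against the figures in the safety training output; small differences are expected because the printed values are rounded.

```python
import math

# Values copied from the safety training regression output.
multiple_r = 0.939559
r_square_reported = 0.882772
t_slope = -40.7947      # t Stat for X Variable 1
f_reported = 1664.211   # F from the ANOVA table

r_square_from_r = multiple_r ** 2  # R Square = (Multiple R)^2
f_from_t = t_slope ** 2            # with one predictor, F = t^2

print("R Square from Multiple R:", round(r_square_from_r, 6))
print("F from t Stat:", round(f_from_t, 3))
print("F matches:", math.isclose(f_from_t, f_reported, rel_tol=1e-4))
```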
Using an alpha of .05, the results indicate p-values of 2.1E-188 for the intercept and 7.7E-105 for the x variable coefficient, both far below .05. A p-value larger than .05 would indicate that the values do not fit the line well (Zou, Tuncali, & Silverman, 2003); that is not the case here. Therefore, the null hypothesis is rejected, and the alternative hypothesis that there is a statistically significant relationship between safety training hours and lost time hours is accepted. The x variable coefficient is statistically significant in the regression model.
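DV (lost time hours) = 273.4494 - 0.14337 (safety training hours), which indicates that the model is predictive: each additional safety training hour is associated with about 0.14 fewer lost time hours.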
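The intercept and slope from the coefficients table can be turned into a small prediction function. This is only a sketch: the function name and the input values are hypothetical, and predictions are only meaningful for training hours within the range covered by the original data, which is not shown here.

```python
INTERCEPT = 273.4494  # Intercept from the coefficients table
SLOPE = -0.14337      # X Variable 1 (safety training hours)


def predict_lost_time_hours(training_hours: float) -> float:
    """Predicted lost time hours for a given number of safety training hours."""
    return INTERCEPT + SLOPE * training_hours


# Hypothetical inputs, chosen only to illustrate the calculation.
for hours in (0, 100, 500):
    print(hours, round(predict_lost_time_hours(hours), 2))
```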
Multiple Regression: Hypothesis Testing
Ho3: There is no statistically significant relationship between decibel as the predicted outcome and frequency, angle in degrees, chord length, and velocity.
Ha3: There is a statistically significant relationship between decibel as the predicted outcome and frequency, angle in degrees, chord length, and velocity.
SUMMARY OUTPUT

Regression Statistics
| Multiple R | 0.602018 |
| R Square | 0.362425 |
| Adjusted R Square | 0.360294 |
| Standard Error | 5.519422 |
| Observations | 1502 |

ANOVA
| | df | SS | MS | F | Significance F |
| Regression | 5 | 25906.34 | 5181.267 | 170.0782477 | 2.0796E-143 |
| Residual | 1496 | 45574.18 | 30.46402 | | |
| Total | 1501 | 71480.51 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
| Intercept | 126.8097 | 0.624161 | 203.1683 | | 125.5853697 | 128.034 | 125.5854 | 128.034 |
| 800 (frequency) | -0.00112 | 4.76E-05 | -23.4962 | 3.6392E-104 | -0.001211363 | -0.00102 | -0.00121 | -0.00102 |
| (angle in degrees) | 0.046383 | 0.037337 | 1.242292 | 0.214323474 | -0.02685487 | 0.119621 | -0.02685 | 0.119621 |
| 0.1809 (chord length) | -5.41565 | 2.930439 | -1.84807 | 0.064789628 | -11.16386013 | 0.332552 | -11.1639 | 0.332552 |
| 71.3 (velocity) | 0.083527 | 0.00931 | 8.971829 | 8.51468E-19 | 0.065265095 | 0.101789 | 0.065265 | 0.101789 |
| 0.002663 (distribution) | -240.385 | 16.52241 | -14.549 | 5.9646E-45 | -272.7947344 | -207.976 | -272.795 | -207.976 |
The multiple R coefficient of R = .60 indicates a moderately strong correlation between decibel as the predicted outcome and frequency, angle in degrees, chord length, and velocity. This equates to an R² of .36, meaning that the predictors together explain 36% of the variance in decibel.
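With several predictors, Adjusted R Square is the more conservative figure because it penalizes each added variable. The sketch below reproduces it from R Square and the number of observations using the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1). The predictor count k = 5 is an assumption inferred from the five coefficient rows below the intercept in the output.

```python
def adjusted_r_square(r_square: float, n: int, k: int) -> float:
    """Adjusted R^2 for n observations and k predictors."""
    return 1 - (1 - r_square) * (n - 1) / (n - k - 1)


# R Square and Observations from the multiple regression statistics.
# k = 5 is inferred from the number of coefficient rows (assumption).
adj_multiple = adjusted_r_square(0.362425, n=1502, k=5)

# Same check for the safety training regression (one predictor).
adj_simple = adjusted_r_square(0.882772, n=223, k=1)

print("multiple regression:", round(adj_multiple, 6))
print("simple regression:", round(adj_simple, 6))
```

Both results should match the Adjusted R Square values in the respective outputs up to rounding.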
Using an alpha of .05, the overall model has a Significance F of 2.08E-143, which is far below .05. Therefore, the null hypothesis is rejected, and the alternative hypothesis that there is a statistically significant relationship is accepted. The coefficients for angle in degrees and chord length have p-values of .21 and .06, respectively, both greater than .05, which shows that they are not statistically significant in the regression model. The other three variables, frequency, velocity, and distribution, have p-values below the alpha of .05 and are therefore statistically significant in the regression model (Levine, Berenson, Stephan, & Lysell, 1999). The large ANOVA F statistic of 170.08 indicates that at least some variables are statistically significant in the prediction model.
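Comparing each coefficient's p-value with alpha can be done mechanically, which avoids misreading values printed in scientific notation. The sketch below screens the five p-values from the coefficients table. The variable names attached to each row are an assumption based on the order in which the text lists the predictors, since the output itself labels the rows with data values.

```python
ALPHA = 0.05

# P-values copied from the coefficients table, in row order. The names are an
# assumption based on the order in which the text lists the predictors.
p_values = {
    "frequency": 3.6392e-104,
    "angle in degrees": 0.214323474,
    "chord length": 0.064789628,
    "velocity": 8.51468e-19,
    "distribution": 5.9646e-45,
}

# A coefficient is statistically significant when its p-value is below alpha.
significant = [name for name, p in p_values.items() if p < ALPHA]
not_significant = [name for name, p in p_values.items() if p >= ALPHA]

print("significant:", significant)
print("not significant:", not_significant)
```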
Therefore, the derived predictive model is:
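DV (decibel) = 126.8097 - 0.00112 (frequency) + 0.083527 (velocity) - 240.385 (distribution). Angle in degrees and chord length are excluded because their coefficients are not statistically significant at an alpha of .05. The coefficients shown are taken from the full model output; re-running the regression with only the three significant variables may change them slightly. With an R² of .36, the model is statistically significant but only moderately predictive.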
References
Creswell, J. W., & Creswell, J. D. (2018). Research design: Qualitative, quantitative, and mixed methods approaches (5th ed.). Thousand Oaks, CA: Sage.
Levine, D. M., Berenson, M. L., Stephan, D., & Lysell, D. (1999). Statistics for managers using Microsoft Excel (Vol. 660). Upper Saddle River, NJ: Prentice-Hall.
Zou, K. H., Tuncali, K., & Silverman, S. G. (2003). Correlation and simple linear regression. Radiology, 227(3), 617-628.