Running head: DESCRIPTIVE STATISTICS

Descriptive statistics

Stephener Baisey

Columbia Southern University

Data Analysis

Descriptive Data and Assumptions: Correlation

Frequency Distribution Table

PM size	Frequency
0-1
2-4	24
5-7	37
8-10	34

Sick Days	Frequency
0-2
4-7	61
8-9	30
10-12	11

Histogram

Descriptive Statistics Table

microns		sick day

Mean	5.65728155	Mean	7.126214
Standard Error	0.25560014	Standard Error	0.186484
Median		Median
Mode		Mode
Standard Deviation	2.59405814	Standard Deviation	1.892605
Sample Variance	6.72913764	Sample Variance	3.581953
Kurtosis	-0.8521619	Kurtosis	0.124923
Skewness	-0.37325713	Skewness	0.14225
Range	9.8	Range	10
Minimum	0.2	Minimum	2
Maximum	10	Maximum	12
Sum	582.7	Sum	734
Count	103	Count	103
Largest(1)	10	Largest(1)	12
Smallest(1)	0.2	Smallest(1)	2
Confidence Level (95.0%)	0.50698167	Confidence Level (95.0%)	0.36989
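
Several rows of this table can be cross-checked against the others, since the mean is Sum / Count and the standard error is Standard Deviation / √Count. The following is a minimal Python sketch of that check using only the standard library. The dictionary layout and function names are illustrative assumptions, not part of the original analysis.

```python
import math

# Summary values copied from the descriptive statistics table above.
table = {
    "microns": {"sum": 582.7, "count": 103, "sd": 2.59405814},
    "sick day": {"sum": 734, "count": 103, "sd": 1.892605},
}

def mean_from_sum(total, count):
    """Mean recomputed as Sum / Count."""
    return total / count

def standard_error(sd, count):
    """Standard error recomputed as Standard Deviation / sqrt(Count)."""
    return sd / math.sqrt(count)

# Map each variable to its recomputed (mean, standard error) pair.
checks = {
    name: (mean_from_sum(v["sum"], v["count"]), standard_error(v["sd"], v["count"]))
    for name, v in table.items()
}

for name, (m, se) in checks.items():
    print(f"{name}: mean = {m:.4f}, standard error = {se:.4f}")
```

The recomputed values should agree with the Mean and Standard Error rows to the precision reported in the table.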

Kolmogorov-Smirnov Test

The hypotheses used are:

Ho: The sample data do not differ significantly from a normally distributed population.

H1: The sample data differ significantly from a normally distributed population.

Use an alpha of .05 and provide the test statistic and p level here

P > 0.05

P ≤ 0.05

Accept or reject the null hypothesis here.

The null hypothesis is rejected
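
The test statistic and decision above are reported without the underlying calculation. As a hedged illustration, the Kolmogorov-Smirnov statistic can be sketched in Python as the largest gap between the empirical distribution of the sample and a normal distribution fitted to it. The sample values below are hypothetical, not the Sun Coast data, and the critical value 1.36 / √n is a large-sample approximation at an alpha of .05 that assumes the normal parameters were specified in advance. When the mean and standard deviation are estimated from the same sample, as here, a Lilliefors-type correction gives a smaller critical value.

```python
import math
from statistics import NormalDist, mean, stdev

def ks_statistic(sample):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # Compare the fitted CDF with the ECDF just after and just before x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d

def ks_critical_value(n, coefficient=1.36):
    """Large-sample approximation of the critical value at alpha = .05."""
    return coefficient / math.sqrt(n)

# Hypothetical particulate sizes in microns, NOT the Sun Coast data.
sample = [0.2, 1.5, 3.1, 4.8, 5.5, 6.0, 6.9, 7.4, 8.2, 9.0, 9.6, 10.0]

d = ks_statistic(sample)
critical = ks_critical_value(len(sample))
reject_h0 = d > critical  # True would mean the sample departs from normality

print(f"D = {d:.4f}, critical value = {critical:.4f}, reject H0: {reject_h0}")
```

Under this sketch, the null hypothesis of normality is rejected only when D exceeds the critical value, which corresponds to p ≤ .05.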

Measurement Scale

Ratio

Measure of Central Tendency

Mean

Evaluation

The Kolmogorov-Smirnov test statistic indicates that the sample data differ from a normally distributed population.

Assumptions for parametric testing

The assumptions for parametric testing were not met, as the results showed differences at the 95 percent confidence level. First, differences in the data led to differences in the measures of central tendency: the mean was 5.66 for microns and 7.13 for sick days. Although both variables have the same count (103), they also differ in their highest and lowest values. Additionally, the test statistics for the two variables gave contrasting results. Thus, the assumptions for parametric testing remained unmet.

Descriptive Data and Assumptions: Simple Regression

Frequency Distribution Table

Expenditure	Frequency
20-500	108
501-1000	76
1001-1500	27
1501-2000	11
2001-2500

Time	Frequency
0-50
51-100	26
101-200	98
201-300	85
301-400

Histogram

Descriptive Statistics Table

safety training expenditure		lost time hours

Mean	595.9843812	Mean	188.0045
Standard Error	31.4770075	Standard Error	4.803089
Median	507.772	Median	190
Mode	234	Mode	190
Standard Deviation	470.0519613	Standard Deviation	71.72542
Sample Variance	220948.8463	Sample Variance	5144.536
Kurtosis	0.444080195	Kurtosis	-0.50122
Skewness	0.951331922	Skewness	-0.08198
Range	2251.404	Range	350
Minimum	20.456	Minimum	10
Maximum	2271.86	Maximum	360
Sum	132904.517	Sum	41925
Count	223	Count	223
Largest(1)	2271.86	Largest(1)	360
Smallest(1)	20.456	Smallest(1)	10
Confidence Level (95.0%)	62.03197147	Confidence Level (95.0%)	9.465484

Kolmogorov-Smirnov Test

State null and alternative hypotheses for normality here.

H0: The sample data for training expenditure and lost time hours do not differ significantly from a normally distributed population.

H1: The sample data for training expenditure and lost time hours differ significantly from a normally distributed population.

Use an alpha of .05 and provide the test statistic and p level here

P > 0.05

P ≤ 0.05

Accept or reject the null hypothesis here.

We accept the null hypothesis

Measurement Scale

Ratio

Measure of Central Tendency

Median

Evaluation

The p values for both safety training expenditure and lost time hours are high, exceeding the alpha of .05.

Assumptions for parametric testing

The assumptions for parametric testing in the study are met, as indicated by the test statistic. First, there is a large difference between the training expenditure data and the lost time hours data. The statistical test indicates that the p values for both training expenditure and lost time hours are high. However, the data show that lost time hours has a smaller 95% confidence level value (9.47) than training expenditure (62.03). Thus, the statistical tests support the assumptions, as there is a large difference between the two data sets.

Descriptive Data and Assumptions: Multiple Regression

Frequency Distribution Table

Decibel	Frequency
100-106
107-111	51
112-116	126
117-121	249
122-131	786
132-141	287

Histogram

Descriptive Statistics Table

Decibel

Mean	124.8359
Standard Error	0.177945
Median	125.721
Mode	127.315
Standard Deviation	6.898657
Sample Variance	47.59146
Kurtosis	-0.31419
Skewness	-0.41895
Range	37.607
Minimum	103.38
Maximum	140.987
Sum	187628.4
Count	1503

Kolmogorov-Smirnov Test

State null and alternative hypotheses for normality here.

H0: There is no relationship between the X and Y variables.

H0= H1=0

H1 ≠ 0

Use an alpha of .05 and provide the test statistic and p level here

P=0

P>0

Accept or reject the null hypothesis here.

Reject

Measurement Scale

Interval

Measure of Central Tendency

Mean

Evaluation

There is no direct relation between the variables.

Assumptions for parametric testing

The assumptions for parametric testing were unmet, as it is evident that there is no relationship between the variables. In such circumstances there is a null hypothesis for each variable, an indication that the variables do not fit the multiple regression equation. Since the variables are not related, a standard error remains in the data. Since the null hypothesis was untrue, there is a lower probability of obtaining a test statistic based on the data provided. This is because there are two variables instead of three. The parametric assumptions therefore remain unmet, as there is no clear relationship.

Descriptive Data and Assumptions: Independent Samples t Test

Frequency Distribution Table

Prior Training	Frequency
49-60	12
61-70	20
71-80	21
81-90
91-100

Revised Training	Frequency
74-80	14
81-85	21
86-90	19
91-95
96-100

Histogram

Descriptive Statistics Table

Prior Training		Revised Training

Mean	69.79032	Mean	84.77419
Standard Error	1.402788	Standard Error	0.659479
Median	70	Median	85
Mode	80	Mode	85
Standard Deviation	11.04556	Standard Deviation	5.192742
Sample Variance	122.0045	Sample Variance	26.96457
Kurtosis	-0.77668	Kurtosis	-0.35254
Skewness	-0.0868	Skewness	0.144085
Range	41	Range	22
Minimum	50	Minimum	75
Maximum	91	Maximum	97
Sum	4327	Sum	5256
Count	62	Count	62
Largest(1)	91	Largest(1)	97
Smallest(1)	50	Smallest(1)	75
Confidence Level (95.0%)	2.805048	Confidence Level (95.0%)	1.31871

Kolmogorov-Smirnov Test

State null and alternative hypotheses for normality here.

H0=0

H1>0

Use an alpha of .05 and provide the test statistic and p level here

P≠ 0

Accept or reject the null hypothesis here.

Place detailed test data in the appendix.

Measurement Scale

Interval

Measure of Central Tendency

Mean

Evaluation

There is an indirect relationship between the sample data and the normal population.

Assumptions for parametric testing

The assumptions were met. The statistical test indicates that the alpha level is lower than the p value. Note that the values 2.81 (prior training) and 1.32 (revised training) in the descriptive statistics table are 95% confidence level values for the means, not p values, because a p value cannot exceed 1. The p value is greater than 0, which indicates an indirect relationship in the data. The dependent variables were normally distributed. Additionally, there are two groups that are independent of each other: the test scores for the prior training and those for the revised training. Therefore, there is an indirect relationship in the data provided.
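
The Confidence Level (95.0%) values in the table (2.805048 and 1.31871) are margins of error for the two means, which is why they can exceed 1 while a p value cannot. The Python sketch below illustrates this by dividing each margin by its standard error, which gives a multiplier of about 2, and then computes an unequal-variance (Welch) t statistic from the reported means and standard errors. The choice of the Welch form and all function names are assumptions made for illustration. The original analysis does not state which t statistic was used.

```python
import math

# Values copied from the descriptive statistics table above.
prior = {"mean": 69.79032, "se": 1.402788, "margin_95": 2.805048}
revised = {"mean": 84.77419, "se": 0.659479, "margin_95": 1.31871}

def implied_multiplier(group):
    """Ratio of the 95% margin of error to the standard error."""
    return group["margin_95"] / group["se"]

def welch_t(a, b):
    """t statistic for two independent means, not assuming equal variances."""
    return (b["mean"] - a["mean"]) / math.sqrt(a["se"] ** 2 + b["se"] ** 2)

t_stat = welch_t(prior, revised)

print(f"multipliers: {implied_multiplier(prior):.3f}, {implied_multiplier(revised):.3f}")
print(f"Welch t: {t_stat:.3f}")
```

A p value would then be read from the t distribution for this statistic, which requires a statistics package or a t table and is not computed here.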

Descriptive Data and Assumptions: Dependent Samples t Test

Frequency Distribution Table

Pre-Exposure	Frequency
5-15
16-25
26-35	12
36-45	16
46-56

Post-Exposure	Frequency
5-15
16-25
26-35	11
36-45	17
46-56

Histogram

Descriptive Statistics Table

Pre-Exposure μg/dL		Post-Exposure μg/dL

Mean	32.8571429	Mean	33.28571
Standard Error	1.75230655	Standard Error	1.781423
Median	35	Median	36
Mode	36	Mode	38
Standard Deviation	12.2661458	Standard Deviation	12.46996
Sample Variance	150.458333	Sample Variance	155.5
Kurtosis	-0.57603713	Kurtosis	-0.65421
Skewness	-0.42510965	Skewness	-0.48363
Range	50	Range	50
Minimum	6	Minimum	6
Maximum	56	Maximum	56
Sum	1610	Sum	1631
Count	49	Count	49
Largest(1)	56	Largest(1)	56
Smallest(1)	6	Smallest(1)	6
Confidence Level (95.0%)	3.52324845	Confidence Level (95.0%)	3.581792

Kolmogorov-Smirnov Test

State null and alternative hypotheses for normality here.

Ho: u1=0

H1:u1≠0

Use an alpha of .05 and provide the test statistic and p level here

α=0.05

t = (m1 − m2) / (Sd / √n), where Sd / √n = 12.2661458 / √49 = 1.7523

t = (33.28571 − 32.8571429) / 1.7523 = 0.4285671 / 1.7523

t = 0.24457
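
The arithmetic above can be reproduced with a short Python sketch. The first function repeats the calculation as reported, using the pre-exposure standard deviation and count from the descriptive statistics table. The second shows the form a dependent samples t statistic usually takes, based on the standard deviation of the paired differences. That form needs the raw pairs, which are not reproduced in this paper, so the example pairs are hypothetical and the function names are illustrative assumptions.

```python
import math
from statistics import mean, stdev

def t_from_summary(mean_a, mean_b, sd, n):
    """Difference in means divided by sd / sqrt(n), as in the calculation above."""
    return (mean_a - mean_b) / (sd / math.sqrt(n))

def paired_t(pre, post):
    """Dependent samples t statistic computed from the paired differences."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Reported summary values: post mean, pre mean, pre standard deviation, count.
t_reported = t_from_summary(33.28571, 32.8571429, 12.2661458, 49)

# Hypothetical paired readings in μg/dL, NOT the Sun Coast data.
pre_example = [30, 35, 28, 40, 33]
post_example = [31, 37, 27, 42, 36]
t_paired_example = paired_t(pre_example, post_example)

print(f"t from summary values: {t_reported:.5f}")
print(f"t from hypothetical pairs: {t_paired_example:.3f}")
```

The two forms generally give different results, because the paired version removes the variation between employees and keeps only the variation in their individual changes.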

Accept or reject the null hypothesis here.

Measurement Scale

Interval

Measure of Central Tendency

Mean

Evaluation

The null hypothesis is accepted, as the calculated t statistic (0.24457) is lower than the critical t value.

Assumptions for parametric testing

The assumptions for parametric testing were met. The data consisted of a dependent variable that was continuous and measured on a ratio scale. Additionally, the observations were independent of one another, and the dependent variables were normally distributed. A comparison of the two means shows a difference of 0.4285671. An evaluation of the statistical test indicates that the critical t value is greater than the calculated t value. This difference between the critical and calculated t values leads to acceptance of the null hypothesis.

Descriptive Data and Assumptions: ANOVA

Frequency Distribution Table

Air	Frequency
1-3
4-6
7-9
10-12
12-15

Soil	Frequency
5-7
8-10	13
10-13

Water	Frequency
1-3
4-6	10
7-9
10-12

Training	Frequency
1-3
4-6	16
7-9

Histogram

Descriptive Statistics Table

A = Air		B = Soil

Mean	8.9	Mean	9.1
Standard Error	0.684028	Standard Error	0.390007
Median		Median
Mode	11	Mode
Standard Deviation	3.059068	Standard Deviation	1.744163
Sample Variance	9.357895	Sample Variance	3.042105
Kurtosis	-0.6283	Kurtosis	0.11923
Skewness	-0.36085	Skewness	0.492002
Range	11	Range
Minimum		Minimum
Maximum	14	Maximum	13
Sum	178	Sum	182
Count	20	Count	20
Largest(1)	14	Largest(1)	13
Smallest(1)		Smallest(1)
Confidence Level(95.0%)	1.431688	Confidence Level(95.0%)	0.816294

C = Water		D = Training

Mean	7	Mean	5.4
Standard Error	0.575829	Standard Error	0.265568
Median		Median
Mode		Mode
Standard Deviation	2.575185	Standard Deviation	1.187656
Sample Variance	6.631579	Sample Variance	1.410526
Kurtosis	-0.23752	Kurtosis	0.253747
Skewness	0.760206	Skewness	0.159183
Range		Range
Minimum		Minimum
Maximum	12	Maximum
Sum	140	Sum	108
Count	20	Count	20
Largest(1)	12	Largest(1)
Smallest(1)		Smallest(1)
Confidence Level (95.0%)	1.205224	Confidence Level (95.0%)	0.55584

Kolmogorov-Smirnov Test

State null and alternative hypotheses for normality here.

H0: There is no difference of the means.

H1: Means are not all equal

Use an alpha of .05 and provide the test statistic and p level here

α=0.05

Test statistic of water to training

(7 − 5.4)² = 2.56

Test statistic of air to soil

(8.9 − 9.1)² = 0.04
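
The squared differences above compare two means at a time. ANOVA more commonly reports a single F statistic, the ratio of between-group to within-group variance. The Python sketch below shows how F could be computed from the Sum, Count, and Sample Variance rows of the descriptive statistics tables. It is an illustration under that assumption, not the calculation used in the original analysis, and the function name is hypothetical.

```python
# Sum, Count and Sample Variance copied from the descriptive statistics tables.
groups = {
    "Air": {"sum": 178, "count": 20, "variance": 9.357895},
    "Soil": {"sum": 182, "count": 20, "variance": 3.042105},
    "Water": {"sum": 140, "count": 20, "variance": 6.631579},
    "Training": {"sum": 108, "count": 20, "variance": 1.410526},
}

def one_way_f(groups):
    """Return (F, df_between, df_within) from per-group summary statistics."""
    total_n = sum(g["count"] for g in groups.values())
    grand_mean = sum(g["sum"] for g in groups.values()) / total_n
    # Between-group sum of squares: group size times squared distance
    # of each group mean from the grand mean.
    ss_between = sum(
        g["count"] * (g["sum"] / g["count"] - grand_mean) ** 2
        for g in groups.values()
    )
    # Within-group sum of squares: (n - 1) times each sample variance.
    ss_within = sum((g["count"] - 1) * g["variance"] for g in groups.values())
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f_stat, df_between, df_within = one_way_f(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

The resulting F would be compared with the critical value of the F distribution at an alpha of .05 for the same degrees of freedom.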

Accept or reject the null hypothesis here.

Measurement Scale

Ratio

Measure of Central Tendency

Mean

Evaluation

The means are not all equal, as they are based on different sets of data.

Assumptions for parametric testing

Based on the data provided, the assumptions that can be derived are normality, equal variance, and independent errors. In the data, the variables interact with no restrictions. The parametric assumptions in this scenario relate to the parameters of the population distribution from which the data are drawn, whereas a non-parametric test makes no such assumptions. The assumptions are therefore a normal distribution, homogeneity of variances across the multiple groups, and linearity of the independent relationships. Thus, the assumptions determine the type of variance analysis.
