Running head: DESCRIPTIVE STATISTICS

Descriptive statistics

Stephener Baisey

Columbia Southern University

Data Analysis

Descriptive Data and Assumptions: Correlation

Frequency Distribution Table

PM size      Frequency
0-1
2-4          24
5-7          37
8-10         34

Sick Days    Frequency
0-2
4-7          61
8-9          30
10-12        11
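
As a rough illustration of how a frequency distribution like the ones above can be tallied, the sketch below counts values into inclusive class intervals. The values in `sample` are hypothetical placeholders rather than the Sun Coast data, and the function name `frequency_table` is an assumption made for this example.

```python
def frequency_table(values, bins):
    """Count how many values fall in each inclusive (low, high) class interval."""
    counts = {f"{low}-{high}": 0 for low, high in bins}
    for v in values:
        for low, high in bins:
            if low <= v <= high:
                counts[f"{low}-{high}"] += 1
                break  # each value is counted in at most one class
    return counts


# Hypothetical whole-number sick-day counts, for illustration only.
sample = [0, 2, 4, 5, 7, 8, 9, 10, 12, 6]
# Contiguous integer classes, so every whole number from 0 to 12 has a class.
bins = [(0, 3), (4, 7), (8, 9), (10, 12)]

table = frequency_table(sample, bins)
for label, count in table.items():
    print(label, count)
```

Classes that leave a gap between them (for example 0-2 followed by 4-7) would silently drop any value of 3, so contiguous classes are the safer choice.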

Histogram

Descriptive Statistics Table

Statistic                  microns        sick day
Mean                       5.65728155     7.126214
Standard Error             0.25560014     0.186484
Median
Mode
Standard Deviation         2.59405814     1.892605
Sample Variance            6.72913764     3.581953
Kurtosis                   -0.8521619     0.124923
Skewness                   -0.37325713    0.14225
Range                      9.8            10
Minimum                    0.2            2
Maximum                    10             12
Sum                        582.7          734
Count                      103            103
Largest(1)                 10             12
Smallest(1)                0.2            2
Confidence Level (95.0%)   0.50698167     0.36989
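
Several rows of the descriptive statistics table are tied together by simple identities, which makes them easy to cross-check. The sketch below uses only the microns values reported above and recomputes the mean from Sum and Count, the standard error from the standard deviation, and the sample variance from the standard deviation. The variable names are chosen for this example.

```python
import math

# Values copied from the microns column of the descriptive statistics table.
total = 582.7
count = 103
std_dev = 2.59405814

mean = total / count                      # Mean = Sum / Count
std_error = std_dev / math.sqrt(count)    # Standard Error = SD / sqrt(n)
variance = std_dev ** 2                   # Sample Variance = SD squared

print(round(mean, 6), round(std_error, 6), round(variance, 5))
```

The three recomputed values agree with the Mean, Standard Error, and Sample Variance rows to the precision shown in the table.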

Kolmogorov-Smirnov Test

The hypotheses used are:

H0: The sample data do not differ significantly from a normally distributed population.

H1: The sample data differ significantly from a normally distributed population.

Use an alpha of .05 and provide the test statistic and p level here

P > 0.05

P ≤ 0.05

Accept or reject the null hypothesis here.

The null hypothesis is rejected

Measurement Scale

Ratio

Measure of Central Tendency

Mean

Evaluation

The descriptive statistics above show differences: according to the test statistic, the sample data depart from a normal population.

Assumptions for parametric testing

The assumptions for parametric testing were not met, as the results differed at the 95 percent confidence level. First, the two data sets differ, which leads to different measures of central tendency: the mean is 5.66 for microns and 7.13 for sick days. Although the counts are identical (103 each), the ranges between the highest and lowest values also differ. Additionally, the test statistics for the two variables gave contrasting results. Thus, the assumptions for parametric testing remained unmet.
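
To make the normality check more concrete, the following is a minimal sketch of the one-sample Kolmogorov-Smirnov statistic D, the largest gap between the empirical distribution of a sample and a normal distribution. The `readings` list is hypothetical, not the study data, and `ks_statistic` is a name invented for this example. Because the mean and standard deviation are estimated from the same sample, standard K-S tables are only approximate (the Lilliefors variant addresses this); the p value would come from tables or statistical software, which this sketch does not attempt.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic(sample, mu, sigma):
    """One-sample Kolmogorov-Smirnov D against Normal(mu, sigma)."""
    dist = NormalDist(mu, sigma)
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = dist.cdf(x)
        # Largest gap between the empirical step function and the normal CDF,
        # checked just after and just before each step.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


# Hypothetical readings, for illustration only.
readings = [0.2, 2.1, 3.4, 4.8, 5.5, 6.0, 6.9, 7.7, 8.3, 10.0]
d = ks_statistic(readings, mean(readings), stdev(readings))
print(round(d, 4))
```

A small D means the sample tracks the normal curve closely; a large D is evidence against normality.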

Descriptive Data and Assumptions: Simple Regression

Frequency Distribution Table

Expenditure    Frequency
20-500         108
501-1000       76
1001-1500      27
1501-2000      11
2001-2500

Time           Frequency
0-50
51-100         26
101-200        98
201-300        85
301-400

Histogram

Descriptive Statistics Table

Statistic                  safety training expenditure    lost time hours
Mean                       595.9843812                    188.0045
Standard Error             31.4770075                     4.803089
Median                     507.772                        190
Mode                       234                            190
Standard Deviation         470.0519613                    71.72542
Sample Variance            220948.8463                    5144.536
Kurtosis                   0.444080195                    -0.50122
Skewness                   0.951331922                    -0.08198
Range                      2251.404                       350
Minimum                    20.456                         10
Maximum                    2271.86                        360
Sum                        132904.517                     41925
Count                      223                            223
Largest(1)                 2271.86                        360
Smallest(1)                20.456                         10
Confidence Level (95.0%)   62.03197147                    9.465484

Kolmogorov-Smirnov Test

State null and alternative hypotheses for normality here.

H0: The sample data for training expenditure and lost time hours do not differ significantly from a normally distributed population.

H1: The sample data for training expenditure and lost time hours differ significantly from a normally distributed population.

Use an alpha of .05 and provide the test statistic and p level here

P > 0.05

P ≤ 0.05

Accept or reject the null hypothesis here.

We accept the null hypothesis

Measurement Scale

Ratio

Measure of Central Tendency

Median

Evaluation

The p values for both training expenditure and lost time hours are high, exceeding the .05 alpha level.

Assumptions for parametric testing

The assumptions for parametric testing in the study appear to be met, as indicated by the test statistic. First, the training expenditure data and the lost time hours data differ greatly. The statistical test indicates that the p values for both training expenditure and lost time hours are high. However, lost time hours has a narrower 95 percent confidence interval (mean ± 9.47) than training expenditure (mean ± 62.03). Thus, the statistical tests support the assumptions for both data sets, despite the large difference between them.
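
The comparison of confidence intervals can be illustrated with the figures reported in the descriptive statistics table. The sketch below builds each interval as mean ± the Confidence Level (95.0%) value and also expresses each margin as a fraction of its mean, since the two variables are on different scales. The helper name `interval` and the variable names are assumptions made for this example.

```python
def interval(mean, margin):
    """Return the (lower, upper) bounds of mean plus or minus margin."""
    return mean - margin, mean + margin


# Mean and Confidence Level (95.0%) rows from the descriptive statistics table.
expenditure = interval(595.9843812, 62.03197147)
lost_hours = interval(188.0045, 9.465484)

# Margin as a fraction of the mean, to compare variables on different scales.
rel_expenditure = 62.03197147 / 595.9843812
rel_lost_hours = 9.465484 / 188.0045

print(expenditure, lost_hours)
print(round(rel_expenditure, 3), round(rel_lost_hours, 3))
```

On this relative basis the margin for lost time hours is also the smaller of the two, in line with the paragraph above.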

Descriptive Data and Assumptions: Multiple Regression

Frequency Distribution Table

Decibel     Frequency
100-106
107-111     51
112-116     126
117-121     249
122-131     786
132-141     287

Histogram

Descriptive Statistics Table

Statistic             Decibel
Mean                  124.8359
Standard Error        0.177945
Median                125.721
Mode                  127.315
Standard Deviation    6.898657
Sample Variance       47.59146
Kurtosis              -0.31419
Skewness              -0.41895
Range                 37.607
Minimum               103.38
Maximum               140.987
Sum                   187628.4
Count                 1503

Kolmogorov-Smirnov Test

State null and alternative hypotheses for normality here.

H0: There is no relationship between the X and Y variables (slope = 0).

H1: There is a relationship between the X and Y variables (slope ≠ 0).

Use an alpha of .05 and provide the test statistic and p level here

P=0

P>0

Accept or reject the null hypothesis here.

Reject

Measurement Scale

Interval

Measure of Central Tendency

Mean

Evaluation

There is no direct relation between the variables.

Assumptions for parametric testing

The assumptions for parametric testing were unmet, as it is evident that there is no relationship between the variables. In such circumstances there is a null hypothesis for each variable, an indication that the variables do not fit the multiple regression equation. Since the variables are unrelated, a standard error remains in the data. Since the null hypothesis was untrue, there is a lower probability of obtaining a test statistic from the data provided. This is because there are two variables rather than three. The parametric assumptions therefore remain unmet, as there is no clear relationship.

Descriptive Data and Assumptions: Independent Samples t Test

Frequency Distribution Table

Prior Training      Frequency
49-60               12
61-70               20
71-80               21
81-90
91-100

Revised Training    Frequency
74-80               14
81-85               21
86-90               19
91-95
96-100

Histogram

Descriptive Statistics Table

Statistic                  Prior Training    Revised Training
Mean                       69.79032          84.77419
Standard Error             1.402788          0.659479
Median                     70                85
Mode                       80                85
Standard Deviation         11.04556          5.192742
Sample Variance            122.0045          26.96457
Kurtosis                   -0.77668          -0.35254
Skewness                   -0.0868           0.144085
Range                      41                22
Minimum                    50                75
Maximum                    91                97
Sum                        4327              5256
Count                      62                62
Largest(1)                 91                97
Smallest(1)                50                75
Confidence Level (95.0%)   2.805048          1.31871

Kolmogorov-Smirnov Test

State null and alternative hypotheses for normality here.

H0=0

H1>0

Use an alpha of .05 and provide the test statistic and p level here

P≠ 0

Accept or reject the null hypothesis here.

Accept

Measurement Scale

Interval

Measure of Central Tendency

Mean

Evaluation

There is an indirect relationship between the sample data and the normal population.

Assumptions for parametric testing

The assumptions were met. The statistical tests indicate that the alpha level (.05) is lower than the p value. For instance, the 95 percent confidence level (the margin of error around the mean) is 2.81 for the prior training scores and 1.32 for the revised training scores. The p value is greater than 0, which indicates an indirect relationship in the data. The dependent variables were normally distributed. Additionally, the two groups, the test scores for the prior training and those for the revised training, are independent of each other. Therefore, there is an indirect relationship in the data provided.
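
For readers who want to see how the two independent groups would be compared once the assumptions are accepted, the sketch below computes a t statistic from the Mean and Standard Error rows of the descriptive statistics table. It uses the unequal-variance (Welch) form of the standard error, which is an assumption of this example and not necessarily the procedure used in the study; with equal group sizes, as here (62 each), the pooled-variance form gives the same t value.

```python
import math

# Mean and Standard Error rows from the descriptive statistics table.
prior_mean, prior_se = 69.79032, 1.402788
revised_mean, revised_se = 84.77419, 0.659479

# Standard error of the difference between two independent means.
se_diff = math.sqrt(prior_se ** 2 + revised_se ** 2)
t_stat = (revised_mean - prior_mean) / se_diff

print(round(t_stat, 2))
```

The resulting t value would then be compared with a critical value from a t table at the chosen alpha of .05.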

Descriptive Data and Assumptions: Dependent Samples t Test

Frequency Distribution Table

Exposure    Frequency
5-15
16-25
26-35       12
36-45       16
46-56

Exposure    Frequency
5-15
16-25
26-35       11
36-45       17
46-56

Histogram

Descriptive Statistics Table

Statistic                  Pre-Exposure μg/dL    Post-Exposure μg/dL
Mean                       32.8571429            33.28571
Standard Error             1.75230655            1.781423
Median                     35                    36
Mode                       36                    38
Standard Deviation         12.2661458            12.46996
Sample Variance            150.458333            155.5
Kurtosis                   -0.57603713           -0.65421
Skewness                   -0.42510965           -0.48363
Range                      50                    50
Minimum                    6                     6
Maximum                    56                    56
Sum                        1610                  1631
Count                      49                    49
Largest(1)                 56                    56
Smallest(1)                6                     6
Confidence Level (95.0%)   3.52324845            3.581792

Kolmogorov-Smirnov Test

State null and alternative hypotheses for normality here.

H0: μ1 = 0

H1: μ1 ≠ 0

Use an alpha of .05 and provide the test statistic and p level here

α = 0.05

$t = \dfrac{m_1 - m_2}{S_d / \sqrt{n}}$

$t = \dfrac{33.28571 - 32.8571429}{1.7523} = \dfrac{0.4285671}{1.7523} = 0.24457$
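
The calculation above can be reproduced from the descriptive statistics table, as in the short sketch below. It follows the figures used in the text, where the denominator 1.7523 is the pre-exposure standard deviation divided by the square root of the count. A paired t test is normally based on the standard deviation of the paired differences, which is not reported here, so this sketch only reproduces the arithmetic shown and makes no claim beyond that. The variable names are chosen for this example.

```python
import math

# Values from the descriptive statistics table.
pre_mean, post_mean = 32.8571429, 33.28571
pre_sd, n = 12.2661458, 49

std_error = pre_sd / math.sqrt(n)            # 12.2661458 / 7
t_stat = (post_mean - pre_mean) / std_error  # difference of means over SE

print(round(std_error, 4), round(t_stat, 5))
```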

Accept or reject the null hypothesis here.

Accept

Measurement Scale

Interval

Measure of Central Tendency

Mean

Evaluation

The null hypothesis is accepted because the calculated t statistic (0.24457) is smaller than the critical t value.

Assumptions for parametric testing

The assumptions for parametric testing were met. The data consisted of dependent variables that were continuous and measured on a ratio scale. Additionally, the observations collected were independent of one another, and the dependent variables were normally distributed. A comparison of the two means shows a small difference between them of 0.4285671. An evaluation of the statistical test indicates that the critical t value is greater than the calculated t value. This gap between the critical and calculated t values leads to acceptance of the null hypothesis.


Descriptive Data and Assumptions: ANOVA

Frequency Distribution Table

Air         Frequency
1-3
4-6
7-9
10-12
12-15

Soil        Frequency
5-7
8-10        13
10-13

Water       Frequency
1-3
4-6         10
7-9
10-12

Training    Frequency
1-3
4-6         16
7-9

Histogram

Descriptive Statistics Table

Statistic                  A = Air      B = Soil
Mean                       8.9          9.1
Standard Error             0.684028     0.390007
Median
Mode                       11
Standard Deviation         3.059068     1.744163
Sample Variance            9.357895     3.042105
Kurtosis                   -0.6283      0.11923
Skewness                   -0.36085     0.492002
Range                      11
Minimum                    3
Maximum                    14           13
Sum                        178          182
Count                      20           20
Largest(1)                 14           13
Smallest(1)                3
Confidence Level (95.0%)   1.431688     0.816294

Statistic                  C = Water    D = Training
Mean                       7            5.4
Standard Error             0.575829     0.265568
Median
Mode
Standard Deviation         2.575185     1.187656
Sample Variance            6.631579     1.410526
Kurtosis                   -0.23752     0.253747
Skewness                   0.760206     0.159183
Range
Minimum
Maximum                    12
Sum                        140          108
Count                      20           20
Largest(1)                 12
Smallest(1)
Confidence Level (95.0%)   1.205224     0.55584

Kolmogorov-Smirnov Test

State null and alternative hypotheses for normality here.

H0: There is no difference among the group means.

H1: The group means are not all equal.

Use an alpha of .05 and provide the test statistic and p level here

α = 0.05

Test statistic of water to training

$(7 - 5.4)^2 = 2.56$

Test statistic of air to soil

$(8.9 - 9.1)^2 = 0.04$
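
Squared differences between pairs of means are one ingredient of the analysis; a one-way ANOVA goes further and weighs the spread between the group means against the variance within the groups. The sketch below is a minimal illustration of that F statistic computed from the Sum, Count, and Sample Variance rows of the descriptive statistics tables. It is an assumption of this example that the four groups are treated as a single one-way design, and the variable names are invented here.

```python
# Group summaries from the descriptive statistics tables:
# (sum, count, sample variance).
groups = {
    "Air": (178, 20, 9.357895),
    "Soil": (182, 20, 3.042105),
    "Water": (140, 20, 6.631579),
    "Training": (108, 20, 1.410526),
}

total_n = sum(n for _, n, _ in groups.values())
grand_mean = sum(s for s, _, _ in groups.values()) / total_n

# Between-group and within-group sums of squares.
ss_between = sum(n * (s / n - grand_mean) ** 2 for s, n, _ in groups.values())
ss_within = sum((n - 1) * var for _, n, var in groups.values())

df_between = len(groups) - 1
df_within = total_n - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(df_between, df_within, round(f_stat, 2))
```

The resulting F value would be compared with the critical value for the printed degrees of freedom from an F table at the chosen alpha of .05.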

Accept or reject the null hypothesis here.

Accept

Measurement Scale

Ratio

Measure of Central Tendency

Mean

Evaluation

The means are not equal, as they are based on different sets of data.

Assumptions for parametric testing

Based on the data provided, the assumptions that apply are normality, equal variance, and independent errors. In the data, the variables interact without restriction. The parametric assumptions in this scenario relate to the parameters of the population distribution from which the data are drawn, whereas a non-parametric test makes no such assumptions. The requirements are therefore a normal distribution, homogeneity of variances across the multiple groups, and linearity of the independent relationships. Thus, the assumptions determine the type of variance test that is appropriate.
