Regression Analysis

Cedric Alikusumah

ECON 217

March 15, 2017

Regression Analysis Assignment

X1      X2      X3      X4      X5      X6
-       78      284     9.1     109     -
9.3     68      433     8.7     144     -
7.5     70      739     7.2     113     -
8.9     96      1792    8.9     97      -
10.2    74      477     8.3     206     -
8.3     111     362     10.9    124     -
8.8     77      671     10      152     -
8.8     168     636     9.1     162     -
10.7    82      329     8.7     150     -
11.7    89      634     7.6     134     -
8.5     149     631     10.8    292     -
8.3     60      257     9.5     108     -
8.2     96      284     8.8     111     -
7.9     83      603     9.5     182     -
10.3    130     686     8.7     129     -
7.4     145     345     11.2    158     -
9.6     112     1357    9.7     186     -
9.3     131     544     9.6     177     -
10.6    80      205     9.1     127     -
9.7     130     1264    9.2     179     -
11.6    140     688     8.3     80      -
8.1     154     354     8.4     103     -
9.8     118     1632    9.4     101     -
7.4     94      348     9.8     117     -
9.4     119     370     10.4    88      -
11.2    153     648     9.9     78      -
9.1     116     366     9.2     102     -
10.5    97      540     10.3    95      -
11.9    176     680     8.9     80      -
8.4     75      345     9.6     92      -
-       134     525     10.3    126     -
9.8     161     870     10.4    108     -
9.8     111     669     9.7     77      -
10.8    114     452     9.6     60      -
10.1    142     430     10.7    71      -
10.9    238     822     10.3    86      -
9.2     78      190     10.7    93      -
8.3     196     867     9.6     106     -
7.3     125     969     10.5    162     -
9.4     82      499     7.7     95      -
9.4     125     925     10.2    91      -
9.8     129     353     9.9     52      -
3.6     84      288     8.4     110     -
8.4     183     718     10.4    69      -
10.8    119     540     9.2     57      -
10.1    180     668     13      106     -
-       82      347     8.8     40      -
10      71      345     9.2     50      -
11.3    118     463     7.8     35      -
11.3    121     728     8.2     86      -
12.8    68      383     7.4     57      -
10      112     316     10.4    57      -
6.7     109     388     8.9     94      -

The data (X1, X2, X3, X4, X5) are by city.

X1 = death rate per 1000 residents

X2 = doctor availability per 100,000 residents

X3 = hospital availability per 100,000 residents

X4 = annual per capita income in thousands of dollars

X5 = population density in people per square mile

Reference: Life In America's Small Cities, by G.S. Thomas

X6 = gender, where 0 = male and 1 = female


  • Briefly, we are going to look at the relationship between X1 (death rate per 1000 residents) and X2 (doctor availability per 100,000 residents), X3 (hospital availability per 100,000 residents), X4 (annual per capita income in thousands of dollars), X5 (population density in people per square mile) and X6 (gender).

  • The death rate per 1000 residents (X1) is chosen as the response variable because the other variables (X2, X3, X4, X5 and X6) are expected to explain it.

  • The general model is: X1 = a + bX2 + cX3 + dX4 + eX5 + fX6, where a, b, c, d, e and f are constants to be estimated.

  • I expect the death rate per 1000 residents to increase as doctor and hospital availability per 100,000 residents decrease, as population density per square mile decreases, and as annual per capita income in thousands of dollars decreases.

  • The estimated model is: X1 = 12.4472698 + 0.006859042*X2 + 0.000695302*X3 - 0.336405998*X4 - 0.009687813*X5 - 0.23263965*X6.

  • The coefficient of determination (R square) is 0.148132225. It measures the proportion of the total variation in X1 that is explained by its linear relationship with the independent variables: 14.81% of the total variation in the death rate is explained by the model. If X6 (the dummy variable) is not considered, the coefficient of determination remains 0.148132225.

  • The adjusted R square is 0.057507994. It is lower than R square because it penalizes the number of independent variables in the model.

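  • The F-test checks the overall validity of the model, that is, whether the independent variables jointly explain the death rate. With R square = 0.148132225, 53 cities and 5 independent variables, F = (0.148132225/5) / (0.851867775/47) ≈ 1.63, which is below the 5% critical value of about 2.41 for 5 and 47 degrees of freedom, so the model as a whole is not statistically significant at the 5% level.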

  • The intercept, 12.4472698, is the point where the line of best fit crosses the y-axis, i.e. the predicted death rate per 1000 residents when all independent variables are zero.

  • The X2 coefficient is 0.006859042, which means that for each one-unit increase in doctor availability per 100,000 residents the death rate per 1000 residents increases by 0.006859042, holding the other variables constant. The same applies to the X3 coefficient, which is also positive: for each one-unit increase in hospital availability per 100,000 residents the death rate per 1000 residents increases by 0.000695302.

  • X4, X5 and X6 have negative coefficients, which means that for each one-unit increase in X4, X5 or X6 the death rate per 1000 residents decreases by the absolute value of the corresponding coefficient, holding the other variables constant.

  • The p-value is used to check whether individual variables are significant: if the p-value is greater than 0.05, the variable is not significant at the 5% level. X5 and X6 are therefore not significant, as their individual p-values are greater than 0.05.
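
The adjusted R square, the F statistic and a fitted value can all be recomputed from the figures reported above. The following is a minimal Python sketch, not the method used to produce the results in this assignment. It assumes n = 53 cities (the number of rows in the data listing) and k = 5 independent variables; the function names and the example city are hypothetical, and the X3 coefficient is taken as 0.000695302, the value quoted in the interpretation of the coefficients.

```python
# Figures reported in the assignment text.
R_SQUARED = 0.148132225
INTERCEPT = 12.4472698
COEFFS = {
    "X2": 0.006859042,   # doctor availability per 100,000 residents
    "X3": 0.000695302,   # hospital availability (value from the interpretation bullet)
    "X4": -0.336405998,  # annual per capita income, thousands of dollars
    "X5": -0.009687813,  # population density, people per square mile
    "X6": -0.23263965,   # gender dummy (0 = male, 1 = female)
}

# Assumptions, not stated explicitly in the text:
N_OBS = 53        # number of cities in the data listing
N_PREDICTORS = 5  # X2, X3, X4, X5, X6


def adjusted_r_squared(r2, n, k):
    """Adjusted R square: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F statistic written in terms of R square."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


def predict_death_rate(values):
    """Fitted X1 for a dict holding values of X2..X6."""
    return INTERCEPT + sum(COEFFS[name] * values[name] for name in COEFFS)


# Hypothetical city, for illustration only (not a row of the data set).
example_city = {"X2": 100, "X3": 500, "X4": 9.5, "X5": 100, "X6": 0}

print("Adjusted R square:", adjusted_r_squared(R_SQUARED, N_OBS, N_PREDICTORS))
print("F statistic:", f_statistic(R_SQUARED, N_OBS, N_PREDICTORS))
print("Fitted death rate:", predict_death_rate(example_city))
```

Under these assumptions the adjusted R square comes out at about 0.0575, which matches the reported value, and the F statistic can be compared with a critical value from an F table with 5 and 47 degrees of freedom.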