**Problem 1** The article "Multiple Linear Regression for Lake Ice and Lake Temperature Characteristics" by Gao & Stefan (1999) presents data on the maximum ice thickness in mm (Thickness), average number of days per year of ice cover (DaysIceCover), average number of days the low temperature is below 8 degrees centigrade (DaysLessThan8), and the average snow depth in mm (AvgSnowdepth) for 13 lakes in Minnesota. The data are in the file __iceThickness.csv__.

1. Construct a scatterplot matrix for the variables in this data. Describe and discuss the four variables of the dataset and any relationships you suspect from the plots.
2. Fit a multiple regression model where Thickness is the response variable and the other three variables are predictors.
3. Assess the underlying assumptions for the model you fit in part 2.
4. Does the model (collectively) significantly predict the thickness of the ice? (Justify by referencing specific output.)
5. What percentage of the variability in ice thickness is explained by the model? (Reference specific output.)
6. Answer the question: Do lakes with greater average snow depth tend to have greater or lesser maximum ice thickness? In addressing this question, please explain your rationale and any limitations or caveats in your response.
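The fitting, overall-significance, and percent-explained tasks above all come from the same least-squares quantities. The following is a minimal sketch in plain Python, not a prescribed solution: the helper names `solve` and `fit_ols` are hypothetical, and the data are synthetic values constructed so the true coefficients are known, not the measurements in iceThickness.csv. In practice a statistics package would report these numbers in its model summary.

```python
from itertools import product


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_ols(predictors, response):
    """Least-squares fit of response on predictors (an intercept is added).

    predictors: one row of predictor values per observation.
    Returns (coefficients with intercept first, R-squared, overall F statistic).
    """
    X = [[1.0] + list(row) for row in predictors]
    n, k = len(X), len(X[0])  # k = number of predictors + 1
    # Normal equations: (X'X) beta = X'y
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    Xty = [sum(X[i][a] * response[i] for i in range(n)) for a in range(k)]
    beta = solve(XtX, Xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    mean_y = sum(response) / n
    sse = sum((y - f) ** 2 for y, f in zip(response, fitted))  # unexplained
    sst = sum((y - mean_y) ** 2 for y in response)             # total
    r_squared = 1.0 - sse / sst
    p = k - 1
    f_stat = ((sst - sse) / p) / (sse / (n - p - 1))
    return beta, r_squared, f_stat


# Synthetic data, NOT the lake measurements: three predictors coded -1/+1,
# response built from known coefficients (2, 3, -1, 0.5) plus a small disturbance.
rows = [list(r) for r in product([-1.0, 1.0], repeat=3)]
y = [2.0 + 3.0 * a - 1.0 * b + 0.5 * c + 0.5 * a * b * c for a, b, c in rows]

beta, r_squared, f_stat = fit_ols(rows, y)
print("coefficients:", [round(v, 3) for v in beta])
print("R-squared:", round(r_squared, 4))
print("overall F statistic:", round(f_stat, 2))
```

For the lake data the same quantities would be read with 13 observations and 3 predictors, so the overall F statistic is compared against an F distribution with 3 and 9 degrees of freedom, and R-squared times 100 gives the percentage of variability explained.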

**Problem 2** The SAT is a standardized college entrance exam taken by many high school students across the United States. In certain regions, the SAT is a popular exam used in college admissions, while in other regions the ACT or other metrics are used instead. Thus, the percentage of students taking the SAT varies greatly by state and region. The file __stateSATscores.csv__ contains average SAT scores for each of the 50 states in the USA for the year 1997 along with several other variables (**State** name; **Expenditure** - expenditure per pupil in average daily attendance in public elementary and secondary schools; **PT.Ratio** - the average pupil to teacher ratio in public schools; **Salary** - the estimated average salary of public school teachers in the state; **PercentSAT** - the percentage of students electing to take the SAT exam; **Verbal** - the average Verbal composite score; **Math** - the average Mathematics composite score; and **SAT** - the average composite SAT score).

Perform the following with this data:

1. Construct a scatterplot where the x-axis is the percentage taking the SAT exam and the y-axis is the average composite SAT score. Describe the relationship you see.
2. Fit a simple linear regression model where the response is the composite SAT score and the predictor is the percentage of students taking the SAT. Construct residual diagnostic plots of that fit. What do you notice in the Residuals vs Fitted plot?
3. Add a new variable to the dataset that is the square root of the percentage of students taking the SAT exam.
4. Fit a multiple regression model where the response is the composite SAT score with two predictors: the percentage of students taking the exam and the square root of that percentage (the variable you created in part 3). Construct residual diagnostic plots of that fit. What do you notice in the Residuals vs Fitted plot?
5. Determine if the fitted model in part 4 is a significant model to predict composite SAT scores. (Reference specific output.)
6. What percentage of the variability in composite SAT scores is explained by the model you fit in part 4? (Reference specific output.)
7. We are interested in determining if student expenditures, pupil-teacher ratios, and teacher salary influence SAT scores when accounting for the percentage who take the SAT. Fit a multiple regression model where the predictor variables include the percentage taking the SAT, the square root of that percentage (created in part 3), the expenditures, pupil-teacher ratio, and teacher salary. Compare and contrast this fitted model with the one in part 4. Do you feel that expenditures, pupil-teacher ratios, or teacher salary predict SAT scores once the percentage of students taking the SAT exam is accounted for? Reference specific output from the fitted model to address this question.
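The final task compares the five-predictor model with the two-predictor model from part 4, which is nested inside it. The problem does not name a method for that comparison; one common option is a partial F-test on the two residual sums of squares, sketched below. The function name `partial_f` is hypothetical, and the two SSE values are placeholders for illustration only, not results from stateSATscores.csv. Only the degrees of freedom come from the problem statement (50 states, 2 versus 5 predictors).

```python
def partial_f(sse_reduced, sse_full, n_obs, p_reduced, p_full):
    """F statistic for testing whether the extra predictors in the full model
    improve on the reduced (nested) model.

    sse_*: residual sums of squares of the two fits.
    p_*:   number of predictors in each model, not counting the intercept.
    Returns (F, numerator df, denominator df).
    """
    df_num = p_full - p_reduced
    df_den = n_obs - p_full - 1
    if df_num <= 0 or df_den <= 0:
        raise ValueError("full model must add predictors and leave residual df")
    f_stat = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_stat, df_num, df_den


# Placeholder SSE values (assumed, not computed from the real data).
f_stat, df_num, df_den = partial_f(
    sse_reduced=1200.0, sse_full=1000.0, n_obs=50, p_reduced=2, p_full=5
)
print(f"F = {f_stat:.3f} on ({df_num}, {df_den}) degrees of freedom")
```

A large F relative to the F distribution with 3 and 44 degrees of freedom would suggest that at least one of expenditure, pupil-teacher ratio, or salary adds predictive value beyond the percentage taking the exam. The individual coefficient tests in the full model's output address each predictor separately.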