Final Project: Statistical Tools and Data Analysis
Analysis Plan for eSoma Case Scenario
Task | Statistical Tool(s) | Method(s) | Notes |
Organize “Department” by “Gender”, assuming that in your data 1 stands for female and 0 for male. Provide statistical evidence to support your conclusion. | Pivot table or the Excel function “COUNTIF” | Create a contingency table | Because both variables are categorical, the analysis amounts to counting the observations in each category. Present your results in tabular form. |
Determine whether there is a relationship between Gender (being a man or a woman) and Department (the department someone works in). Provide statistical evidence to support your conclusion. | Suggest a statistical tool to use in this analysis | Suggest a statistical method | You are to determine the appropriate statistical tool and method for this task yourself, to show how well you have mastered the course. All other tasks except one include statistical tools and methods with brief notes. |
Estimate the average “Salary” of all midlevel supervisors in this company, assuming that the data provided are only a sample. | t-test statistic | Confidence interval method | Assume that the population distribution is unknown, i.e., the population is larger than the 46 observations in the given data. Estimate the average salary for the population. Use a 95% confidence level in your estimate. |
Estimate the difference between the average salaries of males and females. Provide statistical evidence to support your conclusion. | t-test | Confidence interval method | Compare the average salaries of males and females in the sample data and use the difference as the point estimate of the difference in the overall population. Use a 95% confidence level in your estimate. |
Find the proportion of women to men. | Proportion formula | Use the formula to calculate the proportion | |
Estimate the proportion of women to men in the overall population. Provide statistical evidence to support your conclusion. | z-test | Confidence interval | Your sample proportion is your point estimate. Use it to estimate the population proportion. Use a 95% confidence level in your estimate. Do you remember why we are using a z-test and not a t-test? Refer to the instructor’s notes. |
Find the correlation between Salary (dependent variable) and each of the other variables (independent variables). Provide statistical evidence to support your conclusion. | t-test for correlation, p-value | Correlation method (use the Excel function or the Data Analysis add-in) | You can use the correlation function in Excel, i.e., =CORREL, or Data Analysis, which you can activate from Add-ins. Once you have found the sample correlation, test it for the overall population to provide evidence of whether or not the correlation is statistically significant. You can do this with the t-test for correlation, in which you compare the calculated t statistic with the critical t value (see the instructor’s notes) and the p-value with the significance level. Remember to start by stating the hypotheses, and conclude by rejecting or failing to reject the null hypothesis. |
Run a regression analysis using Salary as the dependent (explained) variable and all other variables as independent (explanatory) variables. | Correlation (Multiple R), R-squared, Adjusted R-squared, t-test, p-values, etc. | Regression analysis | First, provide a theory (and specify the expected sign of each variable), state the hypotheses, and write down the regression model. Then run the regression analysis. Afterwards, use the regression output to validate your model and each individual independent variable. Omit the less significant variables and re-run the regression using only the significant variables. Explain your equation. Remember that you can explain the effect of one variable on the dependent variable while holding the other variables constant. |
Find out whether people (men and women) working in different (levels of) Department receive different (levels of) Salary. Provide statistical evidence to support your conclusion. | Suggest a statistical tool | Suggest a method of analysis | You are to determine the appropriate statistical tool and method for this task yourself, to show how well you have mastered the course. All other tasks (except one) include statistical tools and methods with brief notes. |
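
The arithmetic behind the proportion and correlation tasks in the table can also be sketched outside Excel. The Python sketch below is only an illustration, not the method the course prescribes (the tasks call for Excel). It assumes that "proportion of women" means the share of women in the sample, uses the normal-approximation (z) interval for a proportion, and tests a correlation with t = r * sqrt(n - 2) / sqrt(1 - r^2) on n - 2 degrees of freedom. The function names and every number in the usage lines are hypothetical and are not taken from the eSoma data.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n  # sample proportion = point estimate
    # Two-sided critical z value (about 1.96 for a 95% confidence level)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


def correlation_t_stat(x, y):
    """Pearson r and the t statistic for H0: rho = 0 (df = n - 2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / sqrt(sxx * syy)
    t = r * sqrt(n - 2) / sqrt(1 - r ** 2)
    return r, t, n - 2


# Hypothetical inputs, for illustration only
print(proportion_ci(successes=20, n=46))
print(correlation_t_stat([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
```

The t statistic returned by the second function still has to be compared with the critical t value for n - 2 degrees of freedom, taken from a t table or from Excel, before rejecting or failing to reject the null hypothesis.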