Business Intelligence – Data Analysis of US Data Source From the list of large datasets select a data containing data on United States subjects. You are to define a business problem this dataset can a

United States Data

Business Problem

A clothing company in the United States plans to establish its business in Boston. Successful organizations provide products that satisfy the needs and preferences of their target market. Boston comprises different demographic groups based on age, level of education, gender, sexual orientation, occupation, and individual or household income, among other distinguishing factors. Therefore, to succeed when it establishes its business in this city, the clothing company needs to offer products and services that align with the needs of these demographic groups. As the investment analyst for the clothing company, I was tasked with examining the differences in income across four demographic dimensions (gender, age, race, and household type) to provide the information that the organization will assess before investing in Boston.

Variables in the SAS dataset

The main grouping variable in the company's dataset was gender, with two categories: male and female. The clothing company analyzed whether income differs significantly between these two groups. The sample comprised 128 respondents aged 25 to 65, split equally between the two groups: 64 males and 64 females.

Business Questions

The following four business questions guide the analysis of Boston's demographic factors before the clothing company establishes its business there.

  1. Should the clothing company target more male or female customers?

  2. Should the clothing company target the young population more than the elderly?

  3. Should the clothing company target all races or a specific race?

  4. Should the clothing company target more family households than non-family households?

Each question is crucial to the analysis used to address the above business problem. The first business question examines the relationship between gender and income to determine which gender earns more. This information will help the clothing company decide which gender category to target more with its products. In the same way, the second question studies the income of different age groups so that the company can provide more products for the age group with the highest income. Similarly, the third question identifies which racial group earns the most income. Lastly, the fourth question checks whether income in family households differs from income in non-family households to determine which group the business should focus on. These results will help the clothing company customize its products to satisfy the needs and preferences of the demographic groups that earn the most income.

Hypothesis

Business Question 1: Should the clothing company target more male or female customers?

Null Hypothesis: There will be no significant income differences between males and females residing in Boston.

Alternative Hypothesis: There will be significant income differences between the males and females residing in Boston.

Business Question 2: Should the clothing company target the young population more than the elderly?

Null Hypothesis: There will be no significant income differences between the young and elderly populations residing in Boston.

Alternative Hypothesis: There will be significant income differences between the young and elderly populations residing in Boston.

Business Question 3: Should the clothing company target all races or a specific race?

Null Hypothesis: There will be no significant income differences between racial groups in Boston.

Alternative Hypothesis: There will be significant income differences between racial groups in Boston.

Business Question 4: Should the clothing company target more family households than non-family households?

Null Hypothesis: There will be no significant income differences between family households and non-family households.

Alternative Hypothesis: There will be significant income differences between family households and non-family households.

Statistical Tests

The statistical tests used to analyze the business problem are the median test, the chi-square test, and the t-test.

Business Question 1: Should the clothing company target more male or female customers?

The statistical test that will be used to answer this business question and either support or reject the null hypothesis is the median test. This test compares the medians of two groups to establish whether they differ (Stephenson, 2016). Therefore, the median test is appropriate for this business question because it will compare the median income of male and female customers to establish whether they differ significantly.
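To make the mechanics of this test concrete, the sketch below shows one way Mood's median test could be computed for two groups. It is a minimal illustration and not the project's actual SAS procedure: the function name `moods_median_test` and the income figures are hypothetical, values equal to the grand median are counted as "not above" it (one of several possible conventions), and no continuity correction is applied.

```python
import math
import statistics


def moods_median_test(group_a, group_b):
    """Mood's median test for two groups, without continuity correction."""
    grand_median = statistics.median(list(group_a) + list(group_b))
    # 2x2 table: one row per group, columns = (above grand median, at or below it).
    table = []
    for group in (group_a, group_b):
        above = sum(1 for value in group if value > grand_median)
        table.append([above, len(group) - above])
    total = sum(sum(row) for row in table)
    column_totals = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for row in table:
        for j in range(2):
            expected = sum(row) * column_totals[j] / total
            if expected > 0:  # skip empty columns to avoid dividing by zero
                chi2 += (row[j] - expected) ** 2 / expected
    # For 1 degree of freedom the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, table


# Hypothetical annual incomes in thousands of dollars (not taken from the dataset).
male_income = [52, 61, 48, 75, 66, 58, 80, 70]
female_income = [45, 50, 39, 62, 55, 47, 68, 41]

chi2, p_value, table = moods_median_test(male_income, female_income)
print(f"table={table}, chi2={chi2:.2f}, p={p_value:.4f}")
```

A p-value below the chosen significance level (commonly 0.05) would lead to rejecting the null hypothesis that the two groups share the same median income.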

Business Question 2: Should the clothing company target the young population more than the elderly?

The statistical test that will be used to answer this business question and either support or reject the null hypothesis is the chi-square test. This test assesses whether two categorical variables are independent of each other (Hayes, 2021). To apply it here, income will be grouped into brackets so that both variables are categorical. The test is appropriate for this business question because the chi-square statistic will indicate whether the distribution of income differs between the young population and the elderly.
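The calculation behind the chi-square test can be sketched as follows. This is an illustrative example under stated assumptions: the helper `chi_square_independence`, the three income brackets, and the counts are all hypothetical and are not drawn from the project's dataset. The p-value shortcut at the end is valid only for a table with exactly 2 degrees of freedom, such as the 2 x 3 table used here.

```python
import math


def chi_square_independence(observed):
    """Return the chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if age group and income bracket were independent.
            expected = row_totals[i] * column_totals[j] / total
            statistic += (count - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


# Hypothetical counts per income bracket (low, middle, high), 64 respondents per row.
age_by_income = [
    [24, 28, 12],  # younger respondents
    [14, 26, 24],  # older respondents
]

statistic, dof = chi_square_independence(age_by_income)
# With exactly 2 degrees of freedom the upper-tail probability is exp(-x / 2).
age_p_value = math.exp(-statistic / 2)
print(f"chi2={statistic:.3f}, dof={dof}, p={age_p_value:.4f}")
```

For tables with other dimensions, the p-value would need to come from a general chi-square distribution function in a statistics package.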

Business Question 3: Should the clothing company target all races or a specific race?

The statistical test that will be used to answer this business question and either support or reject the null hypothesis is the t-test. Hayes (2021) highlighted that t-tests are used to assess whether there is a significant difference between the means of two groups that may be related in certain features. Because a t-test compares two groups at a time, racial groups will be compared in pairs to establish whether there is a significant income difference between racial groups in Boston.
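As a rough illustration of how the t statistic is obtained for one pair of groups, the sketch below uses Welch's unequal-variance form of the two-sample t-test. Choosing Welch's variant is an assumption made for this example; the text does not specify which form of the t-test the project uses. The function name `welch_t` and the income values are hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = statistics.variance(sample_a) / len(sample_a)
    se2_b = statistics.variance(sample_b) / len(sample_b)
    t_stat = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, dof


# Hypothetical annual incomes in thousands of dollars for two racial groups.
group_one_income = [52, 61, 48, 75, 66, 58, 80, 70]
group_two_income = [45, 50, 39, 62, 55, 47, 68, 41]

t_stat, t_dof = welch_t(group_one_income, group_two_income)
print(f"t={t_stat:.3f}, dof={t_dof:.1f}")
```

The statistic would then be compared with a t distribution with the computed degrees of freedom to obtain a p-value. In practice a library routine such as `scipy.stats.ttest_ind` with `equal_var=False` performs the same calculation and returns the p-value directly.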

Business Question 4: Should the clothing company target more family households than non-family households?

The statistical test that will be used to answer this business question and either support or reject the null hypothesis is the median test. This test will compare the median income of family households and non-family households to assess whether they differ significantly.

Visualizations

The visualizations that I intend to use are graphs, tables, charts, and scatter plots. Graphical visualizations will allow the results to be analyzed at a glance, helping the clothing company make informed decisions about establishing its business in Boston. Tables will summarize the mean income of each category in the four demographic groups. Charts and scatter plots will present the information recorded in the tables to allow a more straightforward interpretation of the data.
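A summary table of mean income per category could be assembled along the following lines. This is only a sketch: the record layout, the field names `gender`, `household`, and `income`, and the values are hypothetical stand-ins for the real dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical respondent records; income is in thousands of dollars.
respondents = [
    {"gender": "Male", "household": "Family", "income": 62},
    {"gender": "Male", "household": "Non-family", "income": 50},
    {"gender": "Male", "household": "Family", "income": 68},
    {"gender": "Female", "household": "Family", "income": 58},
    {"gender": "Female", "household": "Non-family", "income": 46},
    {"gender": "Female", "household": "Non-family", "income": 46},
]


def mean_income_by(records, key):
    """Group records by the given field and return the mean income per category."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["income"])
    return {category: mean(values) for category, values in sorted(groups.items())}


gender_summary = mean_income_by(respondents, "gender")
household_summary = mean_income_by(respondents, "household")

# Print a plain-text table for one demographic dimension.
print(f"{'Gender':<12}{'Mean income':>12}")
for category, average in gender_summary.items():
    print(f"{category:<12}{average:>12.1f}")
```

The same helper can be reused for age group or race by passing a different field name, and its output can feed a bar chart in whichever plotting tool the project adopts.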

Concerns

The main concern in completing this portfolio project is the possibility of statistical errors because the sample used in the analysis is small. In hypothesis testing, the risk of Type I and Type II errors cannot be eliminated entirely. These errors might arise due to errors of omission and errors of commission. A Type I error occurs when the results of a statistical test lead us to reject a null hypothesis that is actually true. A Type II error occurs when the results lead us to fail to reject a null hypothesis that is actually false (Bhandari, 2021). At a fixed significance level, a larger sample lowers the risk of a Type II error, so the sample size used in the tests should be increased to minimize such errors.
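The link between sample size and Type II error can be illustrated with a short calculation. The sketch below uses a normal (z-test) approximation for a two-sided comparison of two group means. The function name `type_ii_error_rate`, the assumed true income gap of 5 (thousand dollars), and the assumed standard deviation of 12 are illustrative choices, not values from the project's data.

```python
import math
from statistics import NormalDist


def type_ii_error_rate(effect, sd, n_per_group, alpha=0.05):
    """Approximate Type II error rate of a two-sided, two-sample z-test."""
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1 - alpha / 2)
    # True difference in means expressed in standard-error units.
    shift = effect / (sd * math.sqrt(2 / n_per_group))
    # Power = probability that the test statistic falls in either rejection region.
    power = (1 - standard_normal.cdf(z_crit - shift)) + standard_normal.cdf(-z_crit - shift)
    return 1 - power


# Assumed true gap of 5 and standard deviation of 12 (both hypothetical).
beta_by_n = {n: type_ii_error_rate(effect=5, sd=12, n_per_group=n) for n in (16, 64, 256)}
for n, beta in beta_by_n.items():
    print(f"n per group = {n:>3}: Type II error rate = {beta:.3f}")
```

Under these assumptions the Type II error rate falls as the group size grows, while the Type I error rate stays fixed at the chosen significance level `alpha`.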

References

Bhandari, P. (2021). Type I & Type II Errors | Differences, Examples, Visualizations. https://www.scribbr.com/statistics/type-i-and-type-ii-errors/

Hayes, A. (2021). Chi-Square (χ2) Statistic. https://www.investopedia.com/terms/c/chi-square-statistic.asp

Hayes, A. (2021). T-Test. https://www.investopedia.com/terms/t/t-test.asp

Stephenson, G. (2016). Mood's Median Test: Definition, Run the Test, and Interpret Results. https://www.statisticshowto.com/moods-median-test/