Purdue STAT350
STAT 350: Dataset Description
The dataset combines demographic, crime, and education data at the county level in the United States. It can be used to study the relationships among location in the US, demographics, education, and crime. Most, but not all, states are included, as well as D.C. Note that the dataset is a random sample of US counties, so it does not include every county.
Variable Name | Dataset Column | Description |
State | 1 | State the county belongs to. The data contain 47 states plus D.C. |
Region | 2 | Region the state belongs to: South (SO), Northeast (NE), North Central (NC), or West (WE), per the U.S. Department of Commerce. |
CountyIndex | 3 | Assigns a number to each county within a state. For each state, the values run from 1 to the number of that state's counties included in the data. The index has no particular meaning, but it lets you refer to a specific county when reporting results, for example: “County x in Indiana constitutes an outlier.” |
UrbanIndicator | 4 | Indicates whether a county is urban: 1 if the county is urban, 0 if it is rural. |
Population | 5 | Population living in the county. |
LandArea | 6 | Land area of the county, in square miles. |
PopulationDensity | 7 | Population per unit of land area (people per square mile, given the units of LandArea). |
PercentMaleDivorce | 8 | Percentage of males who are divorced. |
PercentFemaleDivorce | 9 | Percentage of females who are divorced. |
MedianIncome | 10 | Median household income of the county at the time of data collection, in dollars. Recorded as an integer. |
IncomeCategory | 11 | Indicates which range the county's median income, in dollars, falls in: [12701, 25400], [25401, 38100], [38101, 50800], or [50801, ∞). These ranges capture all the data because MedianIncome is recorded as an integer (for example, no value such as 25400.5 can fall between two ranges). |
PercentCollegeGraduates | 12 | Percentage of people 25 and over who have graduated from college. |
MedianHouseAge | 13 | Median age of housing units. |
RobberiesPerPopulation | 14 | Number of robberies per 100,000 people. |
AssaultsPerPopulation | 15 | Number of assaults per 100,000 people. |
BurglariesPerPopulation | 16 | Number of burglaries per 100,000 people. |
LarceniesPerPopulation | 17 | Number of larcenies per 100,000 people. |
EducationSpending | 18 | Average amount spent on each pupil in the county, in dollars. |
EducationSpendingP2 | 19 | Average amount spent on each pupil in the county, measured at a later time than EducationSpending (P2 for period 2). |
TestScore | 20 | Average test score (participation-adjusted) on a college admission exam. |
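Several columns are derived from others, and a short sketch may make the definitions concrete. The Python below is only an illustration under stated assumptions, not the procedure used to build the dataset: the function names are hypothetical, the category labels are made up (the table does not say how IncomeCategory is coded), and a median income of exactly 50800 is assigned to the third range because that closed range includes it.

```python
# Illustrative helpers only; names and labels are assumptions, not part of the dataset.

def income_category(median_income: int) -> str:
    """Return a label for the income range containing an integer median income."""
    # Closed ranges listed in the IncomeCategory description.
    ranges = [(12701, 25400), (25401, 38100), (38101, 50800)]
    if median_income < ranges[0][0]:
        # Assumption: the description lists no range below 12701.
        raise ValueError("median income is below the lowest listed range")
    for low, high in ranges:
        if low <= median_income <= high:
            return f"[{low}, {high}]"
    # Anything larger than 50800 falls in the open-ended top range.
    return "above 50800"


def population_density(population: int, land_area_sq_mi: float) -> float:
    """People per square mile: Population divided by LandArea."""
    return population / land_area_sq_mi


def rate_per_100k(count: int, population: int) -> float:
    """Convert a raw count of incidents into a rate per 100,000 people."""
    return count * 100_000 / population
```

For instance, `income_category(25400)` falls in the first range while `income_category(25401)` falls in the second, which is why integer incomes leave no gaps between adjacent ranges.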
Disclaimer: the original datasets were modified to suit the purposes of STAT 350. Take the results of analysis this semester with a grain of salt.