Purdue STAT350

STAT 350: Dataset Description

The dataset combines demographic, crime, and education data at the county level in the United States. It can be used to study the relationships among location in the US, demographics, education, and crime. Most states, though not all, are included, along with D.C. The data were formed as a random sample of US data, so not every county is included.

| Variable Name | Dataset Column | Description |
| --- | --- | --- |
| State | 1 | State the county belongs to. The data contain 47 states plus D.C. |
| Region | 2 | Region the state belongs to: South (SO), Northeast (NE), North Central (NC), or West (WE), as defined by the U.S. Department of Commerce. |
| CountyIndex | 3 | Numbers the counties within each state. For each state, the values run from 1 to the number of that state's counties included in the data. The index has no particular meaning, but it makes results easier to communicate. For example: “County x in Indiana constitutes an outlier.” |
| UrbanIndicator | 4 | Indicates whether a county is urban: 1 if the county is urban, 0 if it is rural. |
| Population | 5 | Population living in the county. |
| LandArea | 6 | Land area of the county, in square miles. |
| PopulationDensity | 7 | Population per unit of land area. |
| PercentMaleDivorce | 8 | Percentage of males who are divorced. |
| PercentFemaleDivorce | 9 | Percentage of females who are divorced. |
| MedianIncome | 10 | Median household income of the county at the time of data collection, in dollars. Recorded as an integer. |
| IncomeCategory | 11 | Indicates whether the county's median income, in dollars, falls in the range [12701, 25400], [25401, 38100], [38101, 50800], or [50801, ∞). These categories cover all the data because MedianIncome is recorded as an integer (for example, a value such as 25400.5 cannot occur). |
| PercentCollegeGraduates | 12 | Percentage of people aged 25 and over who have graduated from college. |
| MedianHouseAge | 13 | Median age of housing units. |
| RobberiesPerPopulation | 14 | Number of robberies per 100,000 people. |
| AssaultsPerPopulation | 15 | Number of assaults per 100,000 people. |
| BurglariesPerPopulation | 16 | Number of burglaries per 100,000 people. |
| LarceniesPerPopulation | 17 | Number of larcenies per 100,000 people. |
| EducationSpending | 18 | Average amount spent on each pupil in the county, in dollars. |
| EducationSpendingP2 | 19 | Average amount spent on each pupil in the county, measured at a later time than EducationSpending (P2 stands for period 2). |
| TestScore | 20 | Average test score (participation-adjusted) on a college admission exam. |
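Several of the variables above are derived quantities, and a short sketch may make their definitions concrete. The Python below is illustrative only and is not the procedure used to build the dataset. The function names are hypothetical, the raw crime count passed to `per_100k` is not a column in the data, the density formula assumes PopulationDensity is Population divided by LandArea, and the last income range is assumed to start at 50801 so that the four ranges do not overlap.

```python
from bisect import bisect_left

# Upper bounds of the first three IncomeCategory ranges (in dollars).
UPPER_BOUNDS = [25400, 38100, 50800]

# Labels for the four ranges. The last one is assumed to start at 50801
# so that it does not overlap the third range.
LABELS = [
    "[12701, 25400]",
    "[25401, 38100]",
    "[38101, 50800]",
    "[50801, inf)",
]


def income_category(median_income: int) -> str:
    """Return the IncomeCategory range containing an integer median income."""
    if median_income < 12701:
        raise ValueError("median income is below the lowest listed range")
    # bisect_left gives the index of the first upper bound that is
    # greater than or equal to median_income, which is the range index.
    return LABELS[bisect_left(UPPER_BOUNDS, median_income)]


def per_100k(count: int, population: int) -> float:
    """Convert a raw count (hypothetical input) to a rate per 100,000 people."""
    return count * 100_000 / population


def population_density(population: int, land_area_sq_miles: float) -> float:
    """People per square mile, assuming density = Population / LandArea."""
    return population / land_area_sq_miles


if __name__ == "__main__":
    print(income_category(25400))          # boundary value stays in the first range
    print(income_category(25401))          # next integer moves to the second range
    print(per_100k(5, 50_000))             # 5 events among 50,000 people
    print(population_density(1000, 50.0))  # 1,000 people on 50 square miles
```

Because MedianIncome is an integer, checking the boundary values (25400 versus 25401, and so on) is enough to confirm that every value at or above 12701 falls into exactly one range.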

Disclaimer: the original datasets were modified to suit the purposes of STAT 350. Take the results of analysis this semester with a grain of salt.