Quantitative and Qualitative Data Analysis

Introduction

This final lecture for the course will, fittingly, deal with some of the final steps that a researcher takes in the process of designing, carrying out, and analyzing a research study. Some students are anxious about the data analysis step of the process but it is really quite straightforward, especially for the beginning researcher. To set the stage, the researcher has come up with a topic or question to study, decided what method(s) to use to study that topic, designed the study to find the answers to the particular problem or question of interest, carried out the study, and recorded the data in some way. The final step is to analyze the data to find out the answers the research was designed to provide.

Quantitative Data Analysis

While this heading might sound a bit scary, it need not be. The word quantitative simply means having to do with quantity or, to put it another way, numbers. So the researcher's study has produced some numbers: the percentage of the respondents that agree with a certain statement, the number of crimes committed in one month, the number of marriages that end in divorce, the reasons why people like a particular political candidate. Oops, that last one doesn't deal with numbers! Or does it? Even though reasons are not numbers, the number of people who give a certain reason can be counted, so quite often the researcher transforms something that is not a number into a number. The point is that most agencies or organizations that want to know something, or want to understand why something is the way it is, will ultimately want numbers to support or demonstrate the answers the research provides.
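To make the idea of turning reasons into numbers concrete, here is a minimal sketch in Python. The list of responses is invented for illustration and does not come from any real survey.

```python
from collections import Counter

# Hypothetical open-ended answers to "Why do you like this candidate?"
responses = [
    "honesty", "experience", "honesty", "policies",
    "experience", "honesty", "policies", "honesty",
]

# Count how many respondents gave each reason
reason_counts = Counter(responses)

# Convert the counts into percentages of all respondents
total = len(responses)
reason_percentages = {
    reason: 100 * count / total for reason, count in reason_counts.items()
}

for reason, pct in reason_percentages.items():
    print(f"{reason}: {reason_counts[reason]} respondents ({pct:.1f}%)")
```

The reasons themselves are still words, but the counts and percentages attached to them are numbers that can be reported and compared.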

Simple Statistics: Part A

While statistics is a field that can become very complex and overwhelming, that level of data analysis is beyond the scope of this course. This course is simply concerned with understanding why the researcher needs to calculate statistics at the end of most research projects and what those statistics can tell the researcher and the people who asked for the study.

Statistics tell the readers of a research report how much confidence they should put in the results that the study produced. For instance, if a research report only stated that the study revealed that male college students are smarter than female college students, would most readers accept that finding and think that the matter was settled? Or would the reader want to know what evidence the researcher had to back up that statement? To take the example further, what if the report listed as evidence that the average grade point average (GPA) of graduating male students was 3.75 while the average GPA of graduating female students was only 3.74? Would the reader of this research now be convinced that the statement that male students were smarter than female students had been proven?

Without going into too much detail, statisticians have a concept called significance that applies to comparisons like the one described above. Two averages can be compared and shown to be different, but the question is whether or not that difference is significant. A significant difference is one that is very unlikely to have occurred by chance. A difference that is not significant is one that could very easily have occurred by chance. In the case of GPA calculations where the difference is only one one-hundredth of a point, a statistical test of that difference would almost certainly reveal that it is not a significant difference. That is, a handful of borderline grades going the other way, such as a few female students earning an A instead of a B in a single course, could have made the two averages identical or even reversed them. Knowing that one point on an exam can be the difference between an A and a B in some courses, it is not hard to believe that the difference in the average GPA of males and females is not statistically significant.
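For readers who would like to see what "could easily have occurred by chance" looks like in practice, the following Python sketch runs one simple kind of significance check, a permutation test. This particular test is chosen here for illustration only; it is not the only test a researcher might use. The GPA lists are made up so that their averages come out to 3.75 and 3.74, and the function name permutation_p_value is hypothetical.

```python
import random
from statistics import mean

# Made-up GPAs for illustration only; the means are 3.75 and 3.74
male_gpas = [3.25, 3.5, 3.9, 4.0, 3.7, 3.8, 3.6, 4.0, 3.85, 3.9]
female_gpas = [3.3, 3.45, 3.9, 4.0, 3.65, 3.8, 3.6, 4.0, 3.8, 3.9]

observed_diff = mean(male_gpas) - mean(female_gpas)

def permutation_p_value(group_a, group_b, trials=10_000, seed=0):
    """Estimate how often chance alone gives a gap at least as large as the observed one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = group_a + group_b
    at_least_as_large = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # reassign the scores to the two groups at random
        shuffled_a = pooled[:len(group_a)]
        shuffled_b = pooled[len(group_a):]
        # Small tolerance guards against floating-point rounding
        if abs(mean(shuffled_a) - mean(shuffled_b)) >= observed - 1e-9:
            at_least_as_large += 1
    return at_least_as_large / trials

p_value = permutation_p_value(male_gpas, female_gpas)
print(f"Observed difference in means: {observed_diff:.2f}")
print(f"Estimated p-value: {p_value:.3f}")
```

The p-value is the share of random reshufflings that produce a gap at least as large as the one actually observed. A common convention treats a difference as significant only when that share is below 0.05; with a gap of one one-hundredth of a point, the share is far above that threshold.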

Simple Statistics, Part B: Univariate Analysis

The example in the previous paragraph is vastly oversimplified but the underlying point is important. Statistics are used to analyze the data that the researcher collects and are universally accepted as having well-defined meanings. Different types of statistics are used depending on the type of data the researcher has gathered.

Some simple research studies will lead the researcher to analyze only one variable at a time. For instance, the researcher may have recorded the grade point averages of each student in a college and may want to analyze those grades. If analysis of the data on one variable is needed, the process is called univariate analysis. The prefix uni refers to one, as in a unicycle. The most common statistics calculated in univariate analysis fall into two categories: measures of central tendency and measures of dispersion.

Measures of central tendency are statistics that identify the point around which the data tend to cluster, that is, the typical or central value. To use the earlier example of grades in a college course, a measure of central tendency would indicate the grade that is most typical of the class. How well that single grade describes the class depends on whether most students earned about the same grade or whether the grades were scattered all across the range of possible grades. Did 19 of the 20 students receive the grade of A and one student get a B? Or were there 4 As, 4 Bs, 4 Cs, 4 Ds, and 4 Fs? The first scenario would illustrate a high degree of clustering around one grade; the second scenario would show a low degree of clustering around some central point. Some of the common measures of central tendency that you will learn how to use are the mean or average, the median, and the mode.
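The two classroom scenarios above can be worked through with Python's built-in statistics module. This is only a sketch: it assumes the usual 4-point scale (A=4, B=3, C=2, D=1, F=0), which the lecture does not specify, and the function name summarize is hypothetical.

```python
from statistics import mean, median, multimode

# Grade points on the usual 4-point scale (an assumption): A=4, B=3, C=2, D=1, F=0
clustered = [4] * 19 + [3]                                   # 19 As and one B
scattered = [4] * 4 + [3] * 4 + [2] * 4 + [1] * 4 + [0] * 4  # 4 of each grade

def summarize(grades):
    """Return the mean, median, and mode(s) of a list of grade points."""
    return {
        "mean": mean(grades),
        "median": median(grades),
        "modes": multimode(grades),  # every value tied for most frequent
    }

clustered_summary = summarize(clustered)
scattered_summary = summarize(scattered)
print("Clustered class:", clustered_summary)
print("Scattered class:", scattered_summary)
```

In the first class the mean is 3.95 and both the median and the mode are 4, so all three measures point to an A. In the second class the mean and median are both 2 (a C), but the mode is a five-way tie, which is a hint that no single grade is really typical of that class.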

Measures of dispersion are statistics that measure how spread out the data is, which is the other side of the coin from measures of central tendency. In the example in the paragraph above, the second scenario has a high degree of dispersion. The simplest calculation of dispersion is called the range. The range of scores or data that is collected on a single variable is simply the difference between the highest and the lowest score.

Using a different example this time, suppose that one variable on a survey is the number of children that the respondents have had. If the highest number reported was 17 and the lowest number reported was 0, the scores would run from 0 to 17, which would be reported as a range of 17 children. Notice how different this group would be from one that reported a high of two children and a low of zero. This second group has a much smaller range: every respondent is within two children of every other respondent.
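The range calculation is short enough to write out in a few lines of Python. The two lists of survey answers below are invented to match the highs and lows in the example, and the function name value_range is hypothetical.

```python
# Hypothetical survey answers: number of children each respondent has had
large_spread_group = [0, 2, 3, 1, 17, 4]
small_spread_group = [0, 1, 2, 2, 1, 0]

def value_range(scores):
    """Return the range: the highest score minus the lowest score."""
    return max(scores) - min(scores)

print("First group range:", value_range(large_spread_group))   # 17
print("Second group range:", value_range(small_spread_group))  # 2
```

Notice that the range depends only on the two most extreme answers. A single respondent with 17 children produces a range of 17 even though everyone else in that group has four children or fewer.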

Simple Statistics, Part C: Bivariate Analysis

As might be expected, bivariate analysis is simply the statistical analysis of data on two variables at the same time. The prefix bi refers to two, as in bicycle. Examples of this kind of analysis would include the earlier hypothetical study of the differences in GPA between males and females. (GPA is one variable, sex is the other.) If data were gathered on educational achievement and earned income, the researcher would have data on two different variables and could see if people with more education tended to earn higher incomes or not. (Education is one variable, income is the second.)

Much of the time, bivariate data is presented to the reader in table form such that the values of one variable are the columns of the table (the vertical lines) and the values of the other variable are the rows (the horizontal lines). Table 8.1 shows hypothetical (fake) data on the variables education level completed (the columns) and annual earned income (the rows).

	High School Graduate	College Graduate	Master's Degree
Low Income	75%	25%	10%
Middle Income	20%	60%	50%
High Income	5%	15%	40%

Table 8.1. Annual earned income by education level completed

If one looks at the column labeled High School Graduate, it is seen that only 5% of those whose highest educational achievement was to graduate from high school were earning what was described as a High Income. On the other hand, 15% of those graduating from college and 40% of those with master's degrees fell into the High Income bracket. Thus, in these hypothetical data, getting a college degree, and especially a master's degree, is associated with a greater chance of earning a high income. This table, then, would be called a bivariate table because it presents the data on two different variables at the same time.
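A table like Table 8.1 is usually built from raw counts of respondents. The Python sketch below shows one way that might be done. The counts are invented and were chosen only so that the resulting column percentages match Table 8.1; the function name column_percentages is hypothetical.

```python
# Invented respondent counts chosen so the column percentages match Table 8.1
counts = {
    "High School Graduate": {"Low Income": 150, "Middle Income": 40, "High Income": 10},
    "College Graduate": {"Low Income": 25, "Middle Income": 60, "High Income": 15},
    "Master's Degree": {"Low Income": 5, "Middle Income": 25, "High Income": 20},
}

def column_percentages(table):
    """Convert each education column's counts into percentages that sum to 100."""
    result = {}
    for education, by_income in table.items():
        column_total = sum(by_income.values())
        result[education] = {
            income: 100 * n / column_total for income, n in by_income.items()
        }
    return result

percentages = column_percentages(counts)

# Print one row per income level, one column per education level
income_levels = ["Low Income", "Middle Income", "High Income"]
print("\t" + "\t".join(percentages))
for income in income_levels:
    cells = [f"{percentages[edu][income]:.0f}%" for edu in percentages]
    print(income + "\t" + "\t".join(cells))
```

The percentages are calculated within each column, so each education level adds up to 100%. That is what allows the High Income row to be compared fairly across groups of different sizes.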

Qualitative Data Analysis

The discussion of qualitative data analysis could quite easily become more philosophical and complex than is appropriate for an entry level course such as this one. Therefore the presentation here will be relatively simple and straightforward. The notion or concept of quality is often something that is harder to describe or explain than the concept of quantity. And the two are not totally distinct from each other.

To illustrate, suppose I ask you to tell me if one garment is of higher quality than another garment of the same purpose − for instance, a shirt. You may be able to quantify some aspects of high quality such as the number of threads per square inch of fabric or the thickness of the fabric in millimeters. But other aspects of quality may be more difficult to quantify. You may say, for instance, that one shirt simply feels more comfortable than another; it is "softer" or "smoother," without being able to put a number on softness or smoothness. One shirt may just "look better" than the other one. These are qualitative distinctions that usually require words to describe them rather than numbers.

When it comes to analyzing data, some data may be in numerical form such as percentages, mean, median, or mode. This data would typically be subject to quantitative analysis. Other data may take the form of what might be called a story. That is, the researcher may have observed someone doing something and that is the data that needs to be analyzed. Social science research is rich with research describing things such as what happened to the researcher when he or she went to live in an apartment building in the poorest section of New York City or Washington, D.C. Other social science researchers have joined motorcycle gangs, strange religious cults, and nudist colonies! In the academic field known as cultural anthropology, researchers often have travelled thousands of miles to live in villages in Africa, China, South America, or a Pacific island and written fascinating accounts of the daily life of the native inhabitants of these places. Their analysis of what they observed and experienced includes the food, housing, clothing, medical care, transportation, religious beliefs, government systems, marriage customs, and many other norms, values, and beliefs that do not lend themselves to being described in numbers. Rather, these research reports read like stories − they have characters, a plot, emotions, and sometimes suspenseful endings.

But as mentioned at the beginning of this section of the lecture, quantitative and qualitative analyses are not always totally separate. A researcher might report that, for example, 80% of the adult members of an African village are married but then go on to report that this particular culture allows a man to have multiple wives, so that a higher percentage of the women than of the men are married. This leads the researcher to describe the situations in which a man may take a second or third wife and what his obligations are to each wife, etc. So what starts as a quantitative analysis then blends into a qualitative analysis.

Conclusion

One of the final steps in a research project is to analyze one's data − the data collected earlier in the process using a survey or an experiment, or perhaps using field research or some unobtrusive method of gathering data. Most of the time, but not always, the data has been gathered in numerical or quantitative form and the researcher uses one or more statistical tests or calculations to analyze the data and form conclusions as to what the data show about the topic or question at hand.