Methods in Behavioral Research, Ch. 4

Chapter 4 Fundamental Research Issues


LEARNING OBJECTIVES

  • Define variable and describe the operational definition of a variable.

  • Describe the different relationships between variables: positive, negative, curvilinear, and no relationship.

  • Compare and contrast nonexperimental and experimental research methods.

  • Distinguish between an independent variable and a dependent variable.

  • Discuss the limitations of laboratory experiments and the advantage of using multiple methods of research.

  • Distinguish between construct validity, internal validity, and external validity.

In this chapter, we explore some of the basic issues and concepts that are necessary for understanding the scientific study of behavior. We will focus on the nature of variables and the relationships between variables. We also examine general methods for studying these relationships. Most important, we introduce the concept of validity in research.

VALIDITY: AN INTRODUCTION

You are likely aware of the concept of validity. You use the term when asking whether the information that you found on a website is valid. A juror must decide whether the testimony given in a trial is valid. Someone on a diet may wonder if the weight shown on the bathroom scale is valid. After a first date, your friend may try to decide whether her positive impressions of the date are valid. Validity refers to truth or accuracy. Is the information on the website true? Does the testimony reflect what actually happened? Is the scale really showing my actual weight? Should my friend believe that her positive impression is accurate? In all these cases, someone is confronted with information and must make a decision about the extent to which the information is valid.

Scientists are also concerned about the validity of their research findings. In this chapter, we introduce three key types of validity:

  • Construct validity concerns whether our methods of studying variables are accurate.

  • Internal validity refers to the accuracy of conclusions about cause and effect.

  • External validity concerns whether we can generalize the findings of a study to other populations and settings.

These issues will be described in greater depth in this and subsequent chapters. Before exploring issues of validity, we need to have a fundamental understanding of variables and the operational definition of variables.

VARIABLES

A variable is any event, situation, behavior, or individual characteristic that varies. Any variable must have two or more levels or values. Consider the following examples of variables that you might encounter in research and your own life. As you read a book, you encounter the variable of word length, with values defined by the number of letters of each word. You can take this one step further and think of the average word length used in paragraphs in the book. One book you read may use a longer average word length than another. When you think about yourself and your friends, you might categorize the people on a variable such as extraversion. Some people can be considered relatively low on the extraversion variable (or introverted); others are high on extraversion. You might volunteer at an assisted living facility and notice that the residents differ in their subjective well-being: Some of the people seem much more satisfied with their lives than others. When you are driving and the car in front of you brakes to slow down or stop, the period of time before you apply the brakes in your own car is called response time. You might wonder if response time varies depending on the driver's age or whether the driver is talking to someone using a cell phone. In your biology class, you are studying for a final exam that is very important and you notice that you are experiencing the variable of test anxiety. Because the test is important, everyone in your study group says that they are very anxious about it. You might remember that you never felt anxious when studying for quizzes earlier in the course. As you can see, we all encounter variables continuously in our lives even though we do not formally use the term. Researchers, however, systematically study variables.

Examples of variables a psychologist might study include cognitive task performance, depression, intelligence, reaction time, rate of forgetting, aggression, speaker credibility, attitude change, anger, stress, age, and self-esteem. For some variables, the values will have true numeric, or quantitative, properties. Values for the number of free throws made, the number of words correctly recalled, and the number of symptoms of major depression would all range from 0 upward. The values of other variables are not numeric, but instead simply identify different categories. An example is gender; the values for gender are male and female. These are different, but they do not differ in amount or quantity.

OPERATIONAL DEFINITIONS OF VARIABLES

A variable such as aggression, cognitive task performance, pain, self-esteem, or even word length must be defined in terms of the specific method used to measure or manipulate it. The operational definition of a variable is the set of procedures used to measure or manipulate it.

A variable must have an operational definition to be studied empirically. The variable bowling skill could be operationalized as a person's average bowling score over the past 20 games, or it could be operationalized as the number of pins knocked down in a single roll. Such a variable is concrete and easily operationalized in terms of score or number of pins. But things become more complicated when studying behavior. For example, a variable such as pain is very general and more abstract. Pain is a subjective state that cannot be directly observed, but that does not mean that we cannot create measures to infer how much pain someone is experiencing. A common pain measurement instrument in both clinical and research settings is the McGill Pain Questionnaire, which has both a long form and a short form (Melzack, 2005). The short form includes a 0 to 5 scale with the descriptors no pain, mild, discomforting, distressing, horrible, and excruciating. There is also a line with end points of no pain and worst possible pain; the person responds by making a mark at the appropriate place on the line. In addition, the questionnaire offers sensory descriptors such as throbbing, shooting, and stabbing; each of these descriptors has a rating of none, mild, moderate, or severe. This is a relatively complex set of questions and is targeted for use with adults. When working with children over age 3, a better measurement instrument is the Wong-Baker FACES™ Pain Rating Scale, which presents a series of cartoon faces ranging from smiling (no hurt) to crying (hurts worst).

Using the FACES scale, a researcher could ask a child, “How much pain do you feel? Point to how much it hurts.” These examples illustrate that the same variable of pain can be studied using different operational definitions.

There are two important benefits in operationally defining a variable. First, the task of developing an operational definition of a variable forces scientists to discuss abstract concepts in concrete terms. The process can result in the realization that the variable is too vague to study. This realization does not necessarily indicate that the concept is meaningless, but rather that systematic research is not possible until the concept can be operationally defined.

In addition, operational definitions help researchers communicate their ideas to others. If someone wishes to tell me about aggression, I need to know exactly what is meant by this term because there are many ways of operationally defining it. For example, aggression could be defined as (1) the number and duration of shocks delivered to another person, (2) the number of times a child punches an inflated toy clown, (3) the number of times a child fights with other children during recess, (4) homicide statistics gathered from police records, (5) a score on a personality measure of aggressiveness, or even (6) the number of times a batter is hit with a pitch during baseball games. Communication with another person will be easier if we agree on exactly what we mean when we use the term aggression in the context of our research.

Of course, a very important question arises once a variable is operationally defined: How good is the operational definition? How well does it match up with reality? How well does my average bowling score really represent my skill?

Construct Validity

Construct validity refers to the adequacy of the operational definition of variables: Does the operational definition of a variable actually reflect the true theoretical meaning of the variable? If you wish to scientifically study the variable of extraversion, you need some way to measure extraversion. Psychologists have developed measures that ask people whether they like to socialize with strangers or whether they prefer to avoid such situations. Do the answers on such a measure provide a good or “true” indication of the underlying variable of extraversion? If you are studying anger, will telling male college students that females had rated them unattractive create feelings of anger? Researchers are able to address these questions when they design their studies and examine the results.

RELATIONSHIPS BETWEEN VARIABLES

Many research studies investigate the relationship between two variables: Do the levels of the two variables vary systematically together? For example, does playing violent video games result in greater aggressiveness? Is physical attractiveness related to a speaker's credibility? As age increases, does the amount of cooperative play increase as well?

Recall that some variables have true numeric values whereas the levels of other variables are simply different categories (e.g., female versus male; being a student-athlete versus not being a student-athlete). This distinction will be expanded upon in Chapter 5, Measurement Concepts. For the purposes of describing relationships among variables, we will begin by discussing relationships in which both variables have true numeric properties.

When both variables have values along a numeric scale, many different “shapes” can describe their relationship. We begin by focusing on the four most common relationships found in research: the positive linear relationship, the negative linear relationship, the curvilinear relationship, and, of course, the situation in which there is no relationship between the variables. These relationships are best illustrated by line graphs that show the way changes in one variable are accompanied by changes in a second variable. The four graphs in Figure 4.1 show these four types of relationships.

Positive Linear Relationship

In a positive linear relationship, increases in the values of one variable are accompanied by increases in the values of the second variable. In Chapter 1, Scientific Understanding of Behavior, we described a positive relationship between communicator credibility and persuasion; higher levels of credibility are associated with greater attitude change. Consider another communicator variable, rate of speech. Are “fast talkers” more persuasive? In a study conducted by Smith and Shaffer (1991), students listened to a speech delivered at a slow (144 words per minute), intermediate (162 wpm), or fast (214 wpm) speech rate. The speaker advocated a position favoring legislation to raise the legal drinking age; the students initially disagreed with this position. Graph A in Figure 4.1 shows the positive linear relationship between speech rate and attitude change that was found in this study. That is, as rate of speech increased, so did the amount of attitude change. In a graph like this, we see a horizontal and a vertical axis, termed the x axis and y axis, respectively. Values of the first variable are placed on the horizontal axis, labeled from low to high. Values of the second variable are placed on the vertical axis. Graph A shows that higher speech rates are associated with greater amounts of attitude change.


FIGURE 4.1

Four types of relationships between variables

Negative Linear Relationship

Variables can also be negatively related. In a negative linear relationship, increases in the values of one variable are accompanied by decreases in the values of the other variable. Latané, Williams, and Harkins (1979) were intrigued with reports that increasing the number of people working on a task may actually reduce group effort and productivity. The researchers designed an experiment to study this phenomenon, which they termed “social loafing” (which you may have observed in group projects!). The researchers asked participants to clap and shout to make as much noise as possible. They did this alone or in groups of two, four, or six people. Graph B in Figure 4.1 illustrates the negative relationship between number of people in the group and the amount of noise made by each person. As the size of the group increased, the amount of noise made by each individual decreased. The two variables are systematically related, just as in a positive relationship; only the direction of the relationship is reversed.

Curvilinear Relationship

In a curvilinear relationship, increases in the values of one variable are accompanied by systematic increases and decreases in the values of the other variable. In other words, the direction of the relationship changes at least once. This type of relationship is sometimes referred to as a nonmonotonic function.

Graph C in Figure 4.1 shows a curvilinear relationship. This particular relationship is called an inverted-U. A number of inverted-U relationships are described by Grant and Schwartz (2011). Graph C depicts the relationship between number of extraverts in a group and performance. Having more extraverts in a team is associated with higher performance, but only up to a point. With too many extraverts, the relationship becomes negative: there is less and less task focus, and team performance declines. Of course, it is also possible to have a U-shaped relationship. Research on the relationship between age and happiness indicates that adults in their 40s are less happy than younger and older adults (Blanchflower & Oswald, 2008). A U-shaped curve results when this relationship is graphed.
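As a toy numeric sketch (the quadratic form and all numbers below are invented for illustration; they are not Grant and Schwartz's data), an inverted-U can be modeled as an outcome that rises toward a peak and then falls:

```python
def inverted_u(x, peak=5.0, maximum=25.0):
    """Toy inverted-U: the outcome rises as x approaches the peak, then falls."""
    return maximum - (x - peak) ** 2

# Hypothetical "number of extraverts" levels and the resulting team performance
levels = list(range(0, 11))
performance = [inverted_u(x) for x in levels]

# Performance increases up to the peak (x = 5) and decreases afterward,
# so the direction of the relationship changes once: a nonmonotonic function.
```

Plotting `performance` against `levels` would reproduce the inverted-U shape of Graph C; flipping the sign of the quadratic term would instead produce the U-shaped age-and-happiness pattern.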

No Relationship

When there is no relationship between the two variables, the graph is simply a flat line. Graph D in Figure 4.1 illustrates the relationship between crowding and task performance found in a series of studies by Freedman, Klevansky, and Ehrlich (1971). Unrelated variables vary independently of one another. Increases in crowding were not associated with any particular changes in performance; thus, a flat line describes the lack of relationship between the two variables.

You rarely hear about variables that are not related. In large U.S. surveys, the size of the community in which a person lives is not related to number of reported poor mental health days or amount of Internet use. Usually, such findings are just not as interesting as results that do show a relationship so there is little reason to focus on them (although research that does not support a widely assumed relationship may receive attention, e.g., a common medical procedure is found to be ineffective). We will examine “no relationship” findings in greater detail in Chapter 13.

Remember that these are general patterns. Even if, in general, a positive linear relationship exists, it does not necessarily mean that everyone who scores high on one variable will also score high on the second variable. Individual deviations from the general pattern are likely. In addition to knowing the general type of relationship between two variables, it is also necessary to know the strength of the relationship. That is, we need to know the size of the correlation between the variables. Sometimes two variables are strongly related to each other and show little deviation from the general pattern. Other times the two variables are not highly correlated because many individuals deviate from the general pattern. A numerical index of the strength of the relationship between variables is called a correlation coefficient. Correlation coefficients are important because they tell us how strongly variables are related to one another. Correlation coefficients are discussed in detail in Chapter 12.
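As a brief sketch of how a correlation coefficient is computed (the data below are invented for illustration; they are not Smith and Shaffer's results), the Pearson coefficient divides the covariation of two variables by the product of their individual variabilities:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient: strength and direction of a linear relationship."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariation: how much the two variables vary together
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Variability of each variable on its own
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical speech rates (wpm) and attitude-change scores for five listeners
speech_rate = [144, 162, 214, 150, 200]
attitude_change = [2.0, 3.1, 4.5, 2.4, 4.0]
r = pearson_r(speech_rate, attitude_change)
```

A coefficient near +1 or −1 indicates that the data deviate little from the general linear pattern; a coefficient near 0 indicates many individual deviations and hence a weak relationship.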

Table 4.1 provides an opportunity to review types of relationships—for each example, identify the shape of the relationship as positive, negative, or curvilinear.

TABLE 4.1 Identify the type of relationship


Relationships and Reduction of Uncertainty

When we detect a relationship between variables, we reduce uncertainty by increasing our understanding of the variables we are examining. The term uncertainty implies that there is randomness in events; scientists refer to this as random variability in events that occur. Research can reduce random variability by identifying systematic relationships between variables.

Identifying relationships between variables may seem complex, but a simple example makes the process easier to see. For this example, the variables will have no quantitative properties—we will not describe increases in the values of variables but only differences in values—in this case, whether a person is an active user of Facebook. Suppose you ask 200 students at your school to tell you whether they are active Facebook users. Now suppose that 100 students said Yes and the remaining 100 said No. What do you do with this information? You know only that there is variability in people's use of Facebook—some people are regular users and some are not.

This variability is called random variability. If you walked up to anyone at your school and tried to guess whether the person was a Facebook user, you would have to make a random guess—you would be right about half the time and wrong half the time (because we know that 50% of the people are regular users and 50% are not, any guess you make will be right about half the time). However, if we could explain the variability, it would no longer be random. How can the random variability be reduced? The answer is to see if we can identify variables that are related to Facebook use.

Suppose you also asked people to indicate their gender—whether they are male or female. Now let's look at what happens when you examine whether gender is related to Facebook use. Table 4.2 shows one possible outcome. Note that there are 100 males and 100 females in the study. The important thing, though, is that 40 of the males say they regularly use Facebook compared to 60 of the females. Have we reduced the random variability? We clearly have. Before you had this information, there would be no way of predicting whether a given person would be a regular Facebook user. Now that you have the research finding, you can predict that any given female will use Facebook and that any given male will not. With this rule you will be right about 60% of the time; this is a clear increase from the 50% you would achieve when everything was random.

TABLE 4.2 Gender and Facebook use

Is there still random variability? The answer is clearly yes. You will be wrong about 40% of the time, and you do not know when you will be wrong. For unknown reasons, some males will say they are regular users of Facebook and some females will not. Can you reduce this remaining uncertainty? The quest to do so motivates additional research. With further studies, you may be able to identify other variables that are also related to Facebook use. For example, variables such as income and age may also be related to Facebook use.

This discussion underscores once again that relationships between variables are rarely perfect: There are males and females who do not fit the general pattern. The relationship between the variables is stronger when there is less random variability—for example, if 90% of females and 10% of males were Facebook users, the relationship would be much stronger (with less uncertainty or randomness).
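The arithmetic behind this reduction in uncertainty can be sketched directly from the counts in the chapter's hypothetical survey (Table 4.2):

```python
# Counts from the chapter's hypothetical survey of 200 students (Table 4.2)
users = {"male": 40, "female": 60}       # regular Facebook users
nonusers = {"male": 60, "female": 40}    # non-users

total = sum(users.values()) + sum(nonusers.values())   # 200 people

# Without any other information, guessing "user" (or "nonuser") for everyone
# is right for exactly half the sample: pure random variability.
baseline_accuracy = sum(users.values()) / total

# Rule using gender: predict "user" for every female, "nonuser" for every male.
correct = users["female"] + nonusers["male"]
rule_accuracy = correct / total
```

Here `baseline_accuracy` is 0.50 and `rule_accuracy` is 0.60, matching the figures in the text; substituting the stronger hypothetical split of 90% of females and 10% of males would push the rule's accuracy to 0.90, reflecting the smaller amount of remaining random variability.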

NONEXPERIMENTAL VERSUS EXPERIMENTAL METHODS

How can we determine whether variables are related? There are two general approaches to the study of relationships among variables, the nonexperimental method and the experimental method. With the nonexperimental method, relationships are studied by making observations or measures of the variables of interest. This may be done by asking people to describe their behavior, directly observing behavior, recording physiological responses, or even examining various public records such as census data. In all these cases, variables are observed as they occur naturally. A relationship between variables is established when the two variables vary together. For example, the relationship between class attendance and course grades can be investigated by obtaining measures of these variables in college classes. A review of many studies that did this concluded that attendance is indeed related to grades (Credé, Roch, & Kieszczynka, 2010).

The second approach to the study of relationships, the experimental method, involves direct manipulation and control of variables. The researcher manipulates the first variable of interest and then observes the response. For example, Ramirez and Beilock (2011) were interested in the anxiety produced by important “high-stakes” examinations. Because such anxiety may impair performance, it is important to find ways to reduce the anxiety. In their research, Ramirez and Beilock tested the hypothesis that writing about testing worries would improve performance on the exam. In their study, they used the experimental method. All students took a math test and were then given an opportunity to take the test again. To make this a high-stakes test, students were led to believe that the monetary payout to themselves and their partner was dependent on their performance. The writing variable was then manipulated. Some students spent 10 minutes before taking the test writing about what they were thinking and feeling about the test. The other students constituted a control group; these students simply sat quietly for 10 minutes prior to taking the test. The new, high-stakes test was then administered. The researchers found that students in the writing condition improved their scores; the control group's scores actually decreased. With the experimental method, the two variables do not merely vary together; one variable is introduced first to determine whether it affects the second variable.

Nonexperimental Method

Suppose a researcher is interested in the relationship between exercise and anxiety. How could this topic be studied? Using the nonexperimental method, the researcher would devise operational definitions to measure both the amount of exercise that people engage in and their level of anxiety. There could be a variety of ways of operationally defining either of these variables; for example, the researcher might simply ask people to provide self-reports of their exercise patterns and current anxiety level. Now suppose that the researcher collects data on exercise and anxiety from a number of people and finds that exercise is negatively related to anxiety—that is, the people who exercise more also have lower levels of anxiety. The two variables covary, or correlate, with each other: Observed differences in exercise are associated with differences in anxiety. Because the nonexperimental method allows us to observe covariation between variables, another term that is frequently used to describe this procedure is the correlational method. With this method, we examine whether the variables correlate or vary together.

The nonexperimental method seems to be a reasonable approach to studying relationships between variables such as exercise and anxiety. A relationship is established by finding that the two variables vary together—the variables covary or correlate with each other. However, this method is not ideal when we ask questions about cause and effect. We know the two variables are related, but what can we say about the causal impact of one variable on the other? There are two problems with making causal statements when the nonexperimental method is used: (1) it can be difficult to determine the direction of cause and effect and (2) researchers face the third-variable problem—that is, extraneous variables may be causing an observed relationship. These problems are illustrated in Figure 4.2. The arrows linking variables depict direction of causation.

Direction of cause and effect The first problem involves direction of cause and effect. With the nonexperimental method, it is difficult to determine which variable causes the other. In other words, it cannot really be said that exercise causes a reduction in anxiety. Although there are plausible reasons for this particular pattern of cause and effect, there are also reasons why the opposite pattern might occur (both causal directions are shown in Figure 4.2). Perhaps high anxiety causes people to reduce exercise. The issue here is one of temporal precedence; it is very important in making causal inferences (see Chapter 1). Knowledge of the correct direction of cause and effect in turn has implications for applications of research findings: If exercise reduces anxiety, then undertaking an exercise program would be a reasonable way to lower one's anxiety. However, if anxiety causes people to stop exercising, simply forcing someone to exercise is not likely to reduce the person's anxiety level.


FIGURE 4.2

Causal possibilities in a nonexperimental study

The problem of direction of cause and effect is not the most serious drawback to the nonexperimental method, however. Scientists have pointed out, for example, that astronomers can make accurate predictions even though they cannot manipulate variables in an experiment. In addition, the direction of cause and effect is often not crucial because, for some pairs of variables, the causal pattern may operate in both directions. For instance, there seem to be two causal patterns in the relationship between the variables of similarity and liking: (1) Similarity causes people to like each other, and (2) liking causes people to become more similar. In general, the third-variable problem is a much more serious fault of the nonexperimental method.

The third-variable problem When the nonexperimental method is used, there is the danger that no direct causal relationship exists between the two variables. Exercise may not influence anxiety, and anxiety may have no causal effect on exercise; this would be known as a spurious relationship. Instead, there may be a relationship between the two variables because some other variable causes both exercise and anxiety. This is known as the third-variable problem.

A third variable is any variable that is extraneous to the two variables being studied. Any number of other third variables may be responsible for an observed relationship between two variables. In the exercise and anxiety example, one such third variable could be income level. Perhaps high income allows people more free time to exercise (and the ability to afford a health club membership!) and also lowers anxiety. Income acting as a third variable is illustrated in Figure 4.2. If income is the determining variable, there is no direct cause-and-effect relationship between exercise and anxiety; the relationship was caused by the third variable, income level. The third variable is an alternative explanation for the observed relationship between the variables. Recall from Chapter 1 that the ability to rule out alternative explanations for the observed relationship between two variables is another important factor when we try to infer that one variable causes another.

The fact that third variables could be operating is a serious problem, because third variables introduce alternative explanations that reduce the overall validity of a study. The fact that income could be related to exercise means that income level is an alternative explanation for an observed relationship between exercise and anxiety. The alternative explanation is that high income reduces anxiety level, so exercise has nothing to do with it.

When we actually know that an uncontrolled third variable is operating, we can call the third variable a confounding variable. If two variables are confounded, they are intertwined so you cannot determine which of the variables is operating in a given situation. If income is confounded with exercise, income level will be an alternative explanation whenever you study exercise. Fortunately, there is a solution to this problem: the experimental method provides us with a way of controlling for the effects of third variables.

As you can see, direction of cause and effect and potential third variables represent serious limitations of the nonexperimental method. Often, they are not considered in media reports of research results. For instance, a newspaper may report the results of a nonexperimental study that found a positive relationship between amount of coffee consumed and likelihood of a heart attack. Obviously, there is not necessarily a cause-and-effect relationship between the two variables. Numerous third variables (e.g., occupation, personality, or genetic predisposition) could cause both a person's coffee-drinking behavior and the likelihood of a heart attack. In sum, the results of such studies are ambiguous and should be viewed with skepticism.

Experimental Method

The experimental method reduces ambiguity in the interpretation of results. With the experimental method, one variable is manipulated and the other is then measured. The manipulated variable is called the independent variable and the variable that is measured is termed the dependent variable. If a researcher used the experimental method to study whether exercise reduces anxiety, exercise would be manipulated—perhaps by having one group of people exercise each day for a week and another group refrain from exercise for a week. Anxiety would then be measured. Suppose that people in the exercise group have less anxiety than the people in the no-exercise group. The researcher could now say something about the direction of cause and effect: In the experiment, exercise came first in the sequence of events. Thus, anxiety level could not influence the amount of exercise that the people engaged in.

Another characteristic of the experimental method is that it attempts to eliminate the influence of all potential confounding third variables on the dependent variable. This is generally referred to as control of extraneous variables. Such control is usually achieved by making sure that every feature of the environment except the manipulated variable is held constant. Any variable that cannot be held constant is controlled by making sure that the effects of the variable are random. Through randomization, the influence of any extraneous variables is equal in the experimental conditions. Both procedures are used to ensure that any differences between the groups are due to the manipulated variable.

Experimental control In an experiment, all extraneous variables are kept constant. This is called experimental control. If a variable is held constant, it cannot be responsible for the results of the experiment. In other words, any variable that is held constant cannot be a confounding variable. In the experiment on the effect of exercise, the researcher would want to make sure that the only difference between the exercise and no-exercise groups is the exercise. For example, because people in the exercise group are removed from their daily routine to engage in exercise, the people in the no-exercise group should be removed from their daily routine as well. Otherwise, the lower anxiety in the exercise condition could have resulted from the “rest” from the daily routine rather than from the exercise.

Experimental control is accomplished by treating participants in all groups in the experiment identically; the only difference between groups is the manipulated variable. In the Loftus experiment on memory (discussed in Chapter 2), both groups witnessed the same accident, the same experimenter asked the questions in both groups, the lighting and all other conditions were the same, and so on. When a difference occurred between the groups in reporting memory, researchers could be sure that the difference was the result of the method of questioning rather than of some other variable that was not held constant.

Randomization The number of potential confounding variables is infinite, and sometimes it is difficult to keep a variable constant. The most obvious such variable is any characteristic of the participants. Consider an experiment in which half the research participants are in the exercise condition and the other half are in the no-exercise condition; the participants in the two conditions might be different on some extraneous, third variable such as income. This difference could cause an apparent relationship between exercise and anxiety. How can the researcher eliminate the influence of such extraneous variables in an experiment?

The experimental method eliminates the influence of such variables by randomization. Randomization ensures that an extraneous variable is just as likely to affect one experimental group as it is to affect the other group. To eliminate the influence of individual characteristics, the researcher assigns participants to the two groups in a random fashion. In actual practice, this means that assignment to groups is determined using a list of random numbers. To understand this, think of the participants in the experiment as forming a line. As each person comes to the front of the line, a random number is assigned, much like random numbers are drawn for a lottery. If the number is even, the individual is assigned to one group (e.g., exercise); if the number is odd, the subject is assigned to the other group (e.g., no exercise). By using a random assignment procedure, the researcher can be confident that the characteristics of the participants in the two groups will be virtually identical. In this “lottery,” for instance, people with low, medium, and high incomes will be distributed equally in the two groups. Indeed, randomization makes it very likely that the two groups will be virtually identical on every individual characteristic. This ability to randomly assign research participants to the conditions in the experiment is an important difference between the experimental and nonexperimental methods.
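The even/odd decision rule described above can be sketched in a few lines of Python. This is only an illustration; the participant labels are hypothetical, and in practice researchers often use a random-number table or a site such as www.randomizer.org:

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

# Hypothetical participant labels
participants = ["P01", "P02", "P03", "P04", "P05", "P06", "P07", "P08"]
exercise_group, no_exercise_group = [], []

for person in participants:
    number = random.randint(0, 99)   # stand-in for a random-number list
    if number % 2 == 0:              # even -> exercise condition
        exercise_group.append(person)
    else:                            # odd -> no-exercise condition
        no_exercise_group.append(person)

print("Exercise group:   ", exercise_group)
print("No-exercise group:", no_exercise_group)
```

Because assignment depends only on the random number, no characteristic of the participant can systematically influence which condition he or she ends up in.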

To make the concept of random assignment more concrete, you might try an exercise such as the one we did with a box full of old baseball cards. The box contained cards of 50 American League players and 50 National League players. The cards were thoroughly mixed up; we then proceeded to select 32 of the cards and assign them to “groups” using a sequence of random numbers obtained from a website that generates random numbers (www.randomizer.org). As each card was drawn, we used the following decision rule: If the random number is even, the player is assigned to Group 1, and if the number is odd, the player is assigned to Group 2. We then checked to see whether the two groups differed in terms of league representation. Group 1 had nine American League players and seven National League players, whereas Group 2 had an equal number of players from the two leagues. The two groups were virtually identical!
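The baseball-card exercise can also be simulated. The sketch below re-creates the setup described above (50 American League and 50 National League cards, 32 drawn and assigned by even/odd random numbers) and then tallies each group's league composition; the exact counts will differ from ours because the random numbers differ:

```python
import random

random.seed(42)  # fixed seed for reproducibility

# 50 American League (AL) and 50 National League (NL) cards, mixed up
deck = ["AL"] * 50 + ["NL"] * 50
random.shuffle(deck)
drawn = deck[:32]  # select 32 cards

group1, group2 = [], []
for card in drawn:
    number = random.randint(0, 99)   # stand-in for the random-number sequence
    (group1 if number % 2 == 0 else group2).append(card)

print("Group 1:", group1.count("AL"), "AL vs.", group1.count("NL"), "NL")
print("Group 2:", group2.count("AL"), "AL vs.", group2.count("NL"), "NL")
```

Running this repeatedly (with different seeds) shows that random assignment tends to produce groups with very similar league composition, just as it tends to equalize participant characteristics in an experiment.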

Any other variable that cannot be held constant is also controlled by randomization. For instance, many experiments are conducted over a period of several days or weeks, with participants arriving for the experiment at various times during each day. In such cases, the researcher uses a random order for scheduling the sequence of the various experimental conditions. This procedure prevents a situation in which one condition is scheduled during the first days of the experiment whereas the other is studied during later days. Similarly, participants in one group will not be studied only during the morning and the others only in the afternoon.
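Randomizing the schedule of conditions can be sketched the same way. The session counts below are hypothetical; the point is simply that shuffling prevents either condition from clustering in the early or late days of the experiment:

```python
import random

random.seed(7)  # fixed seed for reproducibility

# 20 hypothetical session slots, half per condition
sessions = ["exercise"] * 10 + ["no exercise"] * 10
random.shuffle(sessions)  # random order across the data-collection period

for slot, condition in enumerate(sessions, start=1):
    print(f"Session {slot:2d}: {condition}")
```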

Direct experimental control (holding variables constant across conditions) and randomization (equalizing the influence of variables that cannot be held constant) eliminate the influence of extraneous variables. Thus, the experimental method allows a relatively unambiguous interpretation of the results. Any difference between groups on the observed variable can be attributed to the influence of the manipulated variable.

Page 87

Internal Validity and the Experimental Method

Internal validity is the ability to draw conclusions about causal relationships from the results of a study. A study has high internal validity when strong inferences can be made that one variable caused changes in the other variable. We have seen that strong causal inferences can be made more easily when the experimental method is used.

Recall from Chapter 1 that inferences of cause and effect require three elements. So, strong internal validity requires an analysis of these three elements:

  • First, there must be temporal precedence: The causal variable should come first in the temporal order of events and be followed by the effect. The experimental method addresses temporal order by first manipulating the independent variable and then observing whether it has an effect on the dependent variable. In other situations, you may observe the temporal order or you may logically conclude that one order is more plausible than another.

  • Second, there must be covariation between the two variables. Covariation is demonstrated with the experimental method when participants in an experimental condition (e.g., an exercise condition) show the effect (e.g., a reduction in anxiety), whereas participants in a control condition (e.g., no exercise) do not show the effect.

  • Third, there is a need to eliminate plausible alternative explanations for the observed relationship. An alternative explanation is based on the possibility that some confounding third variable is responsible for the observed relationship. When designing research, a great deal of attention is paid to eliminating alternative explanations, because doing so brings us closer to truth. Indeed, eliminating alternative explanations improves internal validity. The experimental method addresses such explanations by holding potential confounding variables constant (experimental control) and equalizing their influence across conditions (random assignment).

Other issues of control will be discussed in later chapters. The main point here is that inferences about causal relationships are stronger when there are fewer alternative explanations for the observed relationships.
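To make the covariation element concrete, here is a minimal sketch comparing the two conditions of the exercise experiment described earlier. The anxiety scores are invented for illustration only:

```python
from statistics import mean

# Hypothetical anxiety scores (lower = less anxious) for each condition
exercise = [4, 5, 3, 6, 4, 5]
no_exercise = [7, 8, 6, 9, 7, 8]

# Covariation: the conditions differ on the measured (dependent) variable
difference = mean(no_exercise) - mean(exercise)
print(f"Mean anxiety (exercise):    {mean(exercise):.2f}")   # 4.50
print(f"Mean anxiety (no exercise): {mean(no_exercise):.2f}") # 7.50
print(f"Difference:                 {difference:.2f}")        # 3.00
```

A difference in means like this demonstrates covariation; temporal precedence and the elimination of alternative explanations come from the design of the experiment, not from the arithmetic.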

Independent and Dependent Variables

When researchers study the relationship between variables, the variables are usually conceptualized as having a cause-and-effect connection. That is, one variable is considered to be the cause and the other variable the effect. Thus, speaker credibility is viewed as a cause of attitude change, and exercise is viewed as having an effect on anxiety. Researchers using both experimental and nonexperimental methods view the variables in this fashion, even though, as we have seen, there is less ambiguity about the direction of cause and effect when the experimental method is used. Researchers use the terms independent variable and dependent variable when referring to the variables being studied. The variable that is considered to be the cause is called the independent variable, and the variable that is the effect is called the dependent variable. It is often helpful to actually draw a relationship between the independent and dependent variables using an arrow as we did in Figure 4.2. The arrow always indicates your hypothesized causal sequence:

In an experiment, the manipulated variable is the independent variable. After manipulating the independent variable, the researchers measure a second variable, called the dependent variable. The basic idea is that the researchers make changes in the independent variable and then see if the dependent variable changes in response.

One way to remember the distinction between the independent and dependent variables is to relate the terms to what happens to participants in an experiment. First, the participants are exposed to a situation, such as watching a violent versus a nonviolent program or exercising versus not exercising. This is the manipulated variable. It is called the independent variable because the participant has nothing to do with its occurrence; the researchers vary it independently of any characteristics of the participant or situation.

In the next step of the experiment, the researchers want to see what effect the independent variable had on the participant; to find this out, they measure the dependent variable. In this step, the participant is responding to what happened to him or her; whatever the participant does or says, the researcher assumes must be caused by—or be dependent on—the effect of the independent (manipulated) variable. The independent variable, then, is the variable manipulated by the experimenter, and the dependent variable is the participant's measured behavior, which is assumed to be caused by the independent variable.

When the relationship between an independent and a dependent variable is plotted in a graph, the independent variable is always placed on the horizontal axis and the dependent variable is always placed on the vertical axis. If you look back to Figure 4.1, you will see that this graphing method was used to present the four relationships. In Graph B, for example, the independent variable, “Group Size,” is placed on the horizontal axis; the dependent variable, “Amount of Noise,” is placed on the vertical axis.

Note that some research focuses primarily on the independent variable, with the researcher studying the effect of a single independent variable on numerous behaviors. Other researchers may focus on a specific dependent variable and study how various independent variables affect that one behavior. To make this distinction more concrete, consider a study of the effect of jury size, the independent variable, on the outcome of a trial, the dependent variable. One researcher studying this issue might be interested in the effect of group size on a variety of behaviors, including jury decisions and risk taking among business managers. Another researcher, interested solely in jury decisions, might study the effects of many aspects of trials, such as jury size and the judge's instructions, on juror behavior. Both emphases lead to important research. Figure 4.3 presents an opportunity to test your knowledge of the types of variables we have described.

Page 89

FIGURE 4.3

Identify the relevant variables

CHOOSING A METHOD

The advantages of the experimental method for studying relationships between variables have been emphasized. However, there are disadvantages to experiments and many good reasons for using methods other than experiments. Let's examine some of the issues that arise when choosing a method.

External Validity and the Artificiality of Experiments

The external validity of a study is the extent to which the results can be generalized to other populations and settings. In thinking about external validity, several questions arise: Can the results of a study be replicated with other operational definitions of the variables? Can the results be replicated with different participants? Can the results be replicated in other settings?

When examining a single study, we find internal validity to be generally in conflict with external validity. A researcher interested in establishing that there is a causal relationship between variables is most interested in internal validity. An experiment would be designed in which the independent variable is manipulated and other variables are kept constant (experimental control). This is most easily done in a laboratory setting, often with a highly restricted sample such as college students drawn from introductory psychology classes. This procedure permits relatively unambiguous inferences concerning cause and effect and reduces the possibility that extraneous variables could influence the results. These unambiguous inferences are another way of saying “strong internal validity.” Laboratory experimentation is an extremely valuable way to study many problems. However, the high degree of control and the laboratory setting may sometimes create an artificial atmosphere that may limit the external validity of the results. So, although laboratory experiments often have strong internal validity, they may often have limited external validity. For this reason, researchers may decide to use a nonexperimental procedure to study relationships among variables.

Another alternative is to try to conduct an experiment in a field setting. In a field experiment, the independent variable is manipulated in a natural setting. As in any experiment, the researcher attempts to control extraneous variables via either randomization or experimental control. As an example of a field experiment, consider Lee, Schwarz, Taubman, and Hou's (2010) study on the impact that public sneezing had on perceptions of risk resulting from flu. On a day that swine flu received broad media attention, students on a university campus were exposed to a confederate who either sneezed and coughed, or did not, as participants passed. Afterward, participants were asked to complete a measure of perceived risks in order to help on a “class project.” The researchers conducted a similar study at shopping malls and local businesses. In all three cases, they found that participants who were exposed to sneezing and coughing perceived higher risk of contracting a serious disease, having a heart attack prior to age 50, and dying from a crime or accident.

Many other field experiments take place in public spaces such as street corners, shopping malls, and parking lots. Ruback and Juieng (1997) measured the amount of time drivers in a parking lot took to leave their space under two conditions: (1) when another car (driven by the experimenter) waited a few spaces away or (2) when no other car was present. As you might expect, drivers took longer to leave when a car was waiting for the space. Apparently, the motive to protect a temporary territory is stronger than the motive to leave as quickly as possible! The advantage of the field experiment is that the independent variable is investigated in a natural context. The disadvantage is that the researcher loses the ability to directly control many aspects of the situation. For instance, in the parking lot, there are other shoppers in the area and security guards who might drive by. The laboratory experiment permits researchers to more easily keep extraneous variables constant, thereby eliminating their influence on the outcome of the experiment. Of course, it is precisely this control that leads to the artificiality of the laboratory investigation. Fortunately, when researchers have conducted experiments in both lab and field settings, the results of the experiments have been very similar (Anderson, Lindsay, & Bushman, 1999).

Page 91

Ethical and Practical Considerations

Sometimes the experimental method is not a feasible alternative because experimentation would be either unethical or impractical. Child-rearing practices would be impractical to manipulate with the experimental method, for example. Further, even if it were possible to randomly assign parents to two child-rearing conditions, such as using withdrawal of love versus physical types of punishment, the manipulation would be unethical. Instead of manipulating variables such as child-rearing techniques, researchers usually study them as they occur in natural settings. Many important research areas present similar problems—for example, studies of the effects of alcoholism, divorce and its consequences, or the impact of corporal punishment on children's aggressiveness. Such problems need to be studied, and generally the only techniques possible are nonexperimental.

When such variables are studied, people are often categorized into groups based on their experiences. When studying corporal punishment, for example, one group would consist of individuals who were spanked as children and another group would consist of people who were not. This is sometimes called an ex post facto design. Ex post facto means “after the fact”—the term was coined to describe research in which groups are formed on the basis of some actual difference rather than through random assignment as in an experiment. It is extremely important to study these differences. However, it is important to recognize that this is nonexperimental research because there is no random assignment to the groups and no manipulation of an independent variable.

Participant Variables

Participant variables (also called subject variables and personal attributes) are characteristics of individuals, such as age, gender, ethnic group, nationality, birth order, personality, or marital status. These variables are by definition nonexperimental and so must be measured. For example, to study a personality characteristic such as extraversion, you might have people complete a personality test that is designed to measure this variable. Such variables may be studied in experiments along with manipulated independent variables (see Chapter 10).

Description of Behavior

A major goal of science is to provide an accurate description of events. Thus, the goal of much research is to describe behavior; in those cases, causal inferences are not relevant to the primary goals of the research. A classic example of descriptive research in psychology comes from the work of Jean Piaget, who carefully observed the behavior of his own children as they matured. He described in detail the changes in their ways of thinking about and responding to their environment (Piaget, 1952). Piaget's descriptions and his interpretations of his observations resulted in an important theory of cognitive development that greatly increased our understanding of this topic. Piaget's theory had a major impact on psychology that continues today (Flavell, 1996).

A more recent example of descriptive research in psychology is Meston and Buss's (2007) study on the motives for having sex. The purpose of the study was to describe the “multitude of reasons that people engage in sexual intercourse” (p. 496). In the study, 444 male and female college students were asked to list the reasons why they had engaged in sexual intercourse in the past. The researchers combed through the answers and identified 237 reasons including “I was attracted to the person,” “I wanted to feel loved,” “I wanted to make up after a fight,” and “I wanted to defy my parents.” The next step for the researchers was to categorize the reasons that their participants reported for having sex, including physical reasons (such as attraction) and goal attainment reasons (such as revenge). In this case, as with some of Piaget's work, the primary goal was to describe behavior rather than to understand its causes.

Successful Predictions of Future Behavior

In many real-life situations, a major concern is to make a successful prediction about a person's future behavior—for example, success in school, ability to learn a new job, or probable interest in various major fields in college. In such circumstances, there may be no need to be concerned about issues of cause and effect. It is possible to design measures that increase the accuracy of predicting future behavior. School counselors can give tests to decide whether students should be in “enriched” classroom programs, employers can test applicants to help determine whether they should be hired, and college students can take tests that help them decide on a major. These types of measures can lead to better decisions for many people. When researchers develop measures designed to predict future behavior, they must conduct research to demonstrate that the measure does, in fact, relate to the behavior in question. This research will be discussed in Chapter 5.

Advantages of Multiple Methods

Perhaps most important, complete understanding of any phenomenon requires study using multiple methods, both experimental and nonexperimental. No method is perfect, and no single study is definitive. To illustrate, consider a hypothesis developed by Frank and Gilovich (1988). They were intrigued by the observation that the color black represents evil and death across many cultures over time, and they wondered whether this has an influence on our behavior. They noted that several professional sports teams in the National Football League and National Hockey League wear black uniforms and hypothesized that these teams might be more aggressive than other teams in the leagues.

They first needed an operational definition of “black” and “nonblack” uniforms; they decided that a black uniform is one in which 50% or more of the uniform is black. Using this definition, five NFL and five NHL teams had black uniforms. They first asked people who had no knowledge of the NFL or NHL to view each team's uniform and then rate the teams on “malevolent” adjectives such as “mean” and “aggressive.” Overall, the black-uniform teams were rated as more malevolent. They then compared the penalty yards of NFL black and nonblack teams and the penalty minutes of NHL teams. In both cases, black teams were assessed more penalties. But is there a causal pattern? Frank and Gilovich discovered that two NHL teams had switched uniforms from nonblack to black, so they compared penalty minutes before and after the switch; consistent with the hypothesis, penalties did increase for both teams. They also looked at the penalty minutes of a third team that had changed from a nonblack color to another nonblack color and found no change in penalty minutes. Note that none of these studies used the experimental method. In an experiment to test the hypothesis that people perceive black uniform teams as more aggressive, students watched videos of two plays from a staged football game in which the defense was wearing either black or white. Both plays included an aggressive act by the defense. On these plays, the students penalized the black uniform team more than the nonblack team. In a final experiment to see whether being on a black uniform team would increase aggressiveness, people were brought into the lab in groups of three. The groups were told they were a “team” that would be competing with another team. All members of the team were given either white or black clothing to wear for the competition; they were then asked to choose the games they would like to have for the competition. Some of the games were aggressive (“dart gun duel”) and some were not (“putting contest”). As you might expect by now, the black uniform teams chose more aggressive games.

The important point here is that no study is a perfect test of a hypothesis. However, when multiple studies using multiple methods all lead to the same conclusion, our confidence in the findings and our understanding of the phenomenon are greatly increased.

EVALUATING RESEARCH: SUMMARY OF THE THREE VALIDITIES

The key concept of validity was introduced at the outset of this chapter. Validity refers to “truth” and the accurate representation of information. Research can be described and evaluated in terms of three types of validity:

  • Construct validity refers to the adequacy of the operational definitions of variables.

  • Internal validity refers to our ability to accurately draw conclusions about causal relationships.

  • External validity is the extent to which results of a study can be generalized to other populations and settings.

Each gives us a different perspective on any particular research investigation, and every research study should be evaluated on these aspects of validity.

At this point, you may be wondering how researchers select a methodology to study a problem. A variety of methods are available, each with advantages and disadvantages. Researchers select the method that best enables them to address the questions they wish to answer. No method is inherently superior to another. Rather, the choice of method is made after considering the problem under investigation, ethics, cost and time constraints, and issues associated with the three types of validity. In the remainder of this book, many specific methods will be discussed, all of which are useful under different circumstances. In fact, all are necessary to understand the wide variety of behaviors that are of interest to behavioral scientists. Complete understanding of any problem or issue requires research using a variety of methodological approaches.

ILLUSTRATIVE ARTICLE: STUDYING BEHAVIOR

Many people have had the experience of anticipating something bad happening to them: “I'm not going to get that job” or “I'm going to fail this test” or “She'll laugh in my face if I ask her out!” Do you think that anticipating a negative outcome means that a person is less distressed when a negative outcome occurs? That is, is it better to think “I'm going to fail” if, indeed, you may fail?

In a study published by Golub, Gilbert, and Wilson (2009), two experiments and a field study were conducted in an effort to determine whether this negative expectation is a good thing or a bad thing.

In the two laboratory studies, participants were asked to complete a personality assessment and were then led to have positive, negative, or no expectations about the results. Participants' affective (emotional) state was assessed prior to—and directly after—hearing a negative (in the case of study 1a) or positive (in the case of study 1b) outcome. In the field study, participants were undergraduate introductory psychology students who were asked about their expectations of their performance in an upcoming exam. Then, a day after the exam, positive and negative emotion was assessed. Taken together, the results of these three studies suggest that anticipating bad outcomes may be an ineffective path to positive emotion.

First, acquire and read the article:

Golub, S. A., Gilbert, D. T., & Wilson, T. D. (2009). Anticipating one's troubles: The costs and benefits of negative expectations. Emotion, 9, 277–281. doi:10.1037/a0014716

Then, after reading the article, consider the following:

  1. For each of the studies, how did Golub, Gilbert, and Wilson (2009) operationally define the positive expectations? How did they operationally define affect?

  2. In experiments 1a and 1b, what were the independent variable(s)? What were the dependent variable(s)?

  3. This article includes three different studies. In this case, what are the advantages to answering the research question using multiple methods?

  4. On what basis did the authors conclude, “our studies suggest that the affective benefits of negative expectations may be more elusive than their costs” (p. 280)?

  5. Evaluate the external validity of the two experiments and one field study that Golub, Gilbert, and Wilson (2009) conducted.

  6. How good was the internal validity?

Study Terms

Confounding variable (p. 84)

Construct validity (p. 73)

Correlation coefficient (p. 79)

Curvilinear relationship (p. 76)

Dependent variable (p. 84)

Experimental control (p. 85)

Experimental method (p. 81)

External validity (p. 73)

Field experiment (p. 90)

Independent variable (p. 84)

Internal validity (p. 73)

Negative linear relationship (p. 76)

Nonexperimental method (correlational method) (p. 81)

Operational definition (p. 74)

Participant (subject) variable (p. 91)

Positive linear relationship (p. 76)

Randomization (p. 86)

Third-variable problem (p. 83)

Variable (p. 73)

Review Questions

  1. What is a variable? List at least five different variables and then describe at least two levels of each variable. For example, age is a variable. For adults, age has values that can be expressed in years starting at 18 and ranging upward. In an actual study, the age variable might be measured by asking for actual age in years, asking for year of birth, or providing a choice of age ranges such as 18–34, 35–54, and 55+. Sentence length is a variable. The values might be defined by the number of words in sentences that participants write in an essay.

  2. Define “operational definition” of a variable. Give at least two operational definitions of the variables you thought of in the previous review question.

  3. Distinguish among positive linear, negative linear, and curvilinear relationships.

  4. What is the difference between the nonexperimental method and the experimental method?

  5. What is the difference between an independent variable and a dependent variable?

  6. Distinguish between laboratory and field experiments.

  7. What is meant by the problem of direction of cause and effect and the third-variable problem?

  8. How do direct experimental control and randomization influence the possible effects of extraneous variables?

  9. What are some reasons for using the nonexperimental method to study relationships between variables?

Activities

  1. The dictionary definition of shy is “being reserved or having or showing nervousness or timidity in the company of other people.” Create three different operational definitions of shy and provide a critique of each one. Example: An operational definition of shy could be the number of new people that a person reports meeting in a given day. Critique: What if an outgoing person has a job that requires them to meet very few people? They may be considered (incorrectly) shy by this operational definition.

  2. Males and females may differ in their approaches to helping others. For example, males may be more likely to help a person having car trouble, and females may be more likely to bring dinner to a sick friend. Develop two operational definitions for the concept of helping behavior, one that emphasizes the “male style” and the other the “female style.” How might the use of one or the other lead to different conclusions from experimental results regarding who helps more, males or females? What does this tell you about the importance of operational definitions?

  3. You observe that classmates who get good grades tend to sit toward the front of the classroom, and those who receive poorer grades tend to sit toward the back. What are three possible cause-and-effect relationships for this nonexperimental observation?

  4. Consider the hypothesis that stress at work causes family conflict at home.

a. What type of relationship is proposed (e.g., positive linear, negative linear)?

b. Graph the proposed relationship.

c. Identify the independent variable and the dependent variable in the statement of the hypothesis.

d. How might you investigate the hypothesis using the experimental method?

e. How might you investigate the hypothesis using the nonexperimental method (recognizing the problems of determining cause and effect)?

f. What factors might you consider in deciding whether to use the experimental or nonexperimental method to study the relationship between work stress and family conflict?

  5. Identify the independent and dependent variables in the following descriptions of experiments:

a. Students watched a cartoon either alone or with others and then rated how funny they found the cartoon to be.

b. A comprehension test was given to students after they had studied textbook material either in silence or with the television turned on.

c. Some elementary school teachers were told that a child's parents were college graduates, and other teachers were told that the child's parents had not finished high school; they then rated the child's academic potential.

d. Workers at a company were assigned to one of two conditions: One group completed a stress management training program; another group of workers did not participate in the training. The number of sick days taken by these workers was examined for the two subsequent months.

  6. A few years ago, newspapers reported a finding that Americans who have a glass of wine a day are healthier than those who have no wine (or who have a lot of wine or other alcohol). What are some plausible alternative explanations for this finding; that is, what variables other than wine could explain the finding? (Hint: What sorts of people in the United States are most likely to have a glass of wine with dinner?)

  7. The limitations of nonexperimental research were dramatically brought to the attention of the public by the results of an experiment on the effects of postmenopausal hormone replacement therapy (part of a larger study known as the Women's Health Initiative). In medical research, an experiment is called a clinical trial. In this clinical trial, participants were randomly assigned to receive either the hormone replacement therapy (estrogen plus progestin) or a placebo (no hormones). In 2002, the investigators concluded that women taking the hormone replacement therapy had a higher incidence of heart disease than did women in the placebo (no hormone) condition. At that point, they stopped the experiment and informed both the participants and the public that they should talk with their physicians about the advisability of this therapy. The finding contrasted dramatically with the results of nonexperimental research, in which women taking hormones had a lower incidence of heart disease; in those studies, researchers compared women who were already taking the hormones with women not taking hormones. Why do you think the results differed between the experimental and the nonexperimental research?

Answers

TABLE 4.1:

positive, negative, curvilinear, negative, positive, positive, curvilinear, negative

FIGURE 4.3:

Independent variable = music condition

Dependent variable = exam score

Potential confounding variables =

use of headphones (worn only by participants in the music condition)