U8D1-60 - Reliability and Validity in Quantitative Research

Unit 8 INTRODUCTION

In quantitative research, we use the concepts of validity and reliability to describe how well a study measured what it was supposed to measure and how much we can trust the accuracy of its results. Qualitative research addresses the same basic issues, but there they are referred to as credibility and dependability, respectively.

Validity as a Unifying Concept

The process of reviewing literature and interpreting research in light of other studies revolves around the concept of validity, one of the central concepts in research design. What is validity? There is no consensus regarding the definition of validity, and validity in measurement entails different considerations from validity in research design (Pedhazur & Schmelkin, 1991). At its core, however, validity involves truth, or how accurately an instrument or study captures what it purports to examine. In Chapter 4, Leedy and Ormrod (2013) present different aspects of validity in relation to measurement and research design.

Validity in Measurement

In measurement, validity is frequently discussed in terms of construct, content, and criterion-related validity, although validity in measurement is a unitary concept. Content and criterion-related validity are traditional classifications, but they should be considered aspects of the larger and unifying concept of construct validity (Pedhazur & Schmelkin, 1991). Samuel Messick, who has written extensively in the area of validity, has argued passionately for the use of construct validity as a comprehensive concept, which can be broken down into at least six different facets (Messick, 1995).

Regardless of how construct validity is divided, it is critical to understand that the construct validity of an instrument is never proven. It is only accorded varying degrees of credibility. In other words, validity is not an inherent property of a test or type of research design. Rather, validity refers to the appropriateness of interpretations or inferences made based on test scores or assessment data. In measurement, validity concerns the meaning and context of test scores.

Validity in Research Design

Many of the same considerations apply to validity in research designs. In Chapter 4, Leedy and Ormrod (2013) discuss internal and external validity in research design. Many other authors (for example, Cook & Campbell, 1979; Parker, 1990) add the concepts of statistical conclusion validity and construct validity, which we discuss briefly below.

Statistical conclusion validity refers to the appropriate use of statistics in deriving conclusions from a study, that is, whether the researchers have likely drawn an accurate conclusion from the statistical analysis. Put another way, statistical conclusion validity is enhanced when the researcher has made neither a Type I nor a Type II error (both covered in Unit 3). Threats to statistical conclusion validity include the following:

• Low statistical power, which was discussed in Unit 4.

• Alpha inflation (that is, "fishing" for statistically significant results).

• Low reliability of measures.

• Low reliability of treatment implementation.

• Nuisance variables in the experimental environment.

• Random but salient differences of respondents (Parker, 1990).

• False acceptance of the null hypothesis (Cook & Campbell, 1979).
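The alpha-inflation threat listed above can be made concrete with a little arithmetic. The sketch below assumes the significance tests are independent and all null hypotheses are true, which real studies rarely satisfy exactly; the function name is illustrative and does not come from the course text.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one Type I error across num_tests independent tests.

    Assumes every null hypothesis is true and the tests are independent.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


# Each test uses the conventional .05 level, yet the chance of at least
# one false positive grows quickly as more tests are run.
for num_tests in (1, 5, 10, 20):
    rate = familywise_error_rate(0.05, num_tests)
    print(f"{num_tests:>2} tests -> familywise error rate {rate:.3f}")
```

Under these assumptions, running many tests and reporting only the significant ones makes a spurious finding far more likely than the nominal alpha suggests.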

Construct validity, when applied to research design, refers to "whether the variable is adequately defined and accurately measured by the instruments, procedures, manipulations, and methods employed in the study" (Parker, 1990, p. 617). Threats to construct validity in the design of research include the following:

• Inadequate pre-operational explication of constructs (constructs poorly defined).

• Mono-operational bias (construct measured with only one operational definition).

• Mono-method bias (all constructs and treatments measured using only one method).

• Hypothesis-guessing (participants guess purpose of experiment and change their behavior accordingly).

• Test or treatment anxiety among participants.

• Experimenter expectancies (the experimenter unwittingly influences the participant's behavior).

• Generalizing across time (not indicating the length of time, strength, or duration of the treatment effect).

• Interaction of procedures and treatments.

Threats to internal, external, statistical conclusion, and construct validity vary by type of quantitative research design. Leedy and Ormrod (2013) discuss ways to overcome threats to internal and external validity in their discussion of different types of quantitative research design. In general, the tighter the controls a researcher implements in an experiment, the less likely the findings are to generalize beyond it. In other words, higher internal validity comes at the cost of lower external validity.

Validity in Qualitative Research

Thus far, our discussion of validity has involved quantitative research designs. Validity is no less important in qualitative research, though it has different implications, and it is often referred to as credibility or transferability. Maxwell (1997) proposed two main threats to validity in qualitative studies:

• Researcher bias.

• Reactivity, or the effect of the researcher on the environment or people studied.

He recommended that qualitative researchers ask the question "How might I be wrong?" They should then systematically examine potential threats to the validity of their conclusions. Maxwell suggested the following qualitative validity tests, several of which overlap with Leedy and Ormrod's (2013) list:

• Search for discrepant evidence and negative cases.

• Triangulate conclusions; that is, compare information from multiple sources and avenues.

• Approach threats to validity as events, and then search for clues to support their presence or influence on outcomes.

• Solicit feedback from others.

• Use member checks (Leedy and Ormrod call this respondent validation); that is, check the accuracy of your conclusions with the participants studied.

• Provide data rich and detailed enough to provide a full and revealing picture.

• Use simple statistics to summarize results.

• Compare your conclusions to conclusions drawn from similar research.

Reliability

Reliability refers to the consistency or reproducibility of scores or measurements across different times, settings, or people. Reliability is a necessary but not sufficient condition for validity. In other words, the reliability of an instrument for a given purpose must be established before further claims are made in support of its validity.

Reliability indices can vary from 0.00 to 1.00, with higher values indicating greater reliability.

Reliability indices are estimates. They represent an attempt to quantify the measurement error associated with a score. An important characteristic of a research project is the degree to which the procedures and results can be duplicated. Two factors that can have an effect on the repeatability of a project are the accuracy and precision with which the research was conducted. The statistical concept of reliability refers to the constructs of accuracy and precision.

As described in Chapter 4 of Leedy and Ormrod (2013), validity refers to the extent to which you are measuring or studying what you purport to measure; in other words, validity involves demonstrating the credibility of your findings. In developing a research plan, you must establish that your research has merit or meaning. An essential part of establishing the credibility of your research involves ruling out competing explanations for your findings.

This process is known as addressing the threats to validity. All researchers must address these threats.

Researchers must detail the steps taken to guard against certain threats and then establish why their conclusions are warranted. Researchers also need to discuss threats that remain and include these as limitations of the study.

As part of this process, particularly for quantitative designs, researchers must also create a convincing case that the results are reliable. Recall that reliability is a necessary, but not sufficient, condition for validity. Reliability refers to the consistency, precision, or reproducibility of scores. Like validity, reliability is not an inherent property of an instrument. An instrument may be more or less reliable with different populations. Technically, reliability is defined as the ratio of the variance in true scores to the variance in observed scores, where a true score is the score a person would obtain if measurement were free of error. An observed score can be broken down into two parts, the true score and error. Reliability indices are designed to provide an estimate of the amount of measurement error associated with scores.
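The variance-ratio definition above can be written out as a short derivation. This is a sketch in classical test theory notation; the symbols X, T, E, and r_{XX'} are assumptions introduced here, and the second line additionally assumes that error is uncorrelated with the true score.

```latex
\begin{align*}
X &= T + E
  && \text{observed score = true score + error} \\
\sigma_X^2 &= \sigma_T^2 + \sigma_E^2
  && \text{assuming } T \text{ and } E \text{ are uncorrelated} \\
r_{XX'} &= \frac{\sigma_T^2}{\sigma_X^2}
         = 1 - \frac{\sigma_E^2}{\sigma_X^2}
  && \text{reliability as a variance ratio}
\end{align*}
```

Read this way, a reliability index of .80 would mean that an estimated 20 percent of the observed-score variance is attributable to measurement error.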

Types of Reliability

Several types of reliability indices exist. The most common indices are test-retest, alternate-forms, internal consistency, and interrater. They vary according to the type of generalization you are interested in making (Salvia & Ysseldyke, 1991).

• Test-retest reliability provides an index of the stability of scores (that is, the extent to which observed scores can be generalized to different times).

• Alternate-forms reliability provides a measure of stability and consistency of response across different items.

• Internal consistency reliability refers to the extent to which one can generalize to different samples of items. Frequently, an instrument is divided in half and a reliability coefficient is computed on the relationship between the two halves, called a split-half reliability coefficient. Because such a procedure underestimates reliability, split-half coefficients are usually adjusted using the Spearman-Brown formula or other methods developed by Rulon or Guttman.

• Interrater agreement or interrater reliability involves the extent to which obtained scores can be generalized to different observers or raters. In other words, would other observers obtain the same scores (Salvia & Ysseldyke, 1991)? To obtain reliable observational data, you must describe exactly what will be measured, how it will be measured, what the observation intervals will be, how observers will be trained, and how you will regularly evaluate interrater agreement. The simplest and most common form of interrater agreement is percent agreement. This is computed as the number of observations in which the observers agree divided by the total number of observations. Interrater reliability is the correlation between the scores obtained by two different raters (Salvia & Ysseldyke, 1991). A minimum benchmark for this type of reliability is 80 percent agreement or a reliability index of .80.
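The two interrater indices described in the last item above can be computed in a few lines. The sketch below uses made-up ratings, and the function names are illustrative rather than taken from the course text; the correlation is an ordinary Pearson coefficient written out by hand.

```python
import math


def percent_agreement(rater_a, rater_b):
    """Observations on which both raters agree, divided by all observations."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Raters need the same non-zero number of observations.")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return agreements / len(rater_a)


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical interval codes (1 = behavior observed, 0 = not observed).
codes_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
codes_b = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

# Hypothetical continuous ratings of the same five participants.
scores_a = [12, 15, 9, 20, 17]
scores_b = [11, 16, 10, 19, 18]

print(f"Percent agreement: {percent_agreement(codes_a, codes_b):.0%}")
print(f"Interrater correlation: {pearson_r(scores_a, scores_b):.2f}")
```

In this made-up example the raters agree on 8 of 10 intervals, which just meets the 80 percent benchmark mentioned above.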

Some Factors Affecting Reliability

In describing the reliability of data, researchers consider the potential influence of the following factors (Salvia & Ysseldyke, 1991):

• Test length (the longer the test, the more reliable the results).

• Test-retest interval (the longer the interval, the less reliable the results).

• Test-taker guessing (the more correct guesses, the lower the reliability).

• Variation within testing situations (for example, fatigue, misunderstanding directions, inadvertently filling out the dots on the answer sheet incorrectly, and so on will reduce the reliability of scores).

• Time limits, in cases in which the purpose of the test is not to assess speed.

• Group homogeneity (the greater the homogeneity, the lower the reliability coefficient).
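The test-length factor in the list above can be illustrated with the Spearman-Brown formula, the same formula named earlier for adjusting split-half coefficients. The sketch below assumes the added items are comparable to the existing ones; the starting reliability of .70 and the function name are hypothetical.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after changing test length by length_factor.

    Assumes any added items are comparable to the existing ones.
    A length_factor of 2 gives the usual split-half correction.
    """
    return (length_factor * reliability) / (
        1.0 + (length_factor - 1.0) * reliability
    )


# Hypothetical correlation between two halves of an instrument.
half_test_r = 0.70
print(f"Split-half corrected: {spearman_brown(half_test_r, 2):.2f}")

# Predicted reliability as a test with reliability .70 is lengthened.
for factor in (1, 2, 3, 4):
    print(f"length x{factor}: {spearman_brown(0.70, factor):.2f}")
```

The predicted gains shrink with each added block of items, so lengthening a test improves reliability with diminishing returns.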

In determining the reliability of the findings, researchers need to establish the reliability of the instruments selected for the group of people studied and the procedures implemented. In other words, researchers need to assess the amount of measurement error associated with their data and their procedures. This information is critical to the interpretation of the data and, ultimately, to the credibility of the findings.

Reliability in Qualitative Research

Just as credibility and transferability roughly correspond to the concepts of internal and external validity in quantitative research, reliability has its counterparts in qualitative research as well. We refer to these as confirmability and dependability. Reliability, with its emphasis on replication, does not apply well to qualitative inquiries because each inquiry is unique, something that is inherent in the process. However, as we saw in Units 6 and 7, the rigor and detailed data-collection process required by qualitative approaches provide what we refer to as an "audit trail" in qualitative research. A thorough account of the interactions (such as verbatim transcripts, behavioral observations, and so on) that make up the data in qualitative research provides the researcher with a confirmable account of the process (Lincoln & Guba, 1985). Dependability comes into play in the record-keeping procedures of the qualitative researcher. The audit trail would be a record of the coding procedures used for the data. The detailed account of this coding and how it was done would provide us with an understanding of how dependably the codes were applied to the data.

In summary, revisit the concepts of reliability and validity throughout the research process, from design to implementation to analysis, and finally, to dissemination. In so doing, you provide a solid foundation for the conclusions you draw.

References

Cook, T., & Campbell, D. (1979). Quasi-experimentation: Design and analysis issues for field settings. Chicago, IL: Rand McNally.

Leedy, P. D., & Ormrod, J. E. (2013). Practical research: Planning and design (10th ed.). Upper Saddle River, NJ:


Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Thousand Oaks, CA: Sage.

Maxwell, J. A. (1997). Designing a qualitative study. In L. Bickman & D. Rog, Handbook of Social Science Research (pp. 69–100). Thousand Oaks, CA: Sage.

Messick, S. M. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741–749.

Parker, R. M. (1990). Power, control, and validity in research. Journal of Learning Disabilities, 23 (10), 613–620.

Pedhazur, E. J., & Schmelkin, L. P. (1991). Measurement, design, and analysis: An integrated approach. Hillsdale, NJ: Lawrence Erlbaum.

Salvia, J., & Ysseldyke, J. E. (1991). Assessment. Boston, MA: Houghton Mifflin Co.

OBJECTIVES

To successfully complete this learning unit, you will be expected to:

1. Explore the concepts of reliability and validity as they apply to both qualitative and quantitative research.

2. Delineate how reliability and validity are linked to scientific merit.

[u08s1] Unit 8 Study 1

STUDIES

Readings

Read the introduction to the unit, Reliability and Validity. This will introduce you to the concepts of reliability and validity as they apply to both qualitative and quantitative research.

Use your Leedy and Ormrod text to complete the following:

• Review Chapter 4, "Planning Your Research Project," pages 74–115. This chapter discusses how a researcher plans research so that threats to validity and reliability are reduced.

Use the Research Library to complete the following:

• Read Onwuegbuzie and Leech's 2007 article, "Validity and Qualitative Research: An Oxymoron?" from Quality & Quantity, volume 41, issue 2, pages 233–249. This article discusses the concept of validity and how it is applied to qualitative research.

PSY Learners Additional Required Reading

In addition to the other required study activities for this unit, PSY learners are also required to complete the following:

Use the Research Library to complete the following:

• Read Johnson's 1997 article, "Examining the Validity Structure of Qualitative Research," from Education, volume 118, issue 2, pages 282–293. This article focuses on validity in qualitative research.

• Read McDermott's 2002 article, "Experimental Methods in Political Science," from Annual Review of Political Science, volume 5, issue 1, pages 31–62. This article covers validity issues, especially as they relate to quantitative experimental approaches.

[u08d1] Unit 8 Discussion 1

RELIABILITY AND VALIDITY IN QUANTITATIVE RESEARCH

Resources

Discussion Participation Scoring Guide.

APA Style and Format.

Research Library.

Persistent Links and DOIs.

Validity and Qualitative Research: An Oxymoron?

Using a quantitative article that you previously selected, post the following information:

• Describe the variables used in the research and how they were measured. Describe any assessments that were used.

• Discuss the types of reliability (test-retest, interrater, parallel forms, split-half) and validity (content, construct, criterion) that were reported for the measurement instrument.

• Using your textbook and the quantitative validity article assigned in this unit's studies, linked in Resources, construct an external validity checklist and an internal validity checklist, noting how your selected article did address or could have addressed each issue.

• Discuss how the reliability and validity in the research contribute to the scientific merit of the research.

• Post the persistent link for the article in your response. Refer to the Persistent Links and DOIs guide, linked in Resources, to learn how to locate this information in the library databases.

• Cite all sources in APA style and provide an APA-formatted reference list at the end of your post.