Psychology question 3

All notes reference the following citation:

Salkind, N. J. (2012). Exploring research (8th ed.). Upper Saddle River, NJ: Pearson.

  1. Explain the use of testing and assessment in psychology. CH6

Finally, keep in mind that methods vary widely in the time it takes to learn how to use them, in the measurement process itself, and in what you can do with the information once you have collected it. For example, an interview might be appropriate to determine how teachers feel about changes in the school administration, but interviewing would not be very useful if you were interested in assessing physical strength. So, here is an overview of a variety of measurement tools. Like any other tool, use the one you choose well and you will be handsomely rewarded. Likewise, if you use the tool incorrectly, the job may not get done at all, and even if it does, the quality and value of your finished report will be less than what you expected. The way in which you ask your research question will determine the method you use to assess the variables you are studying. What better place to start than with the measurement method that all of us have been exposed to time and again: the good ol' test?

Tests and Their Development

In the most general terms, the purpose of a test is to measure the nature and the extent of individual differences. For example, you might want to assess teenagers' knowledge of how AIDS is transmitted. Or you may be interested in differences that exist on some measure of personality, such as the Myers–Briggs Type Indicator, or an intelligence test such as the Wechsler Intelligence Scales. Tests are also instruments that distinguish among people on such measures as reaction time, physical strength, agility, or the strategy someone selects to solve a problem. Not all tests use paper and pencil, and as just mentioned, the technique that a researcher uses to assess a behavior often reflects that researcher's creativity. A good test should be able to differentiate people from one another reliably based on their true scores.

Before continuing, here are just a few words of clarification. The word "test" is used throughout this chapter to indicate a tool or technique to assess behavior, but it should not be used synonymously with the term "dependent variable." Although you may use a test to assess some outcome, you may also use it for categorization or classification purposes. For example, if you want to investigate the effectiveness of two treatments (behavior therapy and medication, for example) on obsessive-compulsive disorders, you would first use the results of a test to categorize subjects into severe or mild categories and then use another assessment to evaluate the effectiveness of each treatment.

  1. What is multidimensional scaling, why is it used, and what is it used for? E Reading


Journal Reading: Determining the significance of scale values from multidimensional scaling profile analysis using a resampling method

Ding, C. S. (2005). Determining the significance of scale values from multidimensional scaling profile analysis using a resampling method. Behavior Research Methods, 37(1), 37-47.

http://search.proquest.com.contentproxy.phoenix.edu/docview/204325417/fulltext/CC6A3D44E75B4529PQ/1?accountid=458


Although multidimensional scaling (MDS) profile analysis is widely used to study individual differences, there is no objective way to evaluate the statistical significance of the estimated scale values. In the present study, a resampling technique (bootstrapping) was used to construct confidence limits for scale values estimated from MDS profile analysis. These bootstrap confidence limits were used, in turn, to evaluate the significance of marker variables of the profiles. The results from analyses of both simulation data and real data suggest that the bootstrap method may be valid and may be used to evaluate hypotheses about the statistical significance of marker variables of MDS profiles. [Publication abstract]

Multidimensional scaling (MDS) is a technique developed in the behavioral and social sciences for studying the structure of objects or people. It has been used to study the perceptional structure of people (e.g., Goodrum, 2001; McWhirter, Palombi, & Garbin, 2000; Tittle, 1996), the vocational interest of college students (Johnson, 1995), test content and validity (Sireci & Geisinger, 1992), evaluation in health-related professions (Raymond, 1989), latent profiles of personality test batteries (Davison, Gasser, & Ding, 1996), and the cognitive organization of perceptions (Treat et al., 2002). MDS has proven useful to researchers in many fields, including education, health, marketing, psychology, and sociology.

For various reasons, however, the particular strengths of MDS are not well understood. Many consider MDS to be equivalent to factor analysis (assuming that factor analysis will provide the same information as MDS). To be sure, MDS and factor analysis are closely related, and they are sometimes used to study the same issues, especially in data reduction applications. Nevertheless, the objectives of MDS profile analysis differ from those of factor analysis. MDS profile analysis is particularly useful in studying individual differences, not only differences between variables within a population. For example, many studies have involved multidimensional scaling of psychological variables (e.g., Padula, Conoley, & Garbin, 1998; Thomas & Stock, 1988), but the data were not considered in terms of profiles. In other words, these studies examined the dimensions as constructs among a set of variables, with variables being considered as common indicators of a particular construct, as in factor analysis.

Thus, although MDS analysis is not a new concept, the profile analysis of people from a MDS analysis solution is rather new (Davison, 1994). In this application, the focus of the analysis is on identifying major prototypical profiles of people in a population. An example of such a study would be to identify how adolescents differ with respect to a set of psychosocial adjustment variables, such as aggressiveness, hyperactivity, school motivation, coping skills, peer relationships, and self-esteem. Applying an MDS profile analysis approach to this research issue, the focus becomes the following. What typical profiles of adjustment actually exist in adolescents? To what degree do the profiles of individuals in a sample correspond to these "normative" profiles-that is, the fit of the model to an individual's data? It should be noted that these variables are not considered as common indicators of a psychosocial construct(s); rather, how these variables are related to one another along profiles is considered.

In this article, MDS profile analysis will be described only briefly, because several researchers (e.g., Davison, 1994; Davison et al., 1996) have discussed the procedures of MDS profile analysis in detail. The major goal of the article is to build on the previous research on profile analysis and deal with the issues that have not been resolved in those studies. Specifically, this article focuses on determining the statistical significance of scale values from MDS profile analysis. On the basis of a deterministic MDS model, MDS profile analysis involves estimating variable parameters (called scale values) from a proximity matrix computed for every pair of variables. This type of proximity is called derived proximity, in contrast to direct proximity (Borg & Groenen, 1997; Davison, 1983). The scale values estimated from an MDS analysis are then used to define a profile.

Thus far, however, there are neither standard errors of the estimates nor any test statistics associated with the variable scale values estimated from such a deterministic MDS analysis. Without the standard error or a similar test statistic as an objective criterion, it is difficult to determine which variables in the profile can be used to define the profile (such variables are called marker variables). The degree of confidence that a researcher can place in the interpretation of the profiles is then entirely subjective, since there is no way to examine whether the variable scale values of a particular size in the profiles are significantly different from zero. That is, one needs to test the hypothesis that the scale values obtained from an MDS profile analysis are not due to chance or to statistical manipulation. This is an important step in aiding interpretation of the profiles and, therefore, warrants consideration.

In recent MDS analysis literature, there have been very few developments on hypothesis testing regarding scale values from a nonmetric MDS model. As far as the author knows, there have been no published studies with respect to the examination of standard errors or confidence limits of the scale values derived from the deterministic nonmetric MDS model. Weinberg, Carroll, and Cohen (1984) is the only study that used jackknife and bootstrap methods to investigate the validity of the standard errors of scale values estimated from the probabilistic MDS model, using maximum likelihood estimation procedures. However, there are some unique aspects of their study that are not applicable to MDS profile analysis. First, the data they used for bootstrap and jack-knife estimation were based on ratio production estimates of the dissimilarities; that is, the data was generated on the basis of an assumption of ratio scale measurement in their model. In most psychological behavior studies, ratio scale measurement is not warranted.

Second, the analysis for their study was conducted using MULTISCALE, a maximum likelihood based estimation procedure (Ramsay, 1977). Strong distributional assumptions are made by MULTISCALE, which include log normal distribution of the data. Thus, the data for their simulation study were generated on the basis of these assumptions. As Weinberg et al. (1984) have pointed out, further research is needed to expand the results from the bootstrap technique by using different data and different models. On the other hand, they suggested that the bootstrap technique is feasible in complex methods of analysis regarding the accuracy of the scale values in multidimensional spaces.

As has been indicated, MDS profile analysis is based on the commonly used nonmetric MDS model that does not involve any strong distributional assumptions or ratio scale measurement. The major purpose of the present study was to extend the domain of applicability of the bootstrap found in a variety of multivariate situations to the area of studying profiles of people by using real data of vocational interest, as well as simulated data, especially with respect to estimating the confidence intervals of scale values from MDS profile analysis. In the following sections, MDS profile analysis will be briefly described first. Next, use of the bootstrap method in determining the statistical significance of scale values will be described. Finally, an application of this approach in the study of vocational interest profile research, as well as a simulation study, will be presented.
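To make the resampling idea above more concrete, here is a minimal sketch of how a person-level bootstrap around a nonmetric MDS solution could be coded. This is not Ding's (2005) implementation: the use of scikit-learn's MDS, the Procrustes alignment of each replicate to the original configuration, and all function and variable names are illustrative assumptions.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.linalg import orthogonal_procrustes
from sklearn.manifold import MDS

def scale_values(data, n_dims=2, seed=0):
    """Nonmetric MDS scale values for the variables (columns) of a person-by-variable matrix."""
    # Derived proximities: Euclidean distances between every pair of variables, computed across people.
    proximities = squareform(pdist(data.T, metric="euclidean"))
    mds = MDS(n_components=n_dims, metric=False, dissimilarity="precomputed", random_state=seed)
    return mds.fit_transform(proximities)

def bootstrap_scale_value_ci(data, n_boot=500, n_dims=2, alpha=0.05, seed=0):
    """Percentile bootstrap confidence limits for scale values (people resampled with replacement)."""
    rng = np.random.default_rng(seed)
    n_people = data.shape[0]
    reference = scale_values(data, n_dims)
    replicates = []
    for _ in range(n_boot):
        resampled = data[rng.integers(0, n_people, size=n_people)]
        rep = scale_values(resampled, n_dims)
        # MDS solutions are only identified up to rotation/reflection, so align each
        # replicate to the reference configuration before pooling.
        rotation, _ = orthogonal_procrustes(rep, reference)
        replicates.append(rep @ rotation)
    replicates = np.stack(replicates)              # shape: (n_boot, n_variables, n_dims)
    lower = np.percentile(replicates, 100 * alpha / 2, axis=0)
    upper = np.percentile(replicates, 100 * (1 - alpha / 2), axis=0)
    return reference, lower, upper

# Toy usage: 200 simulated people measured on 6 adjustment variables.
data = np.random.default_rng(1).normal(size=(200, 6))
ref, lo, hi = bootstrap_scale_value_ci(data, n_boot=200)
markers = (lo > 0) | (hi < 0)   # scale values whose interval excludes zero
```

Scale values whose bootstrap confidence limits exclude zero would then be treated as candidate marker variables for interpreting the profile, in the spirit of the article.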

  1. Give a brief summary of what you think of this article, and is this type of approach beneficial for organizational change? E reading

WHEN THE PSYCHOMETRICS OF TEST DEVELOPMENT MEETS ORGANIZATIONAL REALITIES: A CONCEPTUAL FRAMEWORK FOR ORGANIZATIONAL CHANGE, EXAMPLES, AND RECOMMENDATIONS

Muchinsky, P. M. (2004). When the psychometrics of test development meets organizational realities: A conceptual framework for organizational change, examples, and recommendations. Personnel Psychology, 57(1), 175-209.


Our standards for the construction of psychological tests give scant attention to the organizational context in which the tests are to be used. This paper describes 10 psychometric issues associated with the development of an electrical job knowledge test. The test was designed to replace seniority as a means of making promotional decisions within an organization. The 10 test-related issues are presented as a means to understand the underlying process of organizational change associated with the implementation of the new test. It is suggested that a closer link between the science and practice of our profession can be attained by achieving a greater understanding of issues associated with the practical implementation of theory-based interventions.

I believe that the implementation of a testing program in an organization can be viewed as a case of organizational change. Thus, the organizational change literature can be a resource to facilitate the practice of psychological assessment. As opposed to thinking of scientists and practitioners as two separate camps governed by two different sets of values (which can happen), techniques of organizational change can help to merge these two roles. That is, organizational change strategies can serve to implement in practice concepts and principles developed through scientific research. I have come to conclude that the science of I-O psychology has a wealth of knowledge to offer the workworld. In turn, the workworld has a great need for help in issues that are germane to I-O psychology. The major obstacle is the "translation" of the science into practice. As such, I have come to regard organizational change strategies as not only desirable, but imperative, for the merger of science and practice to occur. Whenever attempts are made to change an organization, resistance is likely to manifest itself in one or more forms. The skilled scientist-practitioner must be adept in identifying the source of the resistance, but also must possess an array of strategies to help mitigate the resistance. Among the strategies I have found useful are education, shared responsibility, serving as a negotiator/facilitator, and the overt manifestation of respect and recognition for the knowledge possessed by all parties.

This paper describes a real-life organizational context in which the principles of test construction were directed toward the development of a job knowledge test. Ten issues pertaining to test construction and use serve as the basis for a broader conceptual understanding of the organizational change process. However, the 10 issues also provide a context for understanding how the scientific principles of test construction can be challenged in developing a product that meets with organizational appeal. That is, the paper frames the process of test development within an organization into a larger theoretical context, that of organizational change. Specific recommendations are proposed to enhance the likelihood of attaining the twin criteria of organizational acceptance and psychometric integrity.

Background

The organization in question is a national leader (by size) within its industry and is in the manufacturing sector of the economy. The organization uses state-of-the-art manufacturing processes to produce a fairly narrow range of products. It is positioned early in the supply-chain process converting raw stock into mid-range products that are in turn converted by its customers into durable goods, that are in turn sold to retail vendors. The manufacturing process is heavily machine intensive. The maintenance of the machines is crucial for the operating efficiency of the company. Virtually all of the machines in the company use computer-aided manufacturing processes, requiring a high level of electrical knowledge among the employees responsible for maintaining the operating efficiency of the machines. Although the job titles of the employees responsible for machine maintenance differed slightly, the most accurate and representative job title would be electrical maintenance mechanic (D.O.T. Code 829.261-018; U.S.D.O.L., 1991).


Conclusion

Over the years much has been made of the "scientist-practitioner gap." At different times, various authors have contended the gap is narrowing or widening. For example, Latham (2001) asserted the gap is narrowing by the publication of scientific journal articles useful for practitioners, but Hulin (2001) believes science and practice are guided by differing sets of priorities and values. I feel it is instructive to examine the basis of the gap to better understand how it can be narrowed. One major component to the gap is the issue of implementation. For the most part, scientists are relatively unconcerned with how their theories, principles, and methods are put into practice in arenas outside of academic study. For the most part, practitioners are deeply concerned with matters of implementation because what they do occurs in arenas not created primarily for scientific study. Practitioners are compelled to interact with other organizational members who have areas of expertise and responsibility far different from our own. Furthermore, it has been my experience that human resources (or any other name for the functional area in which applied psychologists in industry are staffed) typically does not have the highest degree of power, status, and influence within the organization. Therefore, it is imperative practitioners find a way to practice their craft in a way that is true to the principles of their profession, yet, is accepted by those influential organizational members not skilled or even conversant in our area of expertise. The criterion of "organizational acceptability" is encountered far more in the world of practice than in the world of science.

In short, I believe the scientist-practitioner gap is strongly related to issues of implementation. Furthermore, issues of implementation are rarely discussed in academic journal articles. There is a linkage between implementation and organizational change, for in reality it is some form of change that is being implemented. Thus, the science of organizational change can serve to guide the practice of implementation. Scientists and practitioners could both benefit from a greater understanding of the linkage between organizational change and implementation.

Footnote

1 Although there is an element of humor in this seemingly self-evident statement, it represents somewhat of a moral victory over a previous discussion I once had regarding the base rate. In another company, a senior executive extolled at length the company's need for a new system of personnel selection because he was very critical of the current employees. I then explained to him the concept of a base rate and inquired as to his estimate of the base rate among the affected jobs to be staffed by a new method of personnel selection. His response was, "100%-otherwise they wouldn't be working here."

2 Although statistical analysis is a science, I am of the opinion that data interpretation is more of a learned art. Early in my career, I validated a test of mechanical comprehension for an organization that resulted in a criterion-related validity coefficient of .50. I was most pleased with discovering this level of predictability between the test and the criterion, and my pleasure regarding the findings was highly apparent to the client organization. It was at this point a senior company official said to me, "I fail to see the basis for your enthusiasm. A validity coefficient of .50 means we could do as good a job in selecting employees by flipping a coin that has a 50/50 chance of turning up heads." This individual had a Ph.D. (in mechanical engineering).



  1. What is the importance of Step 2 (The Literature Search) and what does it require? Briefly discuss and define each task of this step of the Literature Review Model. Literature Review Ch2

We have already completed Step 1, Selecting a Topic; this week we will focus on the second step of the Literature Review Model, Step 2: the Literature Search.

Literature Search: Steps and Requirements

Three tools will help you complete this task. These tools are (1) scanning the literature, (2) skimming potential works for content, and (3) mapping the suitable works for inclusion in the study. While these are three separate techniques, you may use them in various ways depending on your ability and your topic selection.

Think of searching the literature as assembling a well-used jigsaw puzzle. There are always parts missing, and often pieces of other puzzles have become intermixed. Developing a strategy for assembling a jigsaw puzzle is simple: Find a table with room to spread out the puzzle. Ensure that you have enough room to sort pieces and to organize them. Make sure there is good lighting. Consider what the puzzle should look like when completed by looking at the picture on the box. Spread the puzzle pieces out on the table. Look for pieces that obviously do not belong, and set them aside. Look for the puzzle pieces that make up the outer edge. Assemble them, and sort the remaining pieces by like pattern. Look for matching color patterns, and notice the specific shape of each piece. Finally, put the puzzle together one piece at a time.

Assembling a jigsaw puzzle is similar to searching the literature. Open the box, and spread the puzzle pieces on the table by consulting subject and author indices for potential texts and materials for possible review. The key terms and core ideas of the preliminary topic statement define the search. They represent the boundaries of your research puzzle. Scan the library materials, reflecting on pieces that are part of the research puzzle. Keep in mind that some puzzle pieces are not part of this jigsaw puzzle. Remove these first. Then begin collecting the pieces of this puzzle. Catalog the remaining materials found to make them available for the next stage of the search, skimming.

Skimming resembles a second sorting of the jigsaw puzzle pieces. As with the jigsaw puzzle, data gathered from the scan of the literature will be studied for usefulness. What should you include? What parts should you discard? Skim the materials collected in the scan to decide their individual appropriateness for inclusion in the study. What part of this work addresses the topic? In what way? The preliminary topic statement provides the frame for deciding what to include. After deciding what will be useful in the study, address the final task of the search, mapping.

This task of Step 2, the Literature Search, requires collecting and selecting data. It requires completing three separate activities: (1) previewing the material, (2) selecting the appropriate literature, and (3) organizing the selected material.

•• Literature Search—Collecting, cataloging, and documenting data that will determine salient works and refine the topic.

Task 1. To Review. At this point in the literature review, select the material to review and the material to be discarded. Several considerations decide what material is suitable for a particular literature search. The main consideration must be finding the information to address the key ideas contained in your preliminary topic statement. Other considerations might apply as well. For example, if the topic is time sensitive, look carefully at dates of publication. A 1940s text is probably no help if the topic title begins, "Latest Theories on ..." Perhaps, instead, your topic involves synthesizing the major works addressing a subject. If so, search for the important authors and theories about the topic, regardless of date. The topic statement provides the direction and boundaries of your search. Using the topic statement as a pathfinder, continually ask yourself the following two questions: […] in the literature review, because knowledge of any literature does not yet influence your topic understanding. The data gathered while completing a search of the literature will impact your topic knowledge. The literature selected from the search will qualify and refine the topic statement, causing it to narrow and become more concrete.

Next, examine and reflect on the impact of the search data on your topic understanding. These deliberations should help you further develop a topic statement. Be aware of the influence the literature has on the conduct of the search. Be mindful and deliberate while conducting the search. Keep these three questions in mind when reflecting on your topic:

1. What is the literature telling me about my topic?
2. How is my understanding of my research topic changing?
3. What should my topic statement be now?


  5. Briefly discuss Step 3 (Develop an Argument) and its importance for the research.

An Argument is the presentation of one or more claims backed by credible evidence that supports a logical conclusion.

This step presents the concepts necessary for building a case.

Building a case means compiling and arranging sets of facts in a logical fashion that will prove the thesis you have made about the research topic. For example, if your thesis states that participatory leadership is the most effective style for 21st century organizations, the data in your literature review must support and prove your conclusion.

Example of this step: Picture an evening in early spring, when changing weather patterns are unpredictable. You are deciding what to wear to work tomorrow. Should you dress for rain? You look at the newspaper and see that the forecast is for rain. You check the barometer and find the pressure steadily falling. You look outside and see that cloud formations are building. You check online and see that the storms are predicted for the next few days. When considering all the information gathered, you conclude there is a high likelihood for rain tomorrow. You also decide that the available data indicate the rainstorm will probably hit during your morning commute. You apply the results of this research to your question, "What do I wear to work tomorrow?" and decide to wear a raincoat and take an umbrella.

Notice that two conclusions were present in the example. The initial conclusion is, "Rain is likely." This first conclusion was derived using different sources to gather and combine information about weather conditions. The argument for this conclusion was made by analyzing information from different sources and deciding that rain was imminent. Using this conclusion, it now becomes possible to address the question of whether to dress for rain. The second conclusion is, "I should dress for rain." The argument for this conclusion was built by interpreting the first conclusion, "Rain is likely." The results and conclusions of the first argument were applied as the basis for the second. These results reasoned that rain was approaching and that carrying a raincoat and an umbrella would be the most prudent course of action.

How does the rain example apply to writing a literature review? In preparing a literature review, you must also present similarly developed arguments to make the research case. An argument is the logical presentation of evidence that leads to and justifies a conclusion.

The literature review uses two arguments to make its case. The first argument, an inductive argument called the argument of discovery, discusses and explains what is known about the subject in question. When building the argument of discovery, gather the data about the subject, analyze it, and develop findings that present the current state of knowledge about your research interest. For example, if your interest is to determine the ideal leadership style for organizations in the 21st century, then the information you have discovered must provide the evidence to argue what is known about leadership styles.

The argument of discovery serves as the foundation for the second argument, a deductive argument called the argument of advocacy. The argument of advocacy analyzes and critiques the knowledge gained from the synthesis of the data produced by the discovery argument to answer the research question. The answer to this argument is the thesis statement (initially discussed in the introductory chapter). Continuing with the leadership style example, let's say that your discovery argument produced findings that documented many leadership styles and their effective uses. Your advocacy argument must use these findings to determine which, if any, of these styles meets the needs of a 21st century organization. You conclude, based on the evidence your case presents, that the participatory leadership style is best in the specific situation named. Your conclusion—"the participatory leadership style is the best fit for a 21st century organization"—becomes your thesis statement.

The following three questions provide a handy guide for checking the validity of an argument. Ask these questions whenever you are evaluating an argument:

•• Question 1. What is the stated conclusion?
•• Question 2. What are the reasons that support the conclusion?
•• Question 3. Do the reasons argue for the conclusion? Do the reasons stated have convincing data to support them? Does the conclusion logically follow from those reasons?

Argument 3 states a conclusion in the first sentence, thus answering Question 1. The support for this conclusion is cited research. When examining each of the studies, you find that they support the conclusion drawn, thus answering Question 2. When reviewing Question 3, we find that the reasons stated are logical and convincing. All the parts of an argument are in order here, and Argument 3 is sound. Building an argument is simple. Before you arrive at a conclusion, though, be sure you can justify it.

•• Argument of Discovery—Argument proving that the findings-in-fact represent the current state of knowledge regarding the research topic.

•• Claim—A fact that is open to challenge.

•• Warrant—The reasoning used in an argument to allow the researcher, and any reader, to accept the evidence presented as reasonable proof that the position of the claim is correct.

  1. Why Use Tests? Ch 6

Tests are highly popular in the assessment of social and behavioral outcomes because they serve a very specific purpose. They yield a score that reflects performance on some variable (such as intelligence, affection, emotional involvement, and activity level), and they can fill a variety of the researcher's needs (summarized in Table 6.1).

First and foremost, tests help researchers determine the outcome of an experiment. Quite simply, tests are the measuring stick by which the effectiveness of a treatment is judged or the status of a variable such as height or voting preference in a sample is assessed. Because test results help us determine the value of an experiment, they can also be used to help us build and test hypotheses.

Second, tests can be used as diagnostic and screening tools, where they provide insight into an individual's strengths and weaknesses. For example, the Denver Developmental Screening Test (DDST) assesses young children's language, social, physical, and personal development. Although the DDST is a general screening test at best, it does provide important information about a child's developmental status and areas that might need attention.

Third, tests assist in placement. For example, children who missed the date for kindergarten entrance in their school district could take a battery of tests to determine whether they have the skills and maturity to enter public school early. High school students often take advanced placement courses and then "test out" of basic required college courses. In these two cases, test scores assist when a recommendation is made as to where someone should be placed in a program.

Fourth, tests assist in selection. Who will get into graduate school is determined, at least in part, by an applicant's score on tests such as the Graduate Record Examination (GRE) or the Miller Analogies Test (MAT). Businesses often conduct tests to screen individuals before they are hired to ensure that they have the basic skills necessary to complete training and perform competently.

Finally, tests are used to evaluate the outcomes of a program. Until you collect information that relates to the question you asked and then act on that information, you never really know whether the program you are assessing had, for example, the impact you sought.

Table 6.1 What tests do and how they do it

What Tests Do | How Tests Do It | Examples
Help researchers determine the outcome of a study | Tests are used as dependent variables | A researcher wants to know which of two training programs is more effective
Provide diagnostic and screening information | Tests are usually administered at the beginning of a program to get some idea of the participant's status | A teacher needs to know what type of reading program in which a particular child should be placed
Help in the placement process | Tests are used to place people in different settings based on specified characteristics | A mental health worker needs to place a client into a drug rehabilitation program
Assist in selection | Tests are used to distinguish between people who are admitted to certain programs | A graduate school committee uses test scores to make decisions about admitting undergraduates
Help evaluate outcomes | Tests are used to determine whether the goals of a program were met | A school superintendent uses a survey to measure whether the in-service programs had an impact on teachers' attitudes
If you are interested in evaluating the effectiveness of a training program on returning war veterans, it is unlikely that you can judge the program's efficacy without conducting some type of formal evaluation. However, whether you use a test for selection or evaluation, it is not the test score that is in and of itself important, but rather the interpretation of that score. A score of 10 on an exam wherein all the items are simple is much different than a score of 10 where everyone else in the group received scores between 3 and 5. Learning to design, create, administer, and score any test is important, but it is very important—and almost essential—to be able to know how to interpret that score.

What Tests Look Like

You may be most familiar with achievement-type tests, which often include multiple-choice items such as the following:

The cube root of 8 is
a. 2
b. 4
c. 6
d. 8

Multiple-choice questions are common items on many of the tests you will take throughout your college career. But tests can take on a variety of appearances, especially when you have to meet the needs of the people being tested and to sample the behavior you are interested in learning more about. For example, you would not expect people with a severe visual impairment to take a pencil-and-paper test requiring them to darken small, closely placed circles. Similarly, if you want to know about children's social interactions with their peers, you would probably be better served by observing them at play than by asking them about playing. With such considerations in mind, you need to decide on the form a test might take. Some of the questions that will arise in deciding how a test should appear and be administered are as follows:

• Is the test administered using paper and pencil, or is it administered some other way?
• What is the nature of the behavior being assessed (cognitive, social, physical)?
• Do people report their own behavior (self-report), or is their behavior observed?
• Is the test timed, or is there no time limit?
• Are the responses to the items subjective in nature (where the scoring is somewhat arbitrary) or objective (where there are clearly defined rules for scoring)?
• Is the test given in a group or individually?
• Are the test takers required to recognize the correct response (such as in a multiple-choice test) or to provide one (such as in a fill-in item or an open-ended question)?

Test Yourself: Why test? Provide at least two reasons and an example of each.

Types of Tests

Tests are designed for a particular purpose: to assess an outcome whose value distinguishes different individuals from one another. Because many different types of outcome might be measured, there are different types of tests to do the job. For example, if you want to know how well a group of high school seniors understood a recent physics lesson, an achievement test would be appropriate. On the other hand, if you are interested in better understanding the structure of an individual's personality, a test such as the Minnesota Multiphasic Personality Inventory or the Thematic Apperception Test, two popular yet quite different tests of personality, would be more appropriate. What follows is a discussion of some of the main types of tests you will run into in your research work, how they differ from one another, and how they can best be utilized.

Achievement Tests

Achievement tests are used to measure knowledge of a specific area. They are the most commonly used tests when learning is the outcome that is being measured.
They are also used to measure the effectiveness of the instruction that accompanied the learning. For example, school districts sometimes use students' scores on achievement tests to evaluate teacher effectiveness. The spelling test you took every Friday in fourth grade, your final exam in freshman English, and your midterm exam in chemistry all were achievement tests administered for the same reason: they were designed to evaluate how well you understood specific information. Achievement tests come in all flavors, from the common multiple-choice test to true–false and essay examinations. All have their strengths and weaknesses. Achievement tests are used to assess expertise in a content area.

There are basically two types of achievement tests: standardized tests and researcher-generated tests. Standardized tests, usually produced by commercial publishers, have broad application across a variety of different settings. What distinguishes a standardized test from others is that it comes with a standard set of instructions and scoring procedures. For example, the Kansas Minimum Competency Test is a standardized test that has been administered to more than 2 million children across the state of Kansas in rural and urban settings, from very different social classes, school sizes, and backgrounds. Another example is the California Achievement Test (CAT), a nationally standardized test of achievement in the areas of reading, language, and arithmetic. Researcher- or teacher-made tests, on the other hand, are designed for a much more specific purpose and are limited in their application to a much smaller number of people. For example, the test that you might take in this course would most likely be researcher or teacher made and designed specifically for the content of this course. Another example would be a test designed by a researcher to determine whether the use of teaching machines versus traditional teaching makes a difference in the learning of a foreign language.

Achievement tests can also be broken down into two other categories. Both standardized and researcher-made tests can be norm-referenced or criterion-referenced tests. Norm-referenced tests allow you to compare an individual's test performance to the test performance of other individuals. For example, if an 8-year-old student receives a score of 56 on a mathematics test, you can use the norms that are supplied with the test to determine that child's placement relative to other 8-year-olds. Standardized tests are usually accompanied by norms, but this is usually not the case for teacher-made tests; nor is the existence of norms a necessary condition for a test to be considered standardized. Remember, a test is standardized only if it has a standard or common set of administration and scoring procedures.

Criterion-referenced tests (a term coined by psychologist Robert Glaser in 1963) define a specific criterion or level of performance, and the only thing of importance is the individual's performance, regardless of where that performance might stand in comparison with others. In this case, performance is defined as a function of mastery of some content domain. For example, if you were to specify a set of objectives for 12th-grade history and specify that students must show command of 90% of those objectives to pass, then you would be implying that the criterion is 90% mastery. Because this type of test actually focuses on the mastery of content at a specific level, it is also referred to as content-referenced testing.
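As a toy numerical illustration of the two frames of reference just described (not an example from Salkind), the same raw score can be read against a norm group or against a fixed mastery criterion. The norm-group scores and the 12-item test below are made up for the sketch.

```python
# Hypothetical data: 10 norm-group scores and one examinee's raw score on a 12-item test.
norm_group = [3, 4, 5, 4, 3, 5, 4, 3, 5, 4]
raw_score = 10
total_items = 12

# Norm-referenced interpretation: where does the score stand relative to other test takers?
percentile_rank = 100 * sum(score < raw_score for score in norm_group) / len(norm_group)

# Criterion-referenced interpretation: has the examinee mastered the content domain?
criterion = 0.90                                   # e.g., command of 90% of the objectives
mastered = (raw_score / total_items) >= criterion

print(f"Percentile rank in norm group: {percentile_rank:.0f}")  # 100 (above every norm-group score)
print(f"Meets 90% mastery criterion: {mastered}")               # False (10/12 is about 83%)
```

The same score of 10 looks strong in norm-referenced terms yet falls short of the 90% criterion, which is exactly the distinction the passage draws between the two kinds of interpretation.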
When should you use which test? First, you must make this decision before you begin designing a test or searching for one to use in your research. The basic question you want to answer is whether you are interested in knowing how well an individual performs relative to others (for which norms are needed to make the comparison) or how well the individual has mastered a particular area of content (for which the mastery is reflected in the criterion you use). Second, any achievement test, regardless of its content, can fall into one of the four cells shown in Table 6.2, which illustrates the two dimensions just described: Does the test compare results with those of other individuals or to some criterion, and who designed or authored the test?

Multiple-Choice Achievement Items

Remember those endless hours filling in bubbles on optical-scanner scoring sheets or circling the A's, B's, C's, and D's, guessing which answer might be correct or not, and being told not to guess if you have no idea what the correct answer is? All these experiences are part of the multiple-choice question test, by far the most widely used type of question on achievement tests, and it is a type of test that deserves special attention.

7. Achievement tests are ubiquitous in our society. Why do you think that's the case? Provide a reference. Ch 6 test yourself





