Questionnaire Development

To create psychological measuring tools, test developers initially write as many as twice the number of items that will appear in the final draft of their questionnaires.

Writing and Evaluating Test Items
PSY3700 Multimedia Assessment and Psychometrics
©2016 South University

Creating the Questionnaire

Last week, you were introduced to the topics of reliability and validity and their relationship to test items and testing. In writing valid test items, test developers strive to make sure that their items align with the attitude, behavior, ability, or trait under investigation. Valid items measure what they purport to measure and are not disconnected from the objectives of the test (Sue & Ritter, 2013). Once these items have been written, the preliminary draft of the survey or questionnaire is piloted on a representative sample of the population for which the measure is intended. Once the data from the pilot survey are collected, the overall performance of the test takers on the measure and on each of the items is analyzed. As we have learned, items are evaluated using statistical procedures known collectively as item analysis, which may include analysis of item reliability, item validity, item discrimination, and item difficulty level (Cohen & Swerdlik, 2002).

Writing test items can be a taxing process because it requires both creativity and persistence to address each of the criteria necessary for constructing a valid item or question. One of the first concerns of a test creator is the range of difficulty of the items (Gregory, 2013). For tests of ability, achievement, and aptitude, this concern can be addressed by creating items that discriminate among test takers and become progressively more difficult as the test proceeds. Alternatively, mixing less and more difficult items together may be the strategy of choice if one wants to calculate a split-half reliability index for the test. Another area of concern in the construction of test items is the homogeneity or heterogeneity of item content (Gregory, 2013).
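Two of the item-analysis statistics named above can be sketched in a few lines of code. The Python snippet below is an illustration, not a procedure from the sources cited here: the pilot data are hypothetical, and it computes item difficulty as the proportion of test takers answering an item correctly and item discrimination as the corrected item-total correlation (the item's scores correlated with total scores computed without that item).

```python
# Illustrative item analysis for a pilot test.
# Rows = test takers, columns = items; 1 = correct, 0 = incorrect.

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

def item_difficulty(responses, item):
    """Difficulty (p value): proportion answering the item correctly."""
    return sum(row[item] for row in responses) / len(responses)

def item_discrimination(responses, item):
    """Corrected item-total correlation: the item's scores correlated
    with total scores computed without that item."""
    scores = [row[item] for row in responses]
    rest = [sum(row) - row[item] for row in responses]
    return pearson(scores, rest)

# Hypothetical pilot data: six test takers, four items.
pilot = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
]

for i in range(len(pilot[0])):
    print(f"Item {i + 1}: p = {item_difficulty(pilot, i):.2f}, "
          f"r = {item_discrimination(pilot, i):.2f}")
```

In this convention, difficulty values near 1.0 mark easy items and values near 0.0 mark hard ones, while a low or negative discrimination suggests the item does not separate high scorers from low scorers.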
In some cases, items may tap multifaceted constructs and, as such, may have multiple layers. Alternatively, groups of questions may combine to assess multilayered concepts (Sue & Ritter, 2013). As indicated earlier, it is not uncommon for a test constructor to initially write twice as many questions or items as will eventually be used in the first draft of the questionnaire. Ideally, this pool of questions contains valid items, the content of which effectively measures the domain under investigation. If this assumption is violated, items may need to be revised or eliminated (Gregory, 2013). Another concern is the types of responses required of test takers and the meaning that will be attributed to test scores (Cohen & Swerdlik, 2002). Items should clearly measure what they purport to measure, be undergirded by theory, and be as specific as possible (Kaplan & Saccuzzo, 2013).

Exceptionally long items that are confusing or misleading should be avoided, as should items loaded with jargon (Sue & Ritter, 2013). The art and science of item writing develop over time and with practice. For this reason, test developers should be prepared to commit long hours and a great deal of sweat to the design of effective test items.

References

Cohen, R., & Swerdlik, M. (2002). Psychological testing and assessment: An introduction to tests and measurement (5th ed.). Boston, MA: McGraw-Hill.

Gregory, R. (2013). Psychological testing: History, principles, and applications (7th ed.). Boston, MA: Pearson.

Kaplan, R., & Saccuzzo, D. (2013). Psychological testing: Principles, applications, & issues (8th ed.). Belmont, CA: Wadsworth.

Sue, M., & Ritter, L. (2013). Conducting online surveys (2nd ed.). Los Angeles, CA: Sage.