Yet researchers have not agreed on the components of listening. Various studies have attempted to provide empirical support (Buck & Tatsuoka, 1998; Kostin, 2004) and theoretical taxonomies (Mendelsohn, 1994) for the sub-skills of listening processes. None of these taxonomies can be considered an exhaustive and comprehensive description of the listening process (Buck, 2001).

Literature Review

The literature reviewed in this paper concentrates on studies relevant to testing listening. Powers (1985) aimed to validate the use of TOEFL listening tests and investigated the listening activities important to academic success across disciplines. Powers surveyed faculty members, students, and admission officers at universities about the importance of various listening activities to academic success and the problems these activities pose for native and nonnative speakers. The results may be used to check the validity of test score uses. The following activities were rated as very important in academic contexts:

Identifying major themes or ideas
Identifying relationships among major ideas
Identifying the topic of a lecture
Retaining information through note-taking
Retrieving information from notes
Inferring relationships between information
Comprehending key vocabulary
Following the spoken mode of lecture
Identifying supporting ideas and examples

Several studies investigated the factors affecting item difficulty of the TOEFL listening comprehension test. Nissan et al. (1995) was the first study to investigate the stimulus-related and item-related features that contribute to item difficulty. They analyzed TOEFL dialogue items, taking Equated Delta, an item difficulty index from classical test theory, as the measure of item difficulty to be predicted. Seventeen independent features common in dialogue items were selected as variables.
Using 283 TOEFL dialogue items, the study found five variables that had a significant impact on Delta: word frequency, utterance pattern, negatives in the stimulus, explicit versus implicit information, and the role of the speakers. Infrequent vocabulary was defined as any word not on Berger's (1977) list. For utterance pattern, items in which the second utterance was a statement were significantly more difficult than those that ended in a question. More than one negative in the stimulus significantly increased the mean Delta value. Items that required test takers to identify implied information tended to be more difficult than those that required understanding of explicit information. When the speaker was not a casual acquaintance or a classmate, the item was significantly more difficult. The study also found that combinations of these variables had a stronger impact on the item difficulty index than any individual variable; the combination of three variables (word frequency, utterance pattern, and inference) had the greatest impact on Delta. This study was valuable in identifying significant features of listening tests that could predict item difficulty. However, the features were selected based on the linguistic characteristics of texts and items, without theoretical considerations. Thus, the findings may generalize only to the effect of textual characteristics on item difficulty. Several subsequent studies followed this research framework.
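
As a rough illustration of the kind of difficulty index discussed above, the sketch below computes an unequated delta from the proportion of test takers who answer an item correctly. It assumes the conventional ETS definition of delta: a normal-deviate transformation scaled to a mean of 13 and a standard deviation of 4, with larger values indicating harder items. The function name delta_index and the example proportions are hypothetical. The equating step that places deltas from different test forms on a common scale (yielding Equated Delta) is not shown, and nothing here reproduces the analysis of Nissan et al. (1995).

```python
from statistics import NormalDist


def delta_index(p_correct: float) -> float:
    """Return an unequated delta for an item (assumed ETS convention).

    p_correct is the proportion of test takers answering correctly.
    Delta = 13 - 4 * z, where z is the standard normal quantile of
    p_correct, so harder items (lower p_correct) get larger deltas.
    """
    if not 0.0 < p_correct < 1.0:
        raise ValueError("p_correct must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(p_correct)
    return 13.0 - 4.0 * z


if __name__ == "__main__":
    # Hypothetical proportions correct, for illustration only.
    for p in (0.9, 0.5, 0.2):
        print(f"p = {p:.1f} -> delta = {delta_index(p):.2f}")
```

Under this convention, an item answered correctly by half of the test takers has a delta of 13, and a finding such as "more than one negative increased the mean Delta" means that the affected items were, on average, answered correctly by fewer test takers.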

