Assessment Standards and Methods for the New TOEFL Speaking Test (21英语网)

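　　Most of the language content in the new TOEFL speaking test is drawn from real situations on university campuses. The test no longer relies on a single question type. Instead, it uses integrated tasks that combine speaking with listening and speaking with reading, in order to assess students' logical thinking and expressive ability comprehensively.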

　　The most fundamental task in the development of any assessment, including oral language proficiency assessments, is determining and validating the uses and interpretations of test scores. An assessment is valid, according to the widely used Standards for Educational and Psychological Testing (published jointly by the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education), to the degree to which "evidence and theory support the interpretations of test scores entailed by [its] proposed uses."

　　In the case of TOEFL iBT and other tests of English language proficiency, test score use often comes down, in practical terms, to establishing cut scores. These are scores that are interpreted as indicating the minimum level of English language proficiency an international student must have in order to successfully undertake first-year studies in an English-speaking context. Establishing and documenting the validity of these interpretations is not a simple task. What constitutes necessary proficiency can differ according to the varying developmental resources that institutions have at their disposal and the varying demands that universities make on their international students.

　　Additionally, traditional predictive validity studies, which typically compare test performance at various levels with measurable external variables such as first-year grades, are difficult to carry out in the context of language proficiency assessments, since other factors (subject-specific academic preparation, for example) can weigh incalculably more than English language proficiency in determining whether a student will be successful in the first year of study at an English-speaking institution.

　　This presentation will describe and demonstrate aspects of the process of standard setting and explain its utility in the context of oral language proficiency assessment. Standard setting is the process of establishing a cut score. It is a judgmental process that relies on human judgment; it is not a statistical process. It rests on the concept of a minimally acceptable student.

　　In standard setting for TOEFL iBT, test developers work with panels of experts from various departments at individual score-using institutions to help make explicit their often-implicit requirements for English language oral proficiency. Test developers also help the panels link these requirements to speaking test scores in an informed and meaningful way. Through the standard-setting process, institutions with different requirements and varying resources can help ensure the validity of their speaking assessment scores and comply with accepted standards for educational assessment.

　　The new speaking test has several tasks. Candidates are given some time to prepare and then respond to each task. The first two tasks are on topics that candidates are usually familiar with, so they work like a warm-up. Scores on these tasks are generally high, which helps students feel comfortable. The other tasks are integrated. One type combines listening and speaking: students listen to something and then speak. Another is a reading-listening task: candidates first read a short passage, then listen, prepare, and speak.

　　The qualities of strong speakers and weak speakers are also discussed. The qualities of a strong speaker are varied intonation, fluent speech, a neutral accent, and a rich vocabulary. A strong speaker uses these in a structured way and knows when to use formal speech and when to use informal speech. A weak speaker speaks in a monotone, has an accent so strong that it makes him or her difficult to understand, or has a poor vocabulary. These are the two extremes, with strong speakers at one end and weak speakers at the other.

　　---------------------------------

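　　Prof. Arthur Denner is an assessment specialist at Educational Testing Service (ETS) in the United States. Research areas: language testing, including GRE, GMAT, TOEFL, and TOEFL iBT. He is currently mainly responsible for the listening and speaking sections of TOEFL iBT.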