TESOL: Development Process for Language Assessment

The development process for a standards-based language assessment instrument, such as a language test, begins with a conceptual basis. There is much to consider even before reaching the design stage. There are also common misconceptions about the creation and use of language tests, as well as unrealistic expectations, that hold back people who need to create and use language tests in their professional work. One common misconception is the mystique surrounding language testing: the belief that language testers have some “almost magical procedures and formulae” for creating “the best” test (Bachman 3). Many people assume that a perfect language test exists and wish to know how to develop one for their own testing needs, but there is no such thing as the one best test, even for a specific situation.

Bachman and Palmer believe that there is a “need for a correspondence between language test performance and language use: in order for a particular language test to be useful for its intended purpose, test performances must correspond in demonstrable ways to language use in non-test situations.” This essay aims to report my findings from the text Language Testing in Practice and, in doing so, to remove any mystique associated with the development process for language assessment. It also describes a model for test usefulness that, according to Bachman and Palmer, includes six qualities: reliability, construct validity, authenticity, interactiveness, impact, and practicality.

Reliability

Reliability is frequently defined as “consistency of measurement.” A reliable test score should be consistent across different testing conditions. For example, if the same test were given to the same group of people on two different days, in two different environments, it should make no difference to a particular individual whether he or she takes the test on one day or the other. Likewise, if two forms of a test are equivalent and meant to be used interchangeably, it should be of no consequence which form the individual takes; he or she should receive the same score on either one.
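To make “consistency of measurement” concrete, here is a minimal sketch of one way to check it: correlating the scores the same people earn on two occasions. This is an illustration only. The essay's source does not prescribe this calculation here, the helper name pearson_r is hypothetical, and the scores are invented for the example. A value close to 1 would suggest that the two administrations rank test takers in nearly the same way.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five test takers on two different days.
day_one = [72, 85, 90, 64, 78]
day_two = [70, 88, 91, 66, 75]

reliability_estimate = pearson_r(day_one, day_two)
print(round(reliability_estimate, 2))
```

The same function could be applied to scores from two equivalent forms of a test instead of two days. Note that a high correlation only shows that the relative ordering of test takers is stable; it does not by itself show that each person received an identical score.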

Construct Validity

Bachman and Palmer define construct as “the specific definition of an ability that provides the basis for a given test or test task.” The term construct validity, then, refers to the degree to which a given test score can be interpreted as an indicator of the ability or abilities the test is meant to measure. For example, if a placement test were needed for an academic writing course, a multiple-choice test of grammar might give reliable scores. Yet it may not be good enough to use as a placement test for a writing course, because grammar is only one part of academic writing ability. Defining the “construct” here as grammatical knowledge alone is very narrow, considering that the intended language use or “domain” involves “metacognitive strategies, topical knowledge and affective responses as well” (Bachman 23).

Authenticity

Essentially, the language assessment tasks should correspond with the “target language use.” For instance, suppose a job requires you to know the vocabulary for the items you would be selling, and you are given a test containing a written passage that describes that merchandise. In this passage, key vocabulary terms have been deleted and you need to fill in the blanks. The topical content of the test is relevant, but the task of filling in missing words might be irrelevant or inauthentic, since it is not something you would do on the job.

Interactiveness

Interactiveness is defined as the use of the “test taker’s individual characteristics in accomplishing a test task. The individual characteristics that are most relevant for language testing are the test taker’s language ability, topical knowledge, and affective schemata” (Bachman 25). Topical knowledge can also be referred to as “real-world knowledge,” and affective schemata “provide the basis on which language users assess the characteristics of the language use task and its setting in terms of past emotional experiences in similar contexts” (Bachman 65). The interactiveness of a specific test task can therefore be described in terms of the ways in which the task engages the test taker’s language knowledge, topical knowledge, and affective schemata.

Impact

The impact of assessment operates at two levels: “a micro level, in terms of the individuals, and a macro level, in terms of the educational system or society.” Bachman (1990) also points out that “tests are not developed and used in a value-free psychometric test-tube; they are virtually always intended to serve the needs of an educational system or of society at large.” Washback, the effect of testing on teaching and learning, is one aspect of impact, and an assessment can have a positive or a negative effect on students and teachers.

Practicality

To judge whether a test is practical, one must consider the resources needed to develop a useful assessment and whether those requirements fit within the resources available. Practicality means meeting the demands of a particular test within the limits of existing resources, such as human resources, material resources, and time.

Bachman, L., & Palmer, A. S. (1996). Language Testing in Practice. Hong Kong: Oxford University Press.

Saturday, May 7, 2016
