Bachman's Measurement Qualities of a Good Test
A test's usefulness, according to Bachman and Palmer (1996), can be determined by considering the following measurement qualities of the test: reliability, construct validity, authenticity, interactiveness, impact, and practicality. Together, these qualities describe how useful a language test is.
Test Reliability:
The term reliability, according to Bachman and Palmer (1996), refers to consistency of measurement. They elaborate that a reliable test score is consistent across different characteristics of the testing situation. If test scores are inconsistent, they provide no information about the ability being measured. Because it is impossible to eliminate inconsistencies entirely, we try to reduce variations in the test's task features that do not correspond to variations in Target Language Use (TLU) tasks.
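As an illustration of consistency of measurement, the sketch below correlates the scores of the same test takers on two sittings of a test. This test-retest correlation is only one common way to estimate reliability and is not a procedure prescribed by Bachman and Palmer (1996); the function name `pearson_r` and all scores are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2 scores)")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_a * var_b)


# Hypothetical scores for five test takers on two sittings of the same test.
first_sitting = [12, 15, 9, 18, 14]
second_sitting = [13, 14, 10, 17, 15]

print(round(pearson_r(first_sitting, second_sitting), 2))
```

A value close to 1 suggests that the two sittings rank the test takers in much the same way, while a value near 0 would suggest that the scores say little about the ability being measured.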
Construct Validity:
A test's reliability and validity are closely related. Any valid test is considered a reliable test; however, not all reliable tests can be considered valid (Alderson, 2000). According to Alderson (2000), “the term construct validity is used to refer to the general, overarching notion of validity”. Therefore, the discussion of this test's validity focuses mainly on construct validity, together with some issues regarding its content validity.
According to Bachman and Palmer (1996), the term construct validity refers to the extent to which a given test score can be interpreted as an indicator of the abilities or constructs that we want to measure. However, no test is entirely valid because validation is an ongoing process (Weir, 2005).
Authenticity:
Bachman (1991) defined authenticity as the appropriateness of a language user’s response to language as communication. However, this definition was too general, so Bachman and Palmer (1996) divided the idea into two parts. The first, which they refer to as authenticity, relates to target language use; the second, which they refer to as interactiveness, is defined in relation to the learners involved in the test. Below is a detailed explanation of authenticity and its implications for the current test.
Bachman and Palmer (1996) defined authenticity as the degree to which the characteristics of a given language test task correspond to the features of a TLU task. Authenticity relates the test task to the domain to which we want our score interpretations to generalize. It potentially affects test takers' perceptions of the test and their performance (Bachman, 2000).
Interactiveness:
Interactiveness, according to Bachman and Palmer (1996), is “the extent and type of involvement of the test taker’s individual characteristics in accomplishing a test task” (p. 25). Does the test motivate students? Is the language used in the test's questions and instructions appropriate for the students' level? Do the test's items represent the language used in the classroom, as well as the target language? These questions point to the crucial elements that affect a test's interactiveness. Many recent views consider this notion the core of language teaching and learning.
Impact:
According to Bachman and Palmer (1996), impact can be defined broadly in terms of the various ways a test's use affects society, an educational system, and the individuals within them. In general terms, a test's impact operates at two levels: the macro level of society and the educational system, and the micro level of individuals, i.e., test takers. According to the test's developer (Appendix 1), society, the educational system, and individuals are all strongly affected by this test.
Practicality:
“Practicality is the relationship between the resources that will be required in design, development, and use of the test and the resources that will be available for these activities” (Bachman and Palmer, 1996, p. 36). They explain that this quality is unlike the others because it focuses on how the test is conducted. Moreover, Bachman and Palmer (1996) classified these resources into three types: human resources, material resources, and time.
Based on this definition, practicality can be judged by whether the resources required to develop and conduct the test are available. Our judgment of a language test is therefore whether it is practical or impractical.
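The practical-or-impractical judgment can be made concrete by comparing available with required resources for each of the three resource types. The sketch below treats a ratio of at least 1 (available divided by required) as practical, a framing usually attributed to Bachman and Palmer (1996); applying it separately to each resource type, the function names, the units, and all figures are assumptions made for illustration.

```python
def practicality_ratios(available, required):
    """Return available/required for each required resource type."""
    ratios = {}
    for resource, needed in required.items():
        if needed <= 0:
            raise ValueError(f"required amount for {resource!r} must be positive")
        # A resource missing from `available` is treated as zero.
        ratios[resource] = available.get(resource, 0) / needed
    return ratios


def is_practical(available, required):
    """Treat the test as practical only if no resource type falls short."""
    ratios = practicality_ratios(available, required)
    return all(ratio >= 1 for ratio in ratios.values())


# Hypothetical figures: rater hours, computers, and days until the test date.
required = {"human": 40, "material": 10, "time": 14}
available = {"human": 50, "material": 10, "time": 7}

print(practicality_ratios(available, required))
print(is_practical(available, required))
```

With these figures the test would be judged impractical, because only half of the required time is available even though human and material resources are sufficient.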
References:
Alderson, J.C. (2000). Assessing reading. Cambridge: Cambridge University Press.
Bachman, L.F. (1991). What does language testing have to offer? TESOL Quarterly 25(4), pp. 671–704.
Bachman, L.F. (2000). Modern language testing at the turn of the century: Assuring that what we count counts. Language Testing 17(1), pp. 1–42.
Bachman, L.F. & Palmer, A.S. (1996). Language testing in practice. Oxford: Oxford University Press.
Hughes, A. (2003). Testing for language teachers. Cambridge: Cambridge University Press.
Read, J. (2000). Assessing vocabulary. Cambridge: Cambridge University Press.
Weir, C.J. (2005). Language testing and validation. Basingstoke: Palgrave.