Bürger, Sarah; Kröhne, Ulf; Goldhammer, Frank:

The transition to computer-based testing in large-scale assessments
Investigating (partial) measurement invariance between modes

In: Psychological Test and Assessment Modeling, 4 (2016) 58, 597-616

Equivalence, Computer-assisted procedure, Experiment, Item response theory, Measurement, Student achievement test, Technology-based testing, Test procedure, Effect

This paper provides an overview and recommendations on how to conduct a mode effect study in large-scale assessments by addressing criteria of equivalence between paper-based and computer-based tests. These criteria are selected according to the intended use of test scores and test score interpretations. A mode effect study can be implemented using experimental designs. The major benefit of combining experimental design considerations with IRT methodology for analyzing mode effects is the possibility to investigate partial measurement invariance. Establishing partial measurement invariance allows test scores from different modes to be used interchangeably, and allows means of latent variables, mean differences, and correlations to be compared at the population level even if some items differ in difficulty between modes. For this purpose, a multiple-group IRT model approach for analyzing mode effects at the test and item levels is presented. Instances where partial measurement invariance suffices to combine item parameters into one metric are reviewed in this paper. Furthermore, relevant study design requirements and potential sources of mode effects are discussed. Finally, an extension of the modelling approach to explain mode effects by means of item properties such as response format is presented. (DIPF/Orig.)
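
The idea of an item-level mode effect under partial measurement invariance can be illustrated with a minimal sketch. It assumes a Rasch-type (1PL) parametrization, which the abstract does not specify; the function names, the `delta` shift parameter, and all item values are hypothetical and chosen only for illustration. An item with `delta == 0` has the same difficulty in both modes and can serve as an anchor for a common metric, while an item with a nonzero `delta` differs in difficulty between modes.

```python
import math


def p_correct(theta, b, delta=0.0):
    """Rasch-type probability of a correct response (assumed model).

    theta: person ability
    b:     item difficulty in the paper-based mode
    delta: hypothetical item-specific difficulty shift in the
           computer-based mode (0.0 means the item is invariant)
    """
    return 1.0 / (1.0 + math.exp(-(theta - (b + delta))))


# Hypothetical item bank: name -> (paper-mode difficulty, mode shift).
# The numbers are illustrative only, not results from the paper.
items = {
    "item1": (-0.5, 0.0),
    "item2": (0.0, 0.0),
    "item3": (0.8, 0.4),  # harder on computer in this made-up example
}


def invariant_items(item_bank, tol=1e-9):
    """Return the names of items whose mode shift is (numerically) zero."""
    return sorted(
        name for name, (_, delta) in item_bank.items() if abs(delta) <= tol
    )


# Items that could link the two modes onto one metric in this sketch.
anchors = invariant_items(items)
```

In an actual study the shifts would be estimated, for example with a multiple-group IRT model in which parameters of invariant items are constrained to be equal across modes; the sketch only shows what the constraint means for response probabilities.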

last modified Nov 11, 2016