-
-
Author(s): Deribo, Tobias; Goldhammer, Frank; Kröhne, Ulf
Title: Changes in the speed-ability relation through different treatments of rapid guessing
In: Educational and Psychological Measurement, 83 (2023) 3, S. 473-494
DOI: 10.1177/00131644221109490
URL: https://journals.sagepub.com/doi/10.1177/00131644221109490
Publication Type: 3a. Beiträge in begutachteten Zeitschriften; Aufsatz (keine besondere Kategorie)
Language: Englisch
Keywords: Antwort; Deutschland; Empirische Untersuchung; Fertigkeit; Informations- und Kommunikationstechnologie; Item-Response-Theory; Leistungstest; Modell; Panel; Psychometrie; Reliabilität; Student; Test; Validität; Verhalten; Zeit
Abstract (english): As researchers in the social sciences, we are often interested in studying constructs that are not directly observable, using assessments and questionnaires. But even in a well-designed and well-implemented study, rapid-guessing behavior may occur: a task is briefly skimmed rather than read and engaged with in depth. A response given under rapid-guessing behavior can therefore bias the constructs and relations of interest. Such bias is also plausible for latent speed estimates obtained under rapid-guessing behavior, and hence for the identified relation between speed and ability. This is especially problematic because the speed-ability relation has been shown to improve precision in ability estimation. For this reason, we investigate whether and how responses and response times obtained under rapid-guessing behavior affect the identified speed-ability relation and the precision of ability estimates in a joint model of speed and ability. The study presents an empirical application that highlights this specific methodological problem. We show that different (non-)treatments of rapid guessing can lead to different conclusions about the underlying speed-ability relation, and that different treatments led to markedly different conclusions about the gains in precision from joint modeling. The results underline the importance of taking rapid guessing into account whenever the psychometric use of response times is of interest. (DIPF/Orig.)
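One common formalization of such a joint model of speed and ability is van der Linden's hierarchical framework, in which an IRT model for the responses and a lognormal model for the response times are linked through a bivariate person distribution; the speed-ability relation is the correlation in that distribution. The equations below are a standard illustrative parameterization, not necessarily the exact specification used in the article:

    P(X_{ij} = 1 \mid \theta_i) = \operatorname{logit}^{-1}\{ a_j (\theta_i - b_j) \}
    \ln T_{ij} \mid \tau_i \sim N(\beta_j - \tau_i, \ \alpha_j^{-2})
    (\theta_i, \tau_i)^{\top} \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma}), \qquad \rho_{\theta\tau} = \Sigma_{\theta\tau} / \sqrt{\Sigma_{\theta\theta} \Sigma_{\tau\tau}}

Responses and response times produced under rapid guessing violate the assumptions of both measurement models, so their (non-)treatment can shift the estimated correlation \rho_{\theta\tau} and, with it, the apparent precision gain from borrowing information across responses and response times.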
DIPF-Departments: Lehr und Lernqualität in Bildungseinrichtungen
-
-
Author(s): Zesch, Torsten; Horbach, Andrea; Zehner, Fabian
Title: To score or not to score. Factors influencing performance and feasibility of automatic content scoring of text responses
In: Educational Measurement: Issues and Practice, 42 (2023) 1, S. 44-58
DOI: 10.1111/emip.12544
URL: https://onlinelibrary.wiley.com/doi/10.1111/emip.12544
Publication Type: 3a. Beiträge in begutachteten Zeitschriften; Beitrag in Sonderheft
Language: Englisch
Keywords: Antwort; Automatisierung; Bewertung; Einflussfaktor; Inhalt; Leistung; Text; Tool; Verfahren
Abstract (english): In this article, we systematize the factors influencing performance and feasibility of automatic content scoring methods for short text responses. We argue that performance (i.e., how well an automatic system agrees with human judgments) mainly depends on the linguistic variance seen in the responses and that this variance is indirectly influenced by other factors such as target population or input modality. Extending previous work, we distinguish conceptual, realization, and nonconformity variance, which are differentially impacted by the various factors. While conceptual variance relates to different concepts embedded in the text responses, realization variance refers to their diverse manifestation through natural language. Nonconformity variance is added by aberrant response behavior. Furthermore, besides its performance, the feasibility of using an automatic scoring system depends on external factors, such as ethical or computational constraints, which influence whether a system with a given performance is accepted by stakeholders. Our work provides (i) a framework for assessment practitioners to decide a priori whether automatic content scoring can be successfully applied in a given setup as well as (ii) new empirical findings and the integration of empirical findings from the literature on factors that influence automatic systems' performance. (DIPF/Orig.)
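Performance in the sense used here, agreement between automatic and human scores, is commonly quantified with a chance-corrected statistic such as quadratically weighted kappa. The article does not prescribe a particular metric; the following is a minimal, self-contained sketch of that common choice:

    import numpy as np

    def quadratic_weighted_kappa(human, machine, n_labels):
        """Chance-corrected agreement between two integer score vectors (labels 0..n_labels-1)."""
        human, machine = np.asarray(human), np.asarray(machine)
        observed = np.zeros((n_labels, n_labels))
        for h, m in zip(human, machine):
            observed[h, m] += 1
        observed /= observed.sum()
        expected = np.outer(np.bincount(human, minlength=n_labels),
                            np.bincount(machine, minlength=n_labels)) / len(human) ** 2
        i, j = np.indices((n_labels, n_labels))
        weights = (i - j) ** 2 / (n_labels - 1) ** 2   # quadratic disagreement weights
        return 1 - (weights * observed).sum() / (weights * expected).sum()

    # Example: human vs. automatic scores on a 0-2 point scale
    print(quadratic_weighted_kappa([0, 1, 2, 2, 1], [0, 1, 2, 1, 1], n_labels=3))  # 0.8

A value of 1 indicates perfect agreement and 0 indicates agreement at chance level.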
DIPF-Departments: Lehr und Lernqualität in Bildungseinrichtungen
-
-
Author(s): Persic-Beck, Lothar; Goldhammer, Frank; Kroehne, Ulf
Title: Disengaged response behavior when the response button is blocked. Evaluation of a micro-intervention
In: Frontiers in Psychology. Section Quantitative Psychology and Measurement, 13 (2022), S. 954532
DOI: 10.3389/fpsyg.2022.954532
URL: https://www.frontiersin.org/articles/10.3389/fpsyg.2022.954532/full
Publication Type: 3a. Beiträge in begutachteten Zeitschriften; Aufsatz (keine besondere Kategorie)
Language: Englisch
Keywords: Antwort; Datenanalyse; Dauer; Effektivität; Einflussfaktor; Erwachsener; Evaluation; Frage; Intervention; Kompetenz; Leistungstest; Logdatei; Messung; Motivation; Technologiebasiertes Testen; Testkonstruktion; Validität; Verhalten; Verhaltensänderung
Abstract (english): In large-scale assessments, disengaged participants might rapidly guess on items or skip them, which can threaten the validity of score interpretations. This study analyzes data from a linear computer-based assessment to evaluate a micro-intervention that blocked responding for 2 s. The blocked response was implemented to prevent accidental navigation and as a naive attempt to prevent rapid guesses and rapid omissions. The response process was analyzed by interpreting log event sequences within a finite-state machine approach, and responses were assigned to different response classes based on the event sequence. Additionally, post hoc methods for detecting rapid responses based on response-time thresholds were applied to validate the classification. Rapid guesses and rapid omissions could be distinguished from accidental clicks by the log events following the micro-intervention. Results showed that the blocked response interfered with rapid responses but hardly led to behavioral changes. However, the blocked response could improve the post hoc detection of rapid responding by identifying responses that narrowly exceed response-time thresholds. In an assessment context, it is desirable to keep participants from accidentally skipping items, which may make initially blocking responses increasingly popular. If, however, data from such assessments are analyzed for rapid responses, additional log data information should be considered.
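The finite-state machine approach mentioned above can be illustrated with a deliberately simplified sketch: the log events recorded for one item are replayed through a small state machine, and the state at navigation determines the response class. Event names, the timing of the blocking window, and the class labels below are illustrative assumptions, not the authors' actual implementation:

    # Illustrative sketch (hypothetical event names and response classes).
    BLOCK_WINDOW = 2.0  # assumed: response button blocked for the first 2 s of the item

    def classify_item(events):
        """events: list of (seconds_since_item_start, event_name) tuples for one item."""
        state = "viewing"
        for t, name in events:
            if name == "response_click":
                # Clicks inside the blocking window are rejected by the micro-intervention.
                state = "blocked_click" if t < BLOCK_WINDOW else "answered"
            elif name == "navigate_next":
                if state == "answered":
                    return "response_given"
                if state == "blocked_click":
                    return "rapid_attempt_then_omitted"
                return "omitted"
        return state

    print(classify_item([(0.8, "response_click"), (1.1, "navigate_next")]))
    # -> 'rapid_attempt_then_omitted'

Log-based classes like these can capture rapid attempts that a pure response-time threshold would miss when participants wait out the blocking window before moving on.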
DIPF-Departments: Lehr und Lernqualität in Bildungseinrichtungen
-
-
Author(s): Gombert, Sebastian
Title: Methods and perspectives for the automated analytic assessment of free-text responses in formative scenarios
In: Jivet, Ioana; Di Mitri, Daniele; Schneider, Jan; Papamitsiou, Zacharoula; Fominykh, Mikhail (Hrsg.): Proceedings of the Doctoral Consortium of the 17th European Conference on Technology Enhanced Learning co-located with the 17th European Conference on Technology Enhanced Learning (EC-TEL 2022), Toulouse, France, September 12, 2022, Aachen: RWTH, 2022 (CEUR Workshop Proceedings, 3292), S. 61-65
URL: https://ceur-ws.org/Vol-3292/DCECTEL2022_paper08.pdf
Publication Type: 4. Beiträge in Sammelbänden; Tagungsband/Konferenzbeitrag/Proceedings
Language: Englisch
Keywords: Antwort; Aufsatz; Automatisierung; Benotung; Bewertung; Spracherkennung; Test; Text
Abstract: Assessment is the process of testing learners' skills and knowledge. Free-text response items are well suited for assessing learners' active knowledge and writing skills. However, automatically assessing such responses is not trivial and requires natural language processing. Accordingly, the automatic assessment of free-text responses is a widely researched topic in educational natural language processing. Most past work targets holistic scoring, that is, assigning overall scores or grades to responses. This is problematic in formative scenarios, where learners require feedback rather than summative scores. Such feedback ideally targets specific aspects of responses, so automated systems that only predict holistic scores cannot serve as a basis for providing it. What is needed instead are systems that implement analytic scoring: scoring specific aspects of responses according to corresponding criteria. This requires different systems than those addressed by the broad research on automated holistic scoring. In my PhD work, outlined in this paper, I explore approaches for implementing such analytic scoring systems with state-of-the-art natural language processing, aiming to provide a basis for feedback generation. (DIPF/Orig.)
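The contrast between holistic and analytic scoring that motivates the thesis can be sketched in terms of the prediction targets: a holistic scorer maps a response to a single score, whereas an analytic scorer assigns one label per predefined criterion, to which feedback can then be attached. Criteria, labels, and hints below are hypothetical illustrations, not systems developed in the thesis:

    from typing import Dict, List

    RESPONSE = "Photosynthesis converts light energy into chemical energy."

    def holistic_score(response: str) -> int:
        """Summative: a single overall score for the whole response (hard-coded here)."""
        return 2

    def analytic_score(response: str) -> Dict[str, bool]:
        """Analytic: one judgment per content criterion (hard-coded here)."""
        return {
            "mentions_light_energy": True,
            "mentions_chemical_energy": True,
            "names_glucose_as_product": False,
        }

    def feedback(labels: Dict[str, bool]) -> List[str]:
        """Per-criterion labels map directly to targeted feedback; a single score does not."""
        hints = {"names_glucose_as_product": "Also name the product in which the energy is stored."}
        return [hints[c] for c, met in labels.items() if not met and c in hints]

    print(feedback(analytic_score(RESPONSE)))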
DIPF-Departments: Informationszentrum Bildung
-
-
Author(s): Becker, Benjamin; Debeer, Dries; Weirich, Sebastian; Goldhammer, Frank
Title: On the speed sensitivity parameter in the lognormal model for response times. Implications for test assembly
In: Applied Psychological Measurement, 45 (2021) 6, S. 407-422
DOI: 10.1177/01466216211008530
URL: https://journals.sagepub.com/doi/abs/10.1177/01466216211008530
Publication Type: 3a. Beiträge in begutachteten Zeitschriften; Aufsatz (keine besondere Kategorie)
Language: Englisch
Keywords: Software; Technologiebasiertes Testen; Messverfahren; Item-Response-Theory; Leistungstest; Frage; Antwort; Dauer; Einflussfaktor; Testkonstruktion; Modell; Vergleich; Testtheorie; Simulation
Abstract (english): In high-stakes testing, multiple test forms are often used and a common time limit is enforced. Test fairness requires that ability estimates not depend on the administration of a specific test form. Such a requirement may be violated if speededness differs between test forms. The impact of not taking speed sensitivity into account on the comparability of test forms regarding speededness and ability estimation was investigated. The lognormal measurement model for response times by van der Linden was compared with its extension by Klein Entink, van der Linden, and Fox, which includes a speed sensitivity parameter. An empirical data example was used to show that the extended model can fit the data better than the model without speed sensitivity parameters. A simulation showed that test forms with different average speed sensitivity yielded substantially different ability estimates for slow test takers, especially those with high ability. Therefore, the use of the extended lognormal model for response times is recommended for the calibration of item pools in high-stakes testing situations. Limitations of the proposed approach and further research questions are discussed. (DIPF/Orig.)
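For reference, the two response-time models compared here can be written as follows, using standard notation from the cited literature: T_{ij} is person i's response time on item j, \tau_i the person's speed, \beta_j and \lambda_j the item's time intensity, \alpha_j the residual precision, and \phi_j the speed sensitivity (also called time discrimination):

    van der Linden:        \ln T_{ij} = \beta_j - \tau_i + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \alpha_j^{-2})
    Klein Entink et al.:   \ln T_{ij} = \lambda_j - \phi_j \tau_i + \varepsilon_{ij}

Constraining \phi_j = 1 recovers the simpler model. When \phi_j varies across items, two test forms with equal average time intensity can still translate a given speed level into different expected response times, which is the comparability problem examined in the article.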
DIPF-Departments: Lehr und Lernqualität in Bildungseinrichtungen
-
-
Author(s): Brod, Garvin
Title: Predicting as a learning strategy
In: Psychonomic Bulletin & Review, 28 (2021) 6, S. 1839-1847
DOI: 10.3758/s13423-021-01904-1
URL: https://link.springer.com/article/10.3758/s13423-021-01904-1
Publication Type: 3a. Beiträge in begutachteten Zeitschriften; Aufsatz (keine besondere Kategorie)
Language: Englisch
Keywords: Lernstrategie; Prognose; Information; Wissen; Antwort; Gedächtnis; Kognitive Prozesse; Strategie; Vergleich; Neugier; Fehler; Feedback; Unterricht; Forschung
Abstract (english): This article attempts to delineate the procedural and mechanistic characteristics of predicting as a learning strategy. While asking students to generate a prediction before presenting the correct answer has long been a popular learning strategy, the exact mechanisms by which it improves learning are only beginning to be unraveled. Moreover, predicting shares many features with other retrieval-based learning strategies (e.g., practice testing, pretesting, guessing), which raises the question of whether there is more to it than getting students to engage in active retrieval. I argue that active retrieval as such does not suffice to explain beneficial effects of predicting. Rather, the effectiveness of predicting is also linked to changes in the way the ensuing feedback is processed. Initial evidence suggests that predicting boosts surprise about unexpected answers, which leads to enhanced attention to the correct answer and strengthens its encoding. I propose that it is this affective aspect of predicting that sets it apart from other retrieval-based learning strategies, particularly from guessing. Predicting should thus be considered as a learning strategy in its own right. Studying its unique effects on student learning promises to bring together research on formal models of learning from prediction error, epistemic emotions, and instructional design. (DIPF/Orig.)
DIPF-Departments: Bildung und Entwicklung
-
-
Author(s): Deribo, Tobias; Kröhne, Ulf; Goldhammer, Frank
Title: Model-based treatment of rapid guessing
In: Journal of Educational Measurement, 58 (2021) 2, S. 281-303
DOI: 10.1111/jedm.12290
URL: https://onlinelibrary.wiley.com/doi/10.1111/jedm.12290?af=R
Publication Type: 3a. Beiträge in begutachteten Zeitschriften; Aufsatz (keine besondere Kategorie)
Language: Englisch
Keywords: Leistungstest; Testkonstruktion; Messverfahren; Computerunterstütztes Verfahren; Frage; Antwort; Verhalten; Dauer; Problemlösen; Modell; Student; Medienkompetenz; Item-Response-Theory; Multiple-Choice-Verfahren; Validität; Panel; Längsschnittuntersuchung
Abstract (english): The increased availability of time-related information as a result of computer-based assessment has enabled new ways to measure test-taking engagement. One of these ways is to distinguish between solution behavior and rapid-guessing behavior. Prior research has recommended response-level filtering to deal with rapid guessing. However, response-level filtering can lead to parameter bias if rapid guessing depends on the measured trait or on (un-)observed covariates. Therefore, a model based on Mislevy and Wu (1996) was applied to investigate the assumption of ignorable missing data underlying response-level filtering. The model allowed us to investigate different approaches to treating response-level filtered responses in a single framework through model parameterization. The study found that lower-ability test-takers tend to rapidly guess more frequently and are more likely to be unable to solve an item they guessed on, indicating a violation of the assumption of ignorable missing data underlying response-level filtering. Furthermore, ability estimation appeared sensitive to the different approaches to treating response-level filtered responses. Moreover, model-based approaches exhibited better model fit and higher convergent validity evidence compared to more naïve treatments of rapid guessing. The results illustrate the need to thoroughly investigate the assumptions underlying specific treatments of rapid guessing as well as the need for robust methods. (DIPF/Orig.)
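Response-level filtering, the baseline treatment examined here, is simple to state in code: a response whose time falls below an item-specific threshold is flagged as a rapid guess and recoded as missing before scaling. The sketch below uses a normative threshold of 10% of the item's median response time purely for illustration (one of several heuristics discussed in the rapid-guessing literature); the article's model-based approach goes further by explicitly modeling whether that missingness is ignorable:

    import numpy as np

    def response_level_filter(responses, times, fraction=0.10):
        """Recode responses given under suspected rapid guessing as missing (np.nan).

        responses, times: 2-D arrays (persons x items). The per-item threshold is an
        illustrative 10% of the item's median response time.
        """
        responses = responses.astype(float)
        thresholds = fraction * np.nanmedian(times, axis=0)  # one threshold per item
        rapid = times < thresholds                           # flag suspected rapid guesses
        responses[rapid] = np.nan                            # treat them as not administered
        return responses, rapid

    X = np.array([[1, 0, 1], [0, 1, 1]])
    T = np.array([[35.0, 2.1, 40.0], [28.0, 31.0, 1.5]])
    filtered, flags = response_level_filter(X, T)  # flags the second person's 1.5 s response

Whether the responses removed this way can safely be treated as ignorably missing is exactly the assumption the model-based treatment puts to the test.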
DIPF-Departments: Lehr und Lernqualität in Bildungseinrichtungen
-
-
Author(s): Goldhammer, Frank; Kroehne, Ulf; Hahnel, Carolin; De Boeck, Paul
Title: Controlling speed in component skills of reading improves the explanation of reading comprehension
In: Journal of Educational Psychology, 113 (2021) 5, S. 861-878
DOI: 10.1037/edu0000655
URN: urn:nbn:de:0111-pedocs-237977
URL: https://nbn-resolving.org/urn:nbn:de:0111-pedocs-237977
Publication Type: 3a. Beiträge in begutachteten Zeitschriften; Aufsatz (keine besondere Kategorie)
Language: Englisch
Keywords: Lesekompetenz; Fertigkeit; Kognitive Prozesse; Leistung; Antwort; Zeit; Wort; Semantik; Text; Leseverstehen; PISA <Programme for International Student Assessment>; Schüler; Messverfahren; Test; Experimentelle Untersuchung; Empirische Untersuchung; Deutschland
Abstract (english): Efficiency in reading component skills is crucial for reading comprehension, as efficient subprocesses do not extensively consume limited cognitive resources, making them available for comprehension processes. Cognitive efficiency is typically measured with speeded tests of relatively easy items. Observed responses and response times indicate the latent variables of ability and speed. Interpreting only ability or speed as efficiency may be misleading because there is a within-person dependency between both variables (speed-ability tradeoff [SAT]). Therefore, the present study measures efficiency as ability conditional on speed by controlling speed experimentally with item-level time limits. The proposed timed ability measures of reading component skills are expected to have a clearer interpretation in terms of efficiency and to be better predictors for reading comprehension. To support this claim, this study investigates two component skills, visual word recognition and sentence-level semantic integration (sentence reading), to understand how differences in ability in a timed condition are related to differences in ability and speed in a traditional untimed condition. Moreover, untimed and timed reading component skill measures were used to explain reading comprehension. A German subsample from Programme for International Student Assessment (PISA) 2012 completed the reading component skills tasks with and without item-level time limits and PISA reading tasks. The results showed that timed ability is only moderately related to untimed ability. Furthermore, timed ability measures proved to be stronger predictors of sentence-level and text-level reading comprehension than the corresponding untimed ability and speed measures, although using untimed ability and speed jointly as predictors increased the amount of explained variance.
DIPF-Departments: Lehr und Lernqualität in Bildungseinrichtungen
-
-
Author(s): Schmitterer, Alexandra; Brod, Garvin
Title: Which data do elementary school teachers use to determine reading difficulties in their students?
In: Journal of Learning Disabilities, 54 (2021) 5, S. 349-364
DOI: 10.1177/0022219420981990
URN: urn:nbn:de:0111-pedocs-237621
URL: https://nbn-resolving.org/urn:nbn:de:0111-pedocs-237621
Publication Type: 3a. Beiträge in begutachteten Zeitschriften; Beitrag in Sonderheft
Language: Englisch
Keywords: Lesestörung; Intervention; Grundschullehrer; Entscheidung; Diagnostik; Daten; Lesefertigkeit; Lesetest; Rechtschreibtest; Wortschatztest; Grundschüler; Schuljahr 03; Mehrebenenanalyse; Regressionsanalyse; Empirische Untersuchung; Hessen; Niedersachsen; Deutschland
Abstract: Small-group interventions allow for tailored instruction for students with learning difficulties. A crucial first step is the accurate identification of students who need such an intervention. This study investigated how teachers decide whether their students need a remedial reading intervention. To this end, 64 teachers of 697 third-grade students from Germany were asked to rate whether a reading intervention for their students was "not necessary," "potentially necessary," or "definitely necessary." Independent experimenters tested the students' reading and spelling abilities with standardized tests, and a subsample of 370 children also completed standardized tests of phonological awareness and vocabulary. Findings show that teachers' decisions about whether students need a reading intervention overlapped more with the results of standardized spelling assessments than with those of reading assessments. Hierarchical linear models indicated that students' spelling abilities, along with phonological awareness and vocabulary, explained variance in teachers' ratings over and above students' reading skills. Teachers thus relied on proximal cues, such as spelling skills, to reach their decisions. These findings are discussed in relation to clinical standards and educational contexts. They indicate that teachers' assignment of children to interventions might be underspecified, and starting points for specific teacher training programs are outlined. (DIPF/Orig.)
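The hierarchical linear models referred to above account for the nesting of students within teachers. The sketch below shows one way such a model could be specified with the statsmodels formula interface; the data frame, variable names, and the use of a linear (rather than ordinal) mixed model are assumptions for illustration and may differ from the article's actual specification:

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical data: one row per student; 'teacher_id' identifies the rating teacher.
    df = pd.read_csv("ratings.csv")  # rating, spelling, reading, phon_awareness, vocabulary, teacher_id

    # Teachers' intervention ratings regressed on student skills,
    # with a random intercept per teacher (students nested within teachers).
    model = smf.mixedlm(
        "rating ~ spelling + reading + phon_awareness + vocabulary",
        data=df,
        groups=df["teacher_id"],
    )
    print(model.fit().summary())

If spelling, phonological awareness, and vocabulary carry weight over and above reading skill, as reported above, teachers' ratings are driven partly by proximal cues rather than by reading performance itself.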
DIPF-Departments: Bildung und Entwicklung
-
-
Author(s): Klieme, Eckhard
Title: Guter Unterricht unter den Bedingungen der Pandemie. Lehrkräfte haben weiterhin die Verantwortung für das Lernen
In: SchulVerwaltung. Ausgabe Baden-Württemberg, 30 (2021) 1, S. 14-17
DOI: 10.25656/01:23919
URN: urn:nbn:de:0111-pedocs-239192
URL: https://nbn-resolving.org/urn:nbn:de:0111-pedocs-239192
Publication Type: 3b. Beiträge in weiteren Zeitschriften; praxisorientiert
Language: Deutsch
Keywords: Deutschland; Pandemie; Unterricht; Qualität; Lernbedingungen; Lehrerrolle; Verantwortung; Schulorganisation; Krise; Unterrichtsgespräch; Kommunikation; Aufgabenstellung; Digitale Medien; Selbstgesteuertes Lernen; Kognitives Lernen; Aktives Lernen; Kompetenz; Förderung
Abstract: This article sets out what constitutes good teaching at its core; these factors should also be heeded in the current crisis. However attractive it may seem in times of the pandemic to rethink teaching from the ground up (across subjects, oriented to real-world situations, or starting from "key themes" in Klafki's sense), teachers and learners are ill-prepared for such a radical upheaval, apart from a few reform schools that work according to Jenaplan principles, for example. And as important as it is to support children and adolescents socially and emotionally, especially in these times, it would be irresponsible to neglect the systematic development of subject-specific and related cross-curricular competencies. (DIPF/Orig.)
DIPF-Departments: Lehr und Lernqualität in Bildungseinrichtungen