Search results in the DIPF database of publications

Your query:

(Persons: "Kroehne," and "Ulf")

Invariance of the response processes between gender and modes in an assessment of reading
Kroehne, Ulf; Hahnel, Carolin; Goldhammer, Frank
Journal Article | In: Frontiers in Applied Mathematics and Statistics | 2019
Author(s): Kroehne, Ulf; Hahnel, Carolin; Goldhammer, Frank
Title: Invariance of the response processes between gender and modes in an assessment of reading
In: Frontiers in Applied Mathematics and Statistics, (2019), p. 5:2
DOI: 10.3389/fams.2019.00002
URL: https://www.frontiersin.org/articles/10.3389/fams.2019.00002/full
Publication Type: 3a. Articles in peer-reviewed journals; contribution to a special issue
Language: English
Keywords: Reading skills; Technology-based testing; Computer-assisted procedure; Paper-and-pencil test; Response; Time; Measurement; Item response theory; Model; Gender difference; Log file; Data analysis; Empirical study; Germany
Abstract: In this paper, we developed a method to extract item-level response times from log data that are available in computer-based assessments (CBA) and paper-based assessments (PBA) with digital pens. Based on response times that were extracted using only time differences between responses, we used the bivariate generalized linear IRT model framework (B-GLIRT, [1]) to investigate response times as indicators for response processes. A parameterization that includes an interaction between the latent speed factor and the latent ability factor in the cross-relation function was found to fit the data best in CBA and PBA. Data were collected with a within-subject design in a national add-on study to PISA 2012 administering two clusters of PISA 2009 reading units. After investigating the invariance of the measurement models for ability and speed between boys and girls, we found the expected gender effect in reading ability to coincide with a gender effect in speed in CBA. Taking this result as indication for the validity of the time measures extracted from time differences between responses, we analyzed the PBA data and found the same gender effects for ability and speed. Analyzing PBA and CBA data together we identified the ability mode effect as the latent difference between reading measured in CBA and PBA. Similar to the gender effect the mode effect in ability was observed together with a difference in the latent speed between modes. However, while the relationship between speed and ability is identical for boys and girls we found hints for mode differences in the estimated parameters of the cross-relation function used in the B-GLIRT model. (DIPF/Orig.)
DIPF-Departments: Bildungsqualität und Evaluation

Vertiefende Analysen zur Umstellung des Modus von Papier auf Computer
Goldhammer, Frank; Harrison, Scott; Bürger, Sarah; Kroehne, Ulf; Lüdtke, Oliver; […]
Book Chapter | From: Reiss, Kristina; Weis, Mirjam; Klieme, Eckhard; Köller, Olaf (Eds.): PISA 2018: Grundbildung im internationalen Vergleich | Münster: Waxmann | 2019
Author(s): Goldhammer, Frank; Harrison, Scott; Bürger, Sarah; Kroehne, Ulf; Lüdtke, Oliver; Robitzsch, Alexander; Köller, Olaf; Heine, Jörg-Henrik; Mang, Julia
Title: Vertiefende Analysen zur Umstellung des Modus von Papier auf Computer
In: Reiss, Kristina; Weis, Mirjam; Klieme, Eckhard; Köller, Olaf (Eds.): PISA 2018: Grundbildung im internationalen Vergleich, Münster: Waxmann, 2019, pp. 163-186
URL: https://www.pisa.tum.de/fileadmin/w00bgi/www/Berichtsbaende_und_Zusammenfassungungen/PISA_2018_Berichtsband_online_29.11.pdf#page=163
Publication Type: 4. Contributions to edited volumes; edited volume (no special category)
Language: German
Keywords: PISA <Programme for International Student Assessment>; Paper-and-pencil test; Technology-based testing; Change; Method; Effect; Computer-assisted procedure; Test item; Response; Difficulty; Reading; Mathematics; Science; Test construction; Test administration; Correlation; Comparison; Germany
Abstract: In PISA 2015, the mode of administration was switched from paper to computer. Accordingly, a national add-on study conducted as part of PISA 2018 aimed to carry out in-depth analyses of possible differences between paper-based and computer-based measurements. The focus was on the comparability of the measured construct and of the individual tasks (items), for example with regard to their difficulty. In addition, the effects of the mode change on comparability with the results of earlier PISA assessments in Germany were examined. The empirical basis consisted of data from the PISA 2015 field trial as well as data that were additionally collected on a second test day during the national PISA 2018 main study, using paper-based test booklets from PISA 2009. Initial results of the add-on study provide evidence for construct equivalence between paper-based and computer-based measurements. The add-on study data also indicate that the computer-based items are, on average, somewhat more difficult than the paper-based items. With regard to the changes between 2015 and 2018, the internationally reported (original) trend and the national (marginal) trend agree closely. The changes between 2009 and 2018 turn out somewhat more favorable overall for the national trend, which is based solely on paper-based measurements, than for the original trend. (DIPF/Orig.)
DIPF-Departments: Bildungsqualität und Evaluation

How to conceptualize, represent, and analyze log data from technology-based assessments? A generic […]
Kroehne, Ulf; Goldhammer, Frank
Journal Article | In: Behaviormetrika | 2018
Author(s): Kroehne, Ulf; Goldhammer, Frank
Title: How to conceptualize, represent, and analyze log data from technology-based assessments? A generic framework and an application to questionnaire items
In: Behaviormetrika, 45 (2018) 2, pp. 527-563
DOI: 10.1007/s41237-018-0063-y
Publication Type: 3a. Articles in peer-reviewed journals; contribution to a special issue
Language: English
Keywords: Educational research; Empirical research; Log file; Data analysis; Technology-based testing; PISA <Programme for International Student Assessment>; Questionnaire; Conception; Test construction; Data; Typology; Hardware; Response; Behavior; Duration; Interaction; Human-computer communication; Indicator
Abstract: Log data from educational assessments attract more and more attention and large-scale assessment programs have started providing log data as scientific use files. Such data generated as a by-product of computer-assisted data collection has been known as paradata in survey research. In this paper, we integrate log data from educational assessments into a taxonomy of paradata. To provide a generic framework for the analysis of log data, finite state machines are suggested. Beyond its computational value, the specific benefit of using finite state machines is achieved by separating platform-specific log events from the definition of indicators by states. Specifically, states represent filtered log data given a theoretical process model, and therefore, encode the information of log files selectively. The approach is empirically illustrated using log data of the context questionnaires of the Programme for International Student Assessment (PISA). We extracted item-level response time components from questionnaire items that were administered as item batteries with multiple questions on one screen and related them to the item responses. Finally, the taxonomy and the finite state machine approach are discussed with respect to the definition of complete log data, the verification of log data and the reproducibility of log data analyses. (DIPF/Orig.)
DIPF-Departments: Bildungsqualität und Evaluation

Modeling individual response time effects between and within experimental speed conditions. A GLMM […]
Goldhammer, Frank; Steinwascher, Merle A.; Kroehne, Ulf; Naumann, Johannes
Journal Article | In: British Journal of Mathematical and Statistical Psychology | 2017
Author(s): Goldhammer, Frank; Steinwascher, Merle A.; Kroehne, Ulf; Naumann, Johannes
Title: Modeling individual response time effects between and within experimental speed conditions. A GLMM approach for speeded tests
In: British Journal of Mathematical and Statistical Psychology, 70 (2017) 2, pp. 238-256
DOI: 10.1111/bmsp.12099
Publication Type: 3a. Articles in peer-reviewed journals; contribution to a special issue
Language: English
Keywords: Test; Test construction; Response; Duration; Difference; Measurement procedure; Decision; Influencing factor; Error; Model; Comparison
Abstract: Completing test items under multiple speed conditions avoids the performance measure being confounded with individual differences in the speed-accuracy compromise, and offers insights into the response process, that is, how response time relates to the probability of a correct response. This relation is traditionally represented by two conceptually different functions: the speed-accuracy trade-off function (SATF) across conditions relating the condition average response time to the condition average of accuracy, and the conditional accuracy function (CAF) within a condition describing accuracy conditional on response time. Using a generalized linear mixed modelling approach, we propose an item response modelling framework that is suitable for item response and response time data from experimental speed conditions. The proposed SATF and CAF model accommodates response time effects between conditions (i.e., person and item SATF slope) and within conditions (i.e., residual CAF slopes), captures person and item differences in these effects, and is suitable for measures with a strong speed component. Moreover, for a single condition a CAF model is proposed distinguishing person, item and residual CAF. The properties of the models are illustrated with an empirical example. (DIPF/Orig.)
DIPF-Departments: Bildungsqualität und Evaluation

Development and evaluation of a computer adaptive test to assess anxiety in cardiovascular […]
Abberger, Birgit; Haschke, Anne; Wirtz, Markus; Kroehne, Ulf; Bengel, Juergen; Baumeister, Harald
Journal Article | In: Archives of Physical Medicine and Rehabilitation | 2013
Author(s): Abberger, Birgit; Haschke, Anne; Wirtz, Markus; Kroehne, Ulf; Bengel, Juergen; Baumeister, Harald
Title: Development and evaluation of a computer adaptive test to assess anxiety in cardiovascular rehabilitation patients
In: Archives of Physical Medicine and Rehabilitation, 94 (2013) 12, pp. 2433-2439
DOI: 10.1016/j.apmr.2013.07.009
URL: http://www.sciencedirect.com/science/article/pii/S0003999313005443
Publication Type: 3a. Articles in peer-reviewed journals; article (no special category)
Language: English
Keywords: Anxiety; Usability; Evaluation; Item response theory; Patient; Measurement; Psychodiagnostics; Psychometrics; Technology-based testing; Test
Abstract: Objective: To develop and evaluate a computer adaptive test for the assessment of anxiety in cardiovascular rehabilitation patients (ACAT-cardio) that tailors an optimal test for each patient and enables precise and time-effective measurement. Design: Simulation study, validation study (against the anxiety subscale of the Hospital Anxiety and Depression Scale and the physical component summary scale of the 12-Item Short-Form Health Survey), and longitudinal study (beginning and end of rehabilitation). Setting: Cardiac rehabilitation centers. Participants: Cardiovascular rehabilitation patients: simulation study sample (n=106; mean age, 57.8y; 25.5% women) and validation and longitudinal study sample (n=138; mean age, 58.6 and 57.9y, respectively; 16.7% and 12.1% women, respectively). Interventions: Not applicable. Main Outcome Measures: Hospital Anxiety and Depression Scale, 12-Item Short-Form Health Survey, and ACAT-cardio. Results: The mean number of items was 9.2 with an average processing time of 1:13 minutes when an SE ≤.50 was used as a stopping rule; with an SE ≤.32, there were 28 items and a processing time of 3:47 minutes. Validity could be confirmed via correlations between .68 and .81 concerning convergent validity (ACAT-cardio vs Hospital Anxiety and Depression Scale anxiety subscale) and correlations between −.47 and −.30 concerning discriminant validity (ACAT-cardio vs 12-Item Short-Form Health Survey physical component summary scale). Sensitivity to change was moderate to high with standardized response means between .45 and .82. Conclusions: The ACAT-cardio shows good psychometric properties and provides the opportunity for an innovative and time-effective assessment of anxiety in cardiovascular rehabilitation. A more flexible stopping rule might further improve the ACAT-cardio. Additionally, testing in other cardiovascular populations would increase generalizability.
DIPF-Departments: Bildungsqualität und Evaluation

Adaptive screening for depression. Recalibration of an itembank for the assessment of depression in […]
Forkmann, Thomas; Kroehne, Ulf; Wirtz, Markus; Norra, Christine; Baumeister, Harald; […]
Journal Article | In: Journal of Psychosomatic Research | 2013
Author(s): Forkmann, Thomas; Kroehne, Ulf; Wirtz, Markus; Norra, Christine; Baumeister, Harald; Gauggel, Siegfried; Elhan, Atilla Halil; Tennant, Alan; Boecker, Maren
Title: Adaptive screening for depression. Recalibration of an itembank for the assessment of depression in persons with mental and somatic diseases and evaluation in a simulated computer-adaptive test environment
In: Journal of Psychosomatic Research, 75 (2013) 5, pp. 437-443
DOI: 10.1016/j.jpsychores.2013.08.022
URL: http://www.sciencedirect.com/science/article/pii/S0022399913003395
Publication Type: 3a. Articles in peer-reviewed journals; article (no special category)
Language: English
Keywords: Depression; Evaluation; Item bank; Psychometrics; Psychosomatics; Rasch model; Screening procedure; Simulation; Technology-based testing; Validity
Abstract: This study conducted a simulation study for computer-adaptive testing based on the Aachen Depression Item Bank (ADIB), which was developed for the assessment of depression in persons with somatic diseases. Prior to computer-adaptive test simulation, the ADIB was newly calibrated. Recalibration was performed in a sample of 161 patients treated for a depressive syndrome, 103 patients from cardiology, and 103 patients from otorhinolaryngology (mean age 44.1, SD = 14.0; 44.7% female) and was cross-validated in a sample of 117 patients undergoing rehabilitation for cardiac diseases (mean age 58.4, SD = 10.5; 24.8% women). Unidimensionality of the itembank was checked and a Rasch analysis was performed that evaluated local dependency (LD), differential item functioning (DIF), item fit and reliability. CAT-simulation was conducted with the total sample and additional simulated data. Recalibration resulted in a strictly unidimensional item bank with 36 items, showing good Rasch model fit (item fit residuals < |2.5|) and no DIF or LD. CAT simulation revealed that 13 items on average were necessary to estimate depression in the range of −2 and +2 logits when terminating at SE ≤ 0.32 and 4 items if using SE ≤ 0.50. Receiver Operating Characteristics analysis showed that θ estimates based on the CAT algorithm have good criterion validity with regard to depression diagnoses (Area Under the Curve ≥ .78 for all cut-off criteria). The recalibration of the ADIB succeeded and the simulation studies conducted suggest that it has good screening performance in the samples investigated and that it may reasonably add to the improvement of depression assessment.
DIPF-Departments: Bildungsqualität und Evaluation