Search results from the DIPF publication database
Your query:
(Keywords: "Simulation")
27 items found
Authors:
Brose, Annette; Neubauer, Andreas B.; Schmiedek, Florian
Title:
Integrating state dynamics and trait change. A tutorial using the example of stress reactivity and change in well-being
In:
European Journal of Personality, 36 (2022) 2, pp. 180-199
DOI:
10.1177/08902070211014055
URL:
https://journals.sagepub.com/doi/10.1177/08902070211014055
Document type:
3a. Contributions to peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Multilevel analysis; Stress; Reaction; Effect; Well-being; Emotional state; Change; Personality trait; Measurement procedure; Method; Modeling; Simulation; Structural equation model; Regression analysis
Abstract:
Recent theoretical accounts on the causes of trait change emphasize the potential relevance of states. In the same vein, reactions to daily stress have been shown to prospectively predict change in well-being, speaking for the proposition that state dynamics can be a precursor to long-term change in more stable individual-differences characteristics. A common analysis approach towards linking state dynamics such as stress reactivity and change in some more stable individual differences characteristic has been a two-step approach, modeling state dynamics and trait change separately. In this paper, we elaborate on one-step procedures to simultaneously model state dynamics and trait change, realized in the multilevel structural equation modeling framework. We highlight three distinct advantages over the two-step approach which pre-exists in the methodological literature, and we disseminate these advantages to a larger audience. We target a readership of substantive researchers interested in the relationships between state dynamics and traits or trait change, and we provide them with a tutorial style paper on state-of-the-art methods on these topics. (DIPF/Orig.)
DIPF department:
Bildung und Entwicklung
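The one-step multilevel SEM described in the abstract above is too large for a snippet, but the two-step procedure it improves upon can be sketched in a few lines. Everything below is simulated; the variable names, sample sizes, and the generating effect of 0.8 are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n_persons, n_days = 200, 30

reactivity = rng.normal(0.5, 0.2, n_persons)      # person-specific true slopes
stress = rng.normal(0, 1, (n_persons, n_days))    # daily stressor intensity
affect = reactivity[:, None] * stress + rng.normal(0, 1, (n_persons, n_days))

# Step 1: estimate each person's stress reactivity by per-person OLS.
est_reactivity = np.array(
    [np.polyfit(stress[i], affect[i], 1)[0] for i in range(n_persons)])

# Trait change generated to depend on true reactivity (assumed effect = 0.8).
trait_change = 0.8 * reactivity + rng.normal(0, 0.3, n_persons)

# Step 2: regress trait change on the *estimated* slopes. Sampling error from
# step 1 attenuates the coefficient -- one bias a one-step multilevel SEM,
# which models both levels jointly, avoids.
slope, intercept = np.polyfit(est_reactivity, trait_change, 1)
print(round(slope, 2))    # noticeably below the generating value of 0.8
```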
Authors:
Di Mitri, Daniele; Schneider, Jan; Drachsler, Hendrik
Title:
Keep me in the loop. Real-time feedback with multimodal data
In:
International Journal of Artificial Intelligence in Education, 32 (2022) 4, pp. 1093-1118
DOI:
10.1007/s40593-021-00281-z
URL:
https://link.springer.com/article/10.1007/s40593-021-00281-z
Document type:
3a. Contributions to peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Data analysis; Data processing; Effectiveness; Empirical study; Feedback; Error; Skill; Questionnaire; Learning process; Medicine; Psychomotor skills; Simulation; System; Technology; Participant; Tool; Training
Abstract:
This paper describes the CPR Tutor, a real-time multimodal feedback system for cardiopulmonary resuscitation (CPR) training. The CPR Tutor detects training mistakes using recurrent neural networks. The CPR Tutor automatically recognises and assesses the quality of the chest compressions according to five CPR performance indicators. It detects training mistakes in real-time by analysing a multimodal data stream consisting of kinematic and electromyographic data. Based on this assessment, the CPR Tutor provides audio feedback to correct the most critical mistakes and improve the CPR performance. The mistake detection models of the CPR Tutor were trained using a dataset from 10 experts. Hence, we tested the validity of the CPR Tutor and the impact of its feedback functionality in a user study involving an additional 10 participants. The CPR Tutor pushes forward the current state of the art of real-time multimodal tutors by providing: (1) an architecture design, (2) a methodological approach for delivering real-time feedback using multimodal data and (3) a field study on real-time feedback for CPR training. This paper details the results of a field study by quantitatively measuring the impact of the CPR Tutor feedback on the performance indicators and qualitatively analysing the participants' questionnaire answers. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
Authors:
Silva Diaz, John Alexander; Köhler, Carmen; Hartig, Johannes
Title:
Performance of infit and outfit confidence intervals calculated via parametric bootstrapping
In:
Applied Measurement in Education, 35 (2022) 2, pp. 116-132
DOI:
10.1080/08957347.2022.2067540
URL:
https://www.tandfonline.com/doi/full/10.1080/08957347.2022.2067540
Document type:
3a. Contributions to peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Item response theory; Rasch model; Statistics; Method; Procedure; Sample; Test; Analysis; Simulation
Abstract:
Testing item fit is central in item response theory (IRT) modeling, since a good fit is necessary to draw valid inferences from estimated model parameters. Infit and outfit statistics, widespread indices for detecting deviations from the Rasch model, are affected by data factors such as sample size. Consequently, the traditional use of fixed infit and outfit cutoff points is an ineffective practice. This article evaluates whether confidence intervals estimated via parametric bootstrapping provide more suitable cutoff points than the conventionally applied range of 0.8-1.2 and outfit critical ranges adjusted by sample size. The performance is evaluated under different sizes of misfit, sample sizes, and numbers of items. Results show that the confidence intervals performed better in terms of power but had inflated type-I error rates, which resulted from mean square values pushed below unity in the large-misfit conditions. However, when performing a one-sided test with the upper bound of the confidence intervals, the aforementioned inflation was corrected. (DIPF/Orig.)
DIPF department:
Lehr- und Lernqualität in Bildungseinrichtungen
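The bootstrap recipe in the abstract above can be sketched as follows, treating the Rasch parameters as known rather than estimated; the sample sizes, seed, and 95% level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
n_persons, n_items = 500, 10
theta = rng.normal(0, 1, n_persons)          # person abilities (treated as known)
b = np.linspace(-1.5, 1.5, n_items)          # item difficulties (treated as known)

def rasch_p(theta, b):
    return 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))

def outfit(x, p):
    # Unweighted mean of squared standardized residuals, per item.
    return np.mean((x - p) ** 2 / (p * (1 - p)), axis=0)

def infit(x, p):
    # Information-weighted mean square, per item (same bootstrap recipe applies).
    w = p * (1 - p)
    return np.sum((x - p) ** 2, axis=0) / np.sum(w, axis=0)

p = rasch_p(theta, b)
# Parametric bootstrap: resample response matrices from the fitted model and
# collect the sampling distribution of each item's fit statistic.
boot_out = np.array([outfit(rng.binomial(1, p), p) for _ in range(500)])
lower, upper = np.percentile(boot_out, [2.5, 97.5], axis=0)
# Item-specific cutoffs replace the fixed 0.8-1.2 rule; per the paper's
# finding, a one-sided test against `upper` avoids the type-I inflation.
print(lower.round(2), upper.round(2))
```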
Authors:
Becker, Benjamin; Debeer, Dries; Weirich, Sebastian; Goldhammer, Frank
Title:
On the speed sensitivity parameter in the lognormal model for response times. Implications for test assembly
In:
Applied Psychological Measurement, 45 (2021) 6, pp. 407-422
DOI:
10.1177/01466216211008530
URL:
https://journals.sagepub.com/doi/abs/10.1177/01466216211008530
Document type:
3a. Contributions to peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Software; Technology-based testing; Measurement procedure; Item response theory; Achievement test; Question; Answer; Duration; Influencing factor; Test construction; Model; Comparison; Test theory; Simulation
Abstract:
In high-stakes testing, often multiple test forms are used and a common time limit is enforced. Test fairness requires that ability estimates must not depend on the administration of a specific test form. Such a requirement may be violated if speededness differs between test forms. The impact of not taking speed sensitivity into account on the comparability of test forms regarding speededness and ability estimation was investigated. The lognormal measurement model for response times by van der Linden was compared with its extension by Klein Entink, van der Linden, and Fox, which includes a speed sensitivity parameter. An empirical data example was used to show that the extended model can fit the data better than the model without speed sensitivity parameters. A simulation was conducted, which showed that test forms with different average speed sensitivity yielded substantially different ability estimates for slow test takers, especially for test takers with high ability. Therefore, the use of the extended lognormal model for response times is recommended for the calibration of item pools in high-stakes testing situations. Limitations to the proposed approach and further research questions are discussed. (DIPF/Orig.)
DIPF department:
Lehr- und Lernqualität in Bildungseinrichtungen
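The two response-time models compared in the abstract above differ only in the speed sensitivity parameter phi. A minimal simulation, with all parameter values chosen purely for illustration, shows why a common time limit treats slow test takers differently on forms with different average phi:

```python
import numpy as np

rng = np.random.default_rng(3)
n_items = 40
lam = rng.normal(4.0, 0.2, n_items)     # time intensities, log-seconds scale

def total_time(tau, phi, sigma=0.3):
    """Simulate one taker's total time: log t_ij = lambda_i - phi_i * tau_j + eps."""
    log_t = lam - phi * tau + rng.normal(0, sigma, n_items)
    return np.exp(log_t).sum()

form_a_phi = np.full(n_items, 1.0)      # van der Linden's model: phi fixed at 1
form_b_phi = np.full(n_items, 1.5)      # extension: higher speed sensitivity

slow = -1.0                              # a slow test taker (low speed tau)
slow_on_a = total_time(slow, form_a_phi)
slow_on_b = total_time(slow, form_b_phi)
# The high-sensitivity form costs the slow taker much more time, so a common
# time limit makes form B more speeded for exactly this group.
print(slow_on_a < slow_on_b)
```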
Authors:
Bengs, Daniel; Kröhne, Ulf; Brefeld, Ulf
Title:
Simultaneous constrained adaptive item selection for group-based testing
In:
Journal of Educational Measurement, 58 (2021) 2, pp. 236-261
DOI:
10.1111/jedm.12285
URL:
https://onlinelibrary.wiley.com/doi/abs/10.1111/jedm.12285
Document type:
3a. Contributions to peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Adaptive testing; Task; Selection; Computer-assisted method; Empirical study; Group; Performance measurement; Model; Simulation; Technology-based testing; Test
Abstract:
By tailoring test forms to the test‐taker's proficiency, Computerized Adaptive Testing (CAT) enables substantial increases in testing efficiency over fixed forms testing. When used for formative assessment, the alignment of task difficulty with proficiency increases the chance that teachers can derive useful feedback from assessment data. The application of CAT to formative assessment in the classroom, however, is hindered by the large number of different items used for the whole class; the required familiarization with a large number of test items puts a significant burden on teachers. An improved CAT procedure for group‐based testing is presented, which uses simultaneous automated test assembly to impose a limit on the number of items used per group. The proposed linear model for simultaneous adaptive item selection allows for full adaptivity and the accommodation of constraints on test content. The effectiveness of the group‐based CAT is demonstrated with real‐world items in a simulated adaptive test of 3,000 groups of test‐takers, under different assumptions on group composition. Results show that the group‐based CAT maintained the efficiency of CAT, while a reduction in the number of used items by one half to two‐thirds was achieved, depending on the within‐group variance of proficiencies.
DIPF department:
Bildungsqualität und Evaluation
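The core idea in the abstract above, capping how many distinct items a whole group uses so teachers only need to familiarize themselves with a small shared set, can be sketched with a much simpler selection rule than the paper's linear optimization model. As a further simplification, selection here uses each taker's true ability instead of interim estimates; the pool size, cap, and group size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(5)
b = rng.normal(0, 1, 100)                    # Rasch difficulties of the item pool

def info(theta, items):
    p = 1.0 / (1.0 + np.exp(-(theta - b[items])))
    return p * (1 - p)                       # Fisher information of a Rasch item

def select(theta, group_items, administered, limit=15):
    # Restrict selection to items the group has already "opened" once the
    # cap is reached; otherwise the whole pool is available.
    pool = group_items if len(group_items) >= limit else np.arange(len(b))
    candidates = np.setdiff1d(pool, administered)
    return candidates[np.argmax(info(theta, candidates))]

used = []                                     # distinct items opened for the group
for theta in rng.normal(0, 1, 30):            # one classroom of 30 test takers
    administered = []
    for _ in range(10):                       # 10-item adaptive test per person
        item = select(theta, np.array(used, dtype=int), administered)
        administered.append(item)
        if item not in used:
            used.append(item)
print(len(used))    # stays at or below the cap, far below the 300 worst case
```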
Authors:
Hahnel, Carolin; Eichmann, Beate; Goldhammer, Frank
Title:
Evaluation of online information in university students. Development and scaling of the screening instrument EVON
In:
Frontiers in Psychology, 11 (2020), Art. 562128
DOI:
10.3389/fpsyg.2020.562128
URN:
urn:nbn:de:0111-pedocs-232241
URL:
https://nbn-resolving.org/urn:nbn:de:0111-pedocs-232241
Document type:
3a. Contributions to peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Germany; Internet; Information literacy; Resource; Credibility; Relevance; Assessment; Test; Test development; Item analysis; Search engine; Simulation; Technology-based testing; Interview; Survey instrument; Evaluation; Student; Rasch model; Empirical study
Abstract:
As Internet sources provide information of varying quality, it is an indispensable prerequisite skill to evaluate the relevance and credibility of online information. Based on the assumption that competent individuals can use different properties of information to assess its relevance and credibility, we developed the EVON (evaluation of online information), an interactive computer-based test for university students. The developed instrument consists of eight items that assess the skill to evaluate online information in six languages. Within a simulated search engine environment, students are requested to select the most relevant and credible link for a respective task. To evaluate the developed instrument, we conducted two studies: (1) a pre-study for quality assurance and observing the response process (cognitive interviews of n = 8 students) and (2) a main study aimed at investigating the psychometric properties of the EVON and its relation to other variables (n = 152 students). The results of the pre-study provided first evidence for a theoretically sound test construction with regard to students' item processing behavior. The results of the main study showed acceptable psychometric outcomes for a standardized screening instrument with a small number of items. The item design criteria affected the item difficulty as intended, and students' choice to visit a website had an impact on their task success. Furthermore, the probability of task success was positively predicted by general cognitive performance and reading skill. Although the results uncovered a few weaknesses (e.g., a lack of difficult items), and the efforts of validating the interpretation of EVON outcomes still need to be continued, the overall results speak in favor of a successful test construction and provide first indication that the EVON assesses students' skill in evaluating online information in search engine environments. (DIPF/Orig.)
DIPF department:
Bildungsqualität und Evaluation
Authors:
Hartig, Johannes; Köhler, Carmen; Naumann, Alexander
Title:
Using a multilevel random item Rasch model to examine item difficulty variance between random groups
In:
Psychological Test and Assessment Modeling, 62 (2020) 1, pp. 11-27
URL:
https://www.psychologie-aktuell.com/fileadmin/Redaktion/Journale/ptam-2020-1/02_Hartig.pdf
Document type:
3a. Contributions to peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Rasch model; Multilevel analysis; Method; Performance; Comparative study; Simulation
Abstract:
In educational assessments, item difficulties are typically assumed to be invariant across groups (e.g., schools or countries). We refer to variances of item difficulties on the group level violating this assumption as random group differential item functioning (RG-DIF). We examine the performance of three methods to estimate RG-DIF: (1) three-level Generalized Linear Mixed Models (GLMMs), (2) three-level GLMMs with anchor items, and (3) item-wise multilevel logistic regression (ML-LR) controlling for the estimated trait score. In a simulation study, the magnitude of RG-DIF and the covariance of the item difficulties on the group level were varied. When group level effects were independent, all three methods performed well. With correlated DIF, estimated variances on the group level were biased with the full three-level GLMM and ML-LR. This bias was more pronounced for ML-LR than for the full three-level GLMM. Using a three-level GLMM with anchor items allowed unbiased estimation of RG-DIF.
DIPF department:
Bildungsqualität und Evaluation
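A toy illustration of the RG-DIF phenomenon described in the abstract above: one item's difficulty varies randomly across groups, and the variance component shows up in group-wise difficulty estimates. This is a naive per-group estimator, not any of the paper's three model-based methods, and all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(11)
n_groups, n_per_group = 100, 200
dif_sd = 0.4                                  # true between-group SD of difficulty

b_group = rng.normal(0.0, dif_sd, n_groups)   # the item's difficulty per group
theta = rng.normal(0, 1, (n_groups, n_per_group))
x = rng.binomial(1, 1.0 / (1.0 + np.exp(-(theta - b_group[:, None]))))

# Crude per-group difficulty estimate: minus the logit of the group's
# proportion correct (attenuated by marginalizing over theta, inflated by
# sampling noise -- biases the paper's model-based estimators address).
p_hat = x.mean(axis=1)
b_hat = -np.log(p_hat / (1 - p_hat))
print(round(b_hat.std(ddof=1), 2))            # nonzero: RG-DIF is present
```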
Authors:
Buchholz, Janine; Hartig, Johannes
Title:
Comparing attitudes across groups. An IRT-based item-fit statistic for the analysis of measurement invariance
In:
Applied Psychological Measurement, 43 (2019) 3, pp. 241-250
DOI:
10.1177/0146621617748323
URN:
urn:nbn:de:0111-dipfdocs-174393
URL:
http://www.dipfdocs.de/volltexte/2020/17439/pdf/APM_2019_3_Buchholz_Hartig_Comparing_attitudes_across_groups_A.pdf
Document type:
3a. Contributions to peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Attitude <Psy>; Measurement; Questionnaire; International comparison; Group; Comparison; Item response theory; Scaling; Model; Statistical method; Simulation
Abstract:
Questionnaires for the assessment of attitudes and other psychological traits are crucial in educational and psychological research, and Item Response Theory (IRT) has become a viable tool for scaling such data. Many international large-scale assessments aim at comparing these constructs across countries, and the invariance of measures across countries is thus required. In its most recent cycle, the Programme for International Student Assessment (PISA 2015) implemented an innovative approach for testing the invariance of IRT-scaled constructs in the context questionnaires administered to students, parents, school principals and teachers. On the basis of a concurrent calibration with equal item parameters across all groups (i.e., languages within countries), a group-specific item-fit statistic (root-mean-square deviance; RMSD) was used as a measure for the invariance of item parameters for individual groups. The present simulation study examines the statistic's distribution under different types and extents of (non-) invariance in polytomous items. Responses to five four-point Likert-type items were generated under the Generalized Partial Credit Model (GPCM) for 1000 simulees in 50 groups each. For one of the five items, either location or discrimination parameters were drawn from a normal distribution. In addition to this type of non-invariance, we varied the extent of non-invariance by manipulating the variation of these distributions. Results indicate that the RMSD statistic is better at detecting non-invariance related to between-group differences in item location than in item discrimination. The study's findings may be used as a starting point to sensitivity analysis aiming to define cut-off values for determining (non-) invariance. (DIPF/Orig.)
DIPF department:
Bildungsqualität und Evaluation
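The RMSD item-fit statistic discussed in the abstract above compares a group's observed item characteristic curve with the model-implied curve, weighted by the group's ability density. The sketch below is a dichotomous simplification of the polytomous (GPCM) case in the article; difficulties, sample size, and binning are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
theta = rng.normal(0, 1, n)                   # one group's ability distribution

b_calibrated = 0.0        # item difficulty from the concurrent calibration
b_true_here = 0.5         # the group's actual difficulty -> non-invariance

p_model = 1.0 / (1.0 + np.exp(-(theta - b_calibrated)))
x = rng.binomial(1, 1.0 / (1.0 + np.exp(-(theta - b_true_here))))

# RMSD: density-weighted root mean squared gap between the group's observed
# item characteristic curve (binned) and the model-implied curve.
bins = np.linspace(-3, 3, 13)
idx = np.digitize(theta, bins)
gaps, weights = [], []
for k in np.unique(idx):
    mask = idx == k
    gaps.append((x[mask].mean() - p_model[mask].mean()) ** 2)
    weights.append(mask.mean())
rmsd = np.sqrt(np.average(gaps, weights=weights))
print(round(rmsd, 2))     # clearly above zero for this non-invariant item
```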
Authors:
Seeber, Susan; Michaelis, Christian; Repp, Anton; Hartig, Johannes; Aichele, Christine; Schumann, Matthias; Anke, Jan Moritz; Dierkes, Stefan; Siepelmeyer, David
Title:
Assessment of competences in sustainability management. Analyses to the construct dimensionality
In:
Zeitschrift für Pädagogische Psychologie, 33 (2019) 2, pp. 148-158
DOI:
10.1024/1010-0652/a000240
URN:
urn:nbn:de:0111-pedocs-237802
URL:
https://nbn-resolving.org/urn:nbn:de:0111-pedocs-237802
Document type:
3a. Contributions to peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Sustainable development; Competence; Diagnostics; Measurement; Model; Student; Economics; Company; Simulation; Management; Curriculum; Measurement procedure; Diagnostic test; Test construction; Survey instrument; Factor analysis; Structural equation model
Abstract:
The paper discusses an examination of the dimensions of a competence model for sustainability management. A central assumption is that the dimensions of the competence model differ according to knowledge representation (i.e., declarative vs. schematic and strategic knowledge) and content area (i.e., business administration and sustainability from a societal perspective, as well as sustainability management). Study participants included 850 students from 16 universities in Germany, and the analyses were conducted on the basis of structural equation modeling. The results confirm the expectation that the types of knowledge addressed by different assessment formats and content requirements form two disjunct dimensions. The model analyses indicate a better fit for the multidimensional model, which distinguishes between declarative knowledge in business administration and sustainability from a societal perspective on the one hand, and sustainability management on the other. (DIPF/Orig.)
DIPF department:
Bildungsqualität und Evaluation
Authors:
Zehner, Fabian; Weis, Mirjam; Vogel, Freydis; Leutner, Detlev; Reiss, Kristina
Title:
Kollaboratives Problemlösen in PISA 2015. Deutschland im Fokus
In:
Zeitschrift für Erziehungswissenschaft, 22 (2019) 3, pp. 617-646
DOI:
10.1007/s11618-019-00874-4
URN:
urn:nbn:de:0111-pedocs-176046
URL:
http://nbn-resolving.org/urn:nbn:de:0111-pedocs-176046
Document type:
3a. Contributions to peer-reviewed journals; article (no special category)
Language:
German
Keywords:
Student achievement test; Questionnaire; PISA <Programme for International Student Assessment>; International comparison; Germany; OECD countries; Student; Problem solving; Cooperation; Competence; School year; School type; Computer-assisted method; Simulation; Technology-based testing; Measurement procedure; Quality; Psychometrics; Item response theory; Scaling
Abstract:
Focusing on Germany, this article presents results from the international comparison of fifteen-year-olds in collaborative problem solving and a cross validation of the scaling in the Programme for International Student Assessment (PISA) 2015. A new computer-based test was used requesting students to solve a problem jointly with simulated group members. Data from collaborative problem solving of fifteen-year-olds (n = 124,994) in 51 countries were assessed. The German mean competence level (525 points) is a quarter standard deviation above the OECD average (500 points) and a quarter standard deviation below the OECD's top performing country Japan (552 points). In all participating countries, girls outperform boys. While the percentage of top-performing students in Germany is comparable to proportions in the best-performing OECD countries, 21% of the students in Germany only reach competence level I or below, twice as many as in Japan. National results are presented as well as empirical evidence on the quality of the test, which is critically discussed. (DIPF/Orig.)
DIPF department:
Bildungsqualität und Evaluation