Search results from the DIPF publication database
Your query:
(Keywords: "Technologiebasiertes Testen")
70 items found
Disengaged response behavior when the response button is blocked. Evaluation of a micro-intervention
Persic-Beck, Lothar; Goldhammer, Frank; Kroehne, Ulf
Journal article
| In: Frontiers in Psychology. Section Quantitative Psychology and Measurement | 2022
Authors:
Persic-Beck, Lothar; Goldhammer, Frank; Kroehne, Ulf
Title:
Disengaged response behavior when the response button is blocked. Evaluation of a micro-intervention
In:
Frontiers in Psychology. Section Quantitative Psychology and Measurement, 13 (2022), Article 954532
DOI:
10.3389/fpsyg.2022.954532
URL:
https://www.frontiersin.org/articles/10.3389/fpsyg.2022.954532/full
Document type:
3a. Articles in peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Antwort; Datenanalyse; Dauer; Effektivität; Einflussfaktor; Erwachsener; Evaluation; Frage; Intervention; Kompetenz; Leistungstest; Logdatei; Messung; Motivation; Technologiebasiertes Testen; Testkonstruktion; Validität; Verhalten; Verhaltensänderung
Abstract (English):
In large-scale assessments, disengaged participants might rapidly guess on items or skip items, which can affect the score interpretation's validity. This study analyzes data from a linear computer-based assessment to evaluate a micro-intervention that blocked the possibility to respond for 2 s. The blocked response was implemented to prevent participants from accidental navigation and as a naive attempt to prevent rapid guesses and rapid omissions. The response process was analyzed by interpreting log event sequences within a finite-state machine approach. Responses were assigned to different response classes based on the event sequence. Additionally, post hoc methods for detecting rapid responses based on response time thresholds were applied to validate the classification. Rapid guesses and rapid omissions could be distinguished from accidental clicks by the log events following the micro-intervention. Results showed that the blocked response interfered with rapid responses but hardly led to behavioral changes. However, the blocked response could improve the post hoc detection of rapid responding by identifying responses that narrowly exceed time-bound thresholds. In an assessment context, it is desirable to prevent participants from accidentally skipping items, which in itself may lead to an increasing popularity of initially blocking responses. If, however, data from those assessments is analyzed for rapid responses, additional log data information should be considered.
DIPF department:
Lehr- und Lernqualität in Bildungseinrichtungen
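The abstract above describes classifying responses by interpreting log event sequences within a finite-state machine. As a rough illustration of that idea (not the authors' implementation), the sketch below walks the timestamped events of one item and assigns a response class; the event name, the response classes, and the 3 s rapid-response threshold are invented for the example, while the 2 s blocking window matches the study design.

```python
# Illustrative sketch only: classify one item's response from its log events.
# "click_next", the response classes, and RAPID_THRESHOLD are assumptions;
# the 2 s blocking window follows the study described above.

BLOCK_WINDOW = 2.0      # response button blocked for the first 2 s
RAPID_THRESHOLD = 3.0   # hypothetical post hoc response time threshold

def classify_response(events):
    """events: list of (seconds_since_item_onset, event_name) tuples."""
    clicked_while_blocked = False
    for t, name in events:
        if name != "click_next":
            continue  # other events (scrolling etc.) ignored in this sketch
        if t < BLOCK_WINDOW:
            clicked_while_blocked = True   # micro-intervention triggered
        elif t < RAPID_THRESHOLD:
            return "rapid_response"        # immediate retry after the block
        else:
            return ("recovered_after_block" if clicked_while_blocked
                    else "engaged_response")
    # no successful click: the item was skipped
    return "omission_after_blocked_click" if clicked_while_blocked else "omission"

# A click at 0.4 s hits the blocked button; the retry at 2.1 s marks the
# response as rapid rather than as a recovered accidental click.
print(classify_response([(0.4, "click_next"), (2.1, "click_next")]))
```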
On the speed sensitivity parameter in the lognormal model for response times. Implications for test assembly
Becker, Benjamin; Debeer, Dries; Weirich, Sebastian; Goldhammer, Frank
Journal article
| In: Applied Psychological Measurement | 2021
Authors:
Becker, Benjamin; Debeer, Dries; Weirich, Sebastian; Goldhammer, Frank
Title:
On the speed sensitivity parameter in the lognormal model for response times. Implications for test assembly
In:
Applied Psychological Measurement, 45 (2021) 6, pp. 407-422
DOI:
10.1177/01466216211008530
URL:
https://journals.sagepub.com/doi/abs/10.1177/01466216211008530
Document type:
3a. Articles in peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Software; Technologiebasiertes Testen; Messverfahren; Item-Response-Theory; Leistungstest; Frage; Antwort; Dauer; Einflussfaktor; Testkonstruktion; Modell; Vergleich; Testtheorie; Simulation
Abstract (English):
In high-stakes testing, often multiple test forms are used and a common time limit is enforced. Test fairness requires that ability estimates must not depend on the administration of a specific test form. Such a requirement may be violated if speededness differs between test forms. The impact of not taking speed sensitivity into account on the comparability of test forms regarding speededness and ability estimation was investigated. The lognormal measurement model for response times by van der Linden was compared with its extension by Klein Entink, van der Linden, and Fox, which includes a speed sensitivity parameter. An empirical data example was used to show that the extended model can fit the data better than the model without speed sensitivity parameters. A simulation was conducted, which showed that test forms with different average speed sensitivity yielded substantial different ability estimates for slow test takers, especially for test takers with high ability. Therefore, the use of the extended lognormal model for response times is recommended for the calibration of item pools in high-stakes testing situations. Limitations to the proposed approach and further research questions are discussed. (DIPF/Orig.)
DIPF department:
Lehr- und Lernqualität in Bildungseinrichtungen
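In common notation (which may differ from the article's), the two response time models compared above can be written down directly; $T_{ij}$ is person $i$'s response time on item $j$, $\beta_j$ the item's time intensity, $\tau_i$ the person's speed, and $\alpha_j$ the time discrimination:

```latex
% van der Linden's lognormal response time model:
\ln T_{ij} = \beta_j - \tau_i + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim \mathcal{N}\!\left(0,\; \alpha_j^{-2}\right)

% Extension by Klein Entink, van der Linden, and Fox: the speed
% sensitivity parameter \phi_j scales how strongly the expected
% log response time of item j reacts to a change in speed \tau_i:
\ln T_{ij} = \beta_j - \phi_j \tau_i + \varepsilon_{ij}
```

With all $\phi_j = 1$ the extension reduces to the base model; if test forms differ in their items' average $\phi_j$, the same increase in speed shortens expected response times by different amounts, which is the comparability problem the abstract investigates.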
Simultaneous constrained adaptive item selection for group-based testing
Bengs, Daniel; Kröhne, Ulf; Brefeld, Ulf
Journal article
| In: Journal of Educational Measurement | 2021
Authors:
Bengs, Daniel; Kröhne, Ulf; Brefeld, Ulf
Title:
Simultaneous constrained adaptive item selection for group-based testing
In:
Journal of Educational Measurement, 58 (2021) 2, pp. 236-261
DOI:
10.1111/jedm.12285
URL:
https://onlinelibrary.wiley.com/doi/abs/10.1111/jedm.12285
Document type:
3a. Articles in peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Adaptives Testen; Aufgabe; Auswahl; Computerunterstütztes Verfahren; Empirische Untersuchung; Gruppe; Leistungsmessung; Modell; Simulation; Technologiebasiertes Testen; Test
Abstract (English):
By tailoring test forms to the test‐taker's proficiency, Computerized Adaptive Testing (CAT) enables substantial increases in testing efficiency over fixed forms testing. When used for formative assessment, the alignment of task difficulty with proficiency increases the chance that teachers can derive useful feedback from assessment data. The application of CAT to formative assessment in the classroom, however, is hindered by the large number of different items used for the whole class; the required familiarization with a large number of test items puts a significant burden on teachers. An improved CAT procedure for group‐based testing is presented, which uses simultaneous automated test assembly to impose a limit on the number of items used per group. The proposed linear model for simultaneous adaptive item selection allows for full adaptivity and the accommodation of constraints on test content. The effectiveness of the group‐based CAT is demonstrated with real‐world items in a simulated adaptive test of 3,000 groups of test‐takers, under different assumptions on group composition. Results show that the group‐based CAT maintained the efficiency of CAT, while a reduction in the number of used items by one half to two‐thirds was achieved, depending on the within‐group variance of proficiencies.
DIPF department:
Bildungsqualität und Evaluation
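The "linear model for simultaneous adaptive item selection" mentioned above is, at its core, a mixed-integer program solved per adaptive step. The authors' full formulation (including content constraints) is not reproduced here; a simplified sketch under invented notation: $x_{pj} = 1$ if person $p$ receives item $j$ next, $z_j = 1$ if item $j$ is used by anyone in the group, $I_j(\hat{\theta}_p)$ is the item information at the current proficiency estimate, and $K$ caps the distinct items per group.

```latex
% Simplified sketch, not the authors' exact formulation:
\begin{align*}
\max_{x,\,z}\ & \sum_{p=1}^{P}\sum_{j=1}^{J} I_j(\hat{\theta}_p)\, x_{pj} \\
\text{s.t.}\ & \sum_{j=1}^{J} x_{pj} = 1 \quad \forall p
  & \text{(one next item per person)} \\
 & x_{pj} \le z_j \quad \forall p,\, j
  & \text{(administered items count as used)} \\
 & \sum_{j=1}^{J} z_j \le K
  & \text{(at most $K$ distinct items per group)} \\
 & x_{pj},\, z_j \in \{0,1\}
\end{align*}
```

Because items are selected for all group members simultaneously, the solver can steer different persons toward shared items whenever the information loss is small, which is how the reported one-half to two-thirds reduction in used items becomes possible.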
Advancements in technology-based assessment. Emerging item formats, test designs, and data sources
Goldhammer, Frank; Scherer, Ronny; Greiff, Samuel (Eds.)
Edited volume
| Lausanne: Frontiers Media | 2020
Editors:
Goldhammer, Frank; Scherer, Ronny; Greiff, Samuel
Title:
Advancements in technology-based assessment. Emerging item formats, test designs, and data sources
Publication:
Lausanne: Frontiers Media, 2020 (Frontiers in Psychology. Special issue)
DOI:
10.3389/fpsyg.2019.03047
URL:
https://www.frontiersin.org/research-topics/7841/advancements-in-technology-based-assessment-emerging-item-formats-test-designs-and-data-sources
Document type:
2. Editorship; journal special issue
Language:
English
Keywords:
Technologiebasiertes Testen; Item; Test; Design; Auswertung; Automatisierung; Prozessdatenverarbeitung; Lernen; Bewertung
Abstract (English):
Technology has become an indispensable tool for educational and psychological assessment in today's world. Researchers and large-scale assessment programs alike are increasingly using digital technology (e.g., laptops, tablets, and smartphones) to collect behavioral data beyond the mere idea of responses as correct. Along these lines, technology innovates and enhances assessments in terms of item and test design, methods of test delivery, data collection and analysis, as well as the reporting of test results. The aim of this Research Topic is to present recent advancements in technology-based assessment. Our focus is on cognitive assessments, including the measurement of abilities, competencies, knowledge, and skills but may also include non-cognitive aspects of the assessment. In the area of (cognitive) assessments the innovations driven by technology are manifold: Digital assessments facilitate the creation of new types of stimuli and response formats that were out of reach for assessments using paper; for instance, interactive simulations including multimedia elements, as well as virtual or augmented realities which serve as the task environment. Moreover, technology allows the automated generation of items based on specific item models. Such items can be assembled into tests in a more flexible way than that offered by paper-and-pencil tests and could even be created on the fly; for instance, tailoring item difficulty to individual ability (adaptive testing), while assuring that multiple content constraints are met. As a requirement for adaptive testing or to lower the burden of raters coding item responses manually, computers enable the automatic scoring of constructed responses; for instance, text responses can be scored automatically by using natural language processing and text mining. Technology-based assessments provide not only response data (e.g., correct vs. incorrect responses) but also process data (e.g., frequencies and sequences of test-taking strategies, including navigation behavior) which reflects the course of solving a test item. Process data has been used successfully, among others, to evaluate the data quality, to define process-oriented constructs, to improve measurement precision, and to address substantial research questions. We expect the contributions of this Research Topic to build on this research by considering how technology can further improve, and enhance, educational and psychological assessment. Regarding educational testing, both research papers on the assessment of learning (e.g., summative assessment of learning outcomes) and on the assessment for learning (e.g., formative assessment to support the learning process) are welcome. We expect submissions of empirical papers that present and evaluate innovative technology-based assessment approaches, as well as new applications or illustrations of already existing approaches. We are also interested in papers addressing the validity of test scores and other indicators obtained from innovative assessment procedures.
DIPF department:
Bildungsqualität und Evaluation
Evaluation of online information in university students. Development and scaling of the screening instrument EVON
Hahnel, Carolin; Eichmann, Beate; Goldhammer, Frank
Journal article
| In: Frontiers in Psychology | 2020
Authors:
Hahnel, Carolin; Eichmann, Beate; Goldhammer, Frank
Title:
Evaluation of online information in university students. Development and scaling of the screening instrument EVON
In:
Frontiers in Psychology, 11 (2020), Article 562128
DOI:
10.3389/fpsyg.2020.562128
URN:
urn:nbn:de:0111-pedocs-232241
URL:
https://nbn-resolving.org/urn:nbn:de:0111-pedocs-232241
Document type:
3a. Articles in peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Deutschland; Internet; Informationskompetenz; Ressource; Glaubwürdigkeit; Relevanz; Bewertung; Test; Testentwicklung; Itemanalyse; Suchmaschine; Simulation; Technologiebasiertes Testen; Interview; Erhebungsinstrument; Evaluation; Student; Rasch-Modell; Empirische Untersuchung
Abstract:
As Internet sources provide information of varying quality, it is an indispensable prerequisite skill to evaluate the relevance and credibility of online information. Based on the assumption that competent individuals can use different properties of information to assess its relevance and credibility, we developed the EVON (evaluation of online information), an interactive computer-based test for university students. The developed instrument consists of eight items that assess the skill to evaluate online information in six languages. Within a simulated search engine environment, students are requested to select the most relevant and credible link for a respective task. To evaluate the developed instrument, we conducted two studies: (1) a pre-study for quality assurance and observing the response process (cognitive interviews of n = 8 students) and (2) a main study aimed at investigating the psychometric properties of the EVON and its relation to other variables (n = 152 students). The results of the pre-study provided first evidence for a theoretically sound test construction with regard to students' item processing behavior. The results of the main study showed acceptable psychometric outcomes for a standardized screening instrument with a small number of items. The item design criteria affected the item difficulty as intended, and students' choice to visit a website had an impact on their task success. Furthermore, the probability of task success was positively predicted by general cognitive performance and reading skill. Although the results uncovered a few weaknesses (e.g., a lack of difficult items), and the efforts of validating the interpretation of EVON outcomes still need to be continued, the overall results speak in favor of a successful test construction and provide first indication that the EVON assesses students' skill in evaluating online information in search engine environments. (DIPF/Orig.)
DIPF department:
Bildungsqualität und Evaluation
Rapid guessing rates across administration mode and test setting
Kröhne, Ulf; Deribo, Tobias; Goldhammer, Frank
Journal article
| In: Psychological Test and Assessment Modeling | 2020
Authors:
Kröhne, Ulf; Deribo, Tobias; Goldhammer, Frank
Title:
Rapid guessing rates across administration mode and test setting
In:
Psychological Test and Assessment Modeling, 62 (2020) 2, pp. 144-177
DOI:
10.25656/01:23630
URN:
urn:nbn:de:0111-pedocs-236307
URL:
https://nbn-resolving.org/urn:nbn:de:0111-pedocs-236307
Document type:
3a. Articles in peer-reviewed journals; contribution to a special issue
Language:
English
Keywords:
Test; Bewertung; Innovation; Validität; Technologiebasiertes Testen; Design; Testkonstruktion; Testverfahren; Wirkung; Verhalten; Logdatei; Experiment; Student; Vergleichsuntersuchung
Abstract (English):
Rapid guessing can threaten measurement invariance and the validity of large-scale assessments, which are often conducted under low-stakes conditions. Comparing measures collected under different administration modes or in different test settings necessitates that rapid guessing rates also be comparable. Response time thresholds can be used to identify rapid guessing behavior. Using data from an experiment embedded in an assessment of university students as part of the National Educational Panel Study (NEPS), we show that rapid guessing rates can differ across modes. Specifically, rapid guessing rates are found to be higher for un-proctored individual online assessment. It is also shown that rapid guessing rates differ across different groups of students and are related to properties of the test design. No relationship between dropout behavior and rapid guessing rates was found. (DIPF/Orig.)
DIPF department:
Bildungsqualität und Evaluation
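To make the threshold-based identification described above concrete: given an item-level response time threshold, the rapid guessing rate is simply the share of responses faster than the threshold, computed separately per administration mode. The sketch below is a hypothetical illustration with invented data and thresholds, not the NEPS analysis itself.

```python
# Hypothetical illustration: rapid guessing rate per administration mode,
# given item-level response time thresholds (all numbers invented).

def rapid_guessing_rate(response_times, thresholds):
    """response_times: item_id -> list of response times in seconds;
    thresholds: item_id -> time below which a response counts as a rapid guess."""
    flagged = total = 0
    for item, times in response_times.items():
        flagged += sum(t < thresholds[item] for t in times)
        total += len(times)
    return flagged / total if total else float("nan")

thresholds  = {"item1": 2.0, "item2": 1.5}   # e.g., from response time distributions
proctored   = {"item1": [12.3, 4.1, 0.9], "item2": [8.8, 1.2]}
unproctored = {"item1": [0.7, 1.1, 9.4],  "item2": [0.8, 7.5]}

for mode, data in [("proctored", proctored), ("unproctored", unproctored)]:
    print(mode, round(rapid_guessing_rate(data, thresholds), 2))
# -> proctored 0.4, unproctored 0.6: the pattern the study reports, here
#    built into the fake data on purpose.
```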
Reanalysis of the German PISA data. A comparison of different approaches for trend estimation with a particular emphasis on mode effects
Robitzsch, Alexander; Lüdtke, Oliver; Goldhammer, Frank; Kröhne, Ulf; Köller, Olaf
Journal article
| In: Frontiers in Psychology | 2020
Authors:
Robitzsch, Alexander; Lüdtke, Oliver; Goldhammer, Frank; Kröhne, Ulf; Köller, Olaf
Title:
Reanalysis of the German PISA data. A comparison of different approaches for trend estimation with a particular emphasis on mode effects
In:
Frontiers in Psychology, 11 (2020), Article 884
DOI:
10.3389/fpsyg.2020.00884
URN:
urn:nbn:de:0111-pedocs-232269
URL:
https://nbn-resolving.org/urn:nbn:de:0111-pedocs-232269
Document type:
3a. Articles in peer-reviewed journals; article (no special category)
Language:
English
Keywords:
PISA <Programme for International Student Assessment>; Test; Verfahren; Skalierung; Methode; Technologiebasiertes Testen; Veränderung; Entwicklung; Wirkungsforschung; Deutschland
Abstract:
International large-scale assessments, such as the Program for International Student Assessment (PISA), are conducted to provide information on the effectiveness of education systems. In PISA, the target population of 15-year-old students is assessed every 3 years. Trends show whether competencies have changed in the countries between PISA cycles. In order to provide valid trend estimates, it is desirable to retain the same test conditions and statistical methods in all PISA cycles. In PISA 2015, however, the test mode changed from paper-based to computer-based tests, and the scaling method was changed. In this paper, we investigate the effects of these changes on trend estimation in PISA using German data from all PISA cycles (2000-2015). Our findings suggest that the change from paper-based to computer-based tests could have a severe impact on trend estimation but that the change of the scaling model did not substantially change the trend estimates.
DIPF department:
Bildungsqualität und Evaluation
ReCo: Textantworten automatisch auswerten. Methodenworkshop
Zehner, Fabian; Andersen, Nico
Journal article
| In: Zeitschrift für Soziologie der Erziehung und Sozialisation | 2020
Authors:
Zehner, Fabian; Andersen, Nico
Title:
ReCo: Textantworten automatisch auswerten. Methodenworkshop
In:
Zeitschrift für Soziologie der Erziehung und Sozialisation, 40 (2020) 3, pp. 334-340
DOI:
10.25656/01:22115
URN:
urn:nbn:de:0111-pedocs-221153
URL:
https://nbn-resolving.org/urn:nbn:de:0111-pedocs-221153
Document type:
3b. Articles in other journals; practice-oriented
Language:
German
Keywords:
Software; Technologiebasiertes Testen; Antwort; Text; Testauswertung; Automatisierung; Datenanalyse; Konzeption; Methodik
Abstract:
This article publishes, for the first time, the prototype of a freely available R- and Java-based software that has been evaluated for use with German text responses and is currently being developed further for additional languages: ReCo (Automatic Text Response Coder; Zehner, Sälzer & Goldhammer, 2016). ReCo specializes in short text responses and targets their semantics, which is why this is also referred to as content scoring. The software presented here includes a demo data set; it is important to note up front that this data set and the example responses cited here show only very limited linguistic variety. The reason is that the data set is based on empirical data and, because of their confidentiality, was extensively edited by hand, which would not have been possible with linguistically more complex items. The ReCo methodology itself, however, also works for more complex responses [...]. This article briefly outlines the ReCo methodology and introduces, for the first time, the Shiny app that makes automatic coding flexibly applicable to one's own data. To that end, it sketches how the currently available prototype is installed and applied to a demo data set. Finally, the article gives an outlook on the functionality the app will offer after leaving the current prototype phase and over the course of long-term development. Current developments can be followed on the ReCo website: www.reco.science (DIPF/Orig.)
DIPF department:
Bildungsqualität und Evaluation
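ReCo itself is the R- and Java-based tool described above, with its own pipeline and Shiny app; the snippet below is therefore not ReCo's API but a generic, hypothetical illustration of the underlying idea of content scoring: human-coded example answers train a classifier over bag-of-words features, which then assigns codes to new short text responses.

```python
# Generic content-scoring illustration (NOT ReCo's API or pipeline):
# learn codes from human-coded short answers, then code new answers.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

coded_answers = [  # invented data with deliberately low linguistic variety
    ("die erde kreist um die sonne", 1),
    ("weil sich die erde um die sonne dreht", 1),
    ("weiss ich nicht", 0),
    ("keine ahnung", 0),
]
texts, codes = zip(*coded_answers)

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)            # sparse term-weight matrix
model = LogisticRegression().fit(X, codes)     # code ~ text features

new_answers = ["die erde bewegt sich um die sonne"]
print(model.predict(vectorizer.transform(new_answers)))  # predicted code(s)
```

Real short-answer coding has to handle far richer variety (spelling, paraphrase, word order) than a bag-of-words toy can, which is exactly what specialized tools such as ReCo are built for.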
Evaluating educational standards using assessment "with" and "through" technology
Frenken, Lena; Libbrecht, Paul; Greefrath, Gilbert; Schiffner, Daniel; Schnitzler, Carola
Book chapter
| In: Donevska-Todorova, Ana; Faggiano, Eleonora; Trgalova, Jana; Lavicza, Zsolt; Weinhandl, Robert; Clark-Wilson, Alison; Weigand, Hans-Georg (Eds.): Proceedings of the Tenth ERME Topic Conference (ETC 10) on Mathematics Education in the Digital Age (MEDA), 16-18 September 2020 in Linz, Austria | Paris: Centre pour la communication scientifique directe | 2020
Authors:
Frenken, Lena; Libbrecht, Paul; Greefrath, Gilbert; Schiffner, Daniel; Schnitzler, Carola
Title:
Evaluating educational standards using assessment "with" and "through" technology
In:
Donevska-Todorova, Ana; Faggiano, Eleonora; Trgalova, Jana; Lavicza, Zsolt; Weinhandl, Robert; Clark-Wilson, Alison; Weigand, Hans-Georg (Eds.): Proceedings of the Tenth ERME Topic Conference (ETC 10) on Mathematics Education in the Digital Age (MEDA), 16-18 September 2020 in Linz, Austria, Paris: Centre pour la communication scientifique directe, 2020, pp. 361-368
URL:
https://hal.archives-ouvertes.fr/hal-02932218/document#page=374
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Schüler; Leistungsbeurteilung; Vergleichsarbeit; Bildungsstandards; Mathematik; Technologiebasiertes Testen; Umsetzung; Deutschland
Abstract:
This paper reports on a feasibility study of creating a standardised assessment instrument to evaluate students' competencies found in the German national standards. The study aimed at combining widespread tools in math-classes, such as dynamic geometry and spreadsheets, in an integrated and computer-driven way. We report on the mathematical and technical feasibility: What limits were reached, and which opportunities have appeared? The report provides indications that a development process is feasible but that an attention to the task description is required, as the student may be unaware of the manipulations to perform tasks. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
Analysing log file data from PIAAC
Goldhammer, Frank; Hahnel, Carolin; Kroehne, Ulf
Book chapter
| In: Maehler, Débora B.; Rammstedt, Beatrice (Eds.): Large-scale cognitive assessment: Analyzing PIAAC data | Cham: Springer | 2020
Authors:
Goldhammer, Frank; Hahnel, Carolin; Kroehne, Ulf
Title:
Analysing log file data from PIAAC
In:
Maehler, Débora B.; Rammstedt, Beatrice (Eds.): Large-scale cognitive assessment: Analyzing PIAAC data, Cham: Springer, 2020 (Methodology of Educational Measurement and Assessment), pp. 239-269
DOI:
10.1007/978-3-030-47515-4_10
URL:
https://link.springer.com/chapter/10.1007/978-3-030-47515-4_10
Document type:
4. Contributions to edited collections; edited volume (no special category)
Language:
English
Keywords:
PIAAC (Programme for the International Assessment of Adult Competencies); Technologiebasiertes Testen; Computerunterstütztes Verfahren; Logdatei; Datenanalyse; Software; Tools; Nutzung; Forschung; Zugang; Dokumentation
Abstract:
The OECD Programme for the International Assessment of Adult Competencies (PIAAC) was the first computer-based large-scale assessment to provide anonymised log file data from the cognitive assessment together with extensive online documentation and a data analysis support tool. The goal of the chapter is to familiarise researchers with how to access, understand, and analyse PIAAC log file data for their research purposes. After providing some conceptual background on the multiple uses of log file data and how to infer states of information processing from log file data, previous research using PIAAC log file data is reviewed. Then, the accessibility, structure, and documentation of the PIAAC log file data are described in detail, as well as how to use the PIAAC LogDataAnalyzer to extract predefined process indicators and how to create new process indicators based on the raw log data export.
DIPF department:
Bildungsqualität und Evaluation
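As a concrete (and entirely hypothetical) counterpart to the chapter's notion of deriving process indicators from raw log data, the sketch below computes two simple indicators, time on task and an interaction count, from a timestamped event list for one item. The event names and tuple format are invented; actual PIAAC log files come in their own documented format, and the PIAAC LogDataAnalyzer already provides predefined indicators.

```python
# Hypothetical log format: (milliseconds since unit start, event type).
from collections import Counter

raw_events = [
    (0,     "ITEM_START"),
    (4200,  "CLICK"),
    (9100,  "TEXT_INPUT"),
    (15300, "CLICK"),
    (21000, "ITEM_END"),
]

start = next(t for t, e in raw_events if e == "ITEM_START")
end   = next(t for t, e in raw_events if e == "ITEM_END")
counts = Counter(e for _, e in raw_events)

time_on_task_s = (end - start) / 1000.0                  # indicator 1: time on task
n_interactions = counts["CLICK"] + counts["TEXT_INPUT"]  # indicator 2: interactions
print(f"time on task: {time_on_task_s:.1f} s, interactions: {n_interactions}")
```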
Page 1 of 7