Authors: Zesch, Torsten
Title: Measuring contextual fitness using error contexts extracted from the Wikipedia revision history
In: Association for Computational Linguistics (Ed.): Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012), Avignon: Association for Computational Linguistics, 2012, pp. 529-538
URL: http://aclweb.org/anthology-new/E/E12/E12-1054.pdf
Document type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Computational linguistics; Evaluation; Error; Measurement; Semantics; Statistical method; Text analysis; Method
Abstract (English): We evaluate measures of contextual fitness on the task of detecting real-word spelling errors. For that purpose, we extract naturally occurring errors and their contexts from the Wikipedia revision history. We show that such natural errors are better suited for evaluation than the previously used artificially created errors. In particular, the precision of statistical methods has been largely over-estimated, while the precision of knowledge-based approaches has been under-estimated. Additionally, we show that knowledge-based approaches can be improved by using semantic relatedness measures that make use of knowledge beyond classical taxonomic relations. Finally, we show that statistical and knowledge-based methods can be combined for increased performance.
DIPF department: Informationszentrum Bildung (Information Center for Education)