Deutsches Institut für Internationale Pädagogische Forschung

Authors:
Reimers, Nils; Gurevych, Iryna

Title:
Reporting score distributions makes a difference: Performance study of LSTM-networks for sequence tagging

Source:
In: Association for Computational Linguistics (ed.): The Conference on Empirical Methods in Natural Language Processing (EMNLP 2017). Stroudsburg, PA: Association for Computational Linguistics (2017), pp. 338-348

Full-text URL:
http://www.aclweb.org/anthology/D/D17/D17-1035.pdf

Language:
English

Document type:
4. Contributions to collected works; conference volume/conference paper/proceedings

Keywords:
Computational linguistics, speech recognition


Abstract (original):
In this paper we show that reporting a single performance score is insufficient to compare non-deterministic approaches. We demonstrate for common sequence tagging tasks that the seed value for the random number generator can result in statistically significant (p < 10^-4) differences for state-of-the-art systems. For two recent systems for NER, we observe an absolute difference of one percentage point F1-score depending on the selected seed value, making these systems perceived either as state-of-the-art or mediocre. Instead of publishing and reporting single performance scores, we propose to compare score distributions based on multiple executions. Based on the evaluation of 50,000 LSTM-networks for five sequence tagging tasks, we present network architectures that produce both superior performance as well as are more stable with respect to the remaining hyperparameters. The full experimental results are published in (Reimers and Gurevych, 2017). The implementation of our network is publicly available. (DIPF/Orig.)
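
Illustration (not part of the original abstract): the proposal to compare score distributions from multiple executions, rather than single scores, can be sketched as follows. This is a minimal sketch under stated assumptions: the F1 values are invented for illustration, the function names summarize and permutation_test are hypothetical, and the permutation test on the difference of means is one possible choice of significance test that the abstract itself does not name.

```python
import random
import statistics


def summarize(scores):
    """Return mean, sample standard deviation, min and max of a list of scores."""
    return {
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),
        "min": min(scores),
        "max": max(scores),
    }


def permutation_test(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the absolute difference of mean scores.

    Returns an estimated p-value: the share of random relabelings whose
    mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(statistics.mean(scores_a) - statistics.mean(scores_b))
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical F1 scores (in percent) from runs that differ only in the seed.
system_a = [90.1, 90.6, 90.3, 90.9, 90.4, 90.7, 90.2, 90.8]
system_b = [90.3, 90.5, 90.0, 90.6, 90.2, 90.4, 90.1, 90.7]

summary_a = summarize(system_a)
summary_b = summarize(system_b)
p_value = permutation_test(system_a, system_b)
print(summary_a)
print(summary_b)
print(p_value)
```

With these invented numbers, the best run of system_b (90.7) scores higher than the worst run of system_a (90.1), although system_a has the higher mean. A comparison based on one run per system could therefore point either way, which is the situation the abstract warns about.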


DIPF department:
Informationszentrum Bildung

Notes:

Last modified: 11 November 2016