Deutsches Institut für Internationale Pädagogische Forschung (DIPF)

Publication database

Authors:
Hirschmann, Fabian; Nam, Jinseok; Fürnkranz, Johannes

Title:
What makes word-level neural machine translation hard: A case study on English-German translation

Source:
In: The COLING 2016 Organizing Committee (ed.): Proceedings of the 26th International Conference on Computational Linguistics (COLING). Osaka: The COLING 2016 Organizing Committee (2016), pp. 3199-3208

Full-text URL:
http://aclweb.org/anthology/C/C16/C16-1301.pdf

Language:
English

Document type:
4. Contributions to edited volumes; conference volume/conference paper/proceedings

Keywords:
Automation, Computational linguistics, Computer-assisted method, German, English, Syntax, Translation, Word, Dictionary


Abstract (English):
Traditional machine translation systems often require heavy feature engineering and the combination of multiple techniques for solving different subproblems. In recent years, several end-to-end learning architectures based on recurrent neural networks have been proposed. Unlike traditional systems, Neural Machine Translation (NMT) systems learn the parameters of the model and require only minimal preprocessing. Memory and time constraints allow only a fixed number of words to be taken into account, which leads to the out-of-vocabulary (OOV) problem. In this work, we analyze why the OOV problem arises and why it is considered a serious problem for German. We study the effectiveness of compound word splitters for alleviating the OOV problem, resulting in an improvement of 2.5+ BLEU points over a baseline on the WMT'14 German-to-English translation task. For English-to-German translation, we apply target-side compound splitting during training, using a special syntax that allows the model to merge compound words, and gain 0.2 BLEU points. (DIPF/Orig.)
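
The idea of target-side compound splitting with a merge syntax can be illustrated with a minimal Python sketch. It is not the authors' implementation: the marker token `@@`, the greedy dictionary-based splitter, the `min_len` threshold, and all function names are assumptions made for illustration, and German linking elements (such as the Fugen-s) are ignored.

```python
MERGE = "@@"  # hypothetical merge marker; the paper's actual syntax is not given here


def split_compound(word, vocab, min_len=3):
    """Greedily split `word` into in-vocabulary parts; return [word] if impossible."""
    if word in vocab or len(word) < 2 * min_len:
        return [word]
    # Try the longest possible head first, leaving at least min_len characters.
    for i in range(len(word) - min_len, min_len - 1, -1):
        head = word[:i]
        if head in vocab:
            rest = split_compound(word[i:], vocab, min_len)
            if all(part in vocab for part in rest):
                return [head] + rest
    return [word]


def encode(tokens, vocab):
    """Replace each splittable compound by its parts, separated by the marker."""
    out = []
    for tok in tokens:
        for j, part in enumerate(split_compound(tok, vocab)):
            if j > 0:
                out.append(MERGE)
            out.append(part)
    return out


def decode(tokens):
    """Merge parts that are joined by the marker back into single words."""
    out = []
    glue = False
    for tok in tokens:
        if tok == MERGE:
            glue = True
        elif glue and out:
            out[-1] += tok
            glue = False
        else:
            out.append(tok)
            glue = False
    return out


# Toy vocabulary and sentence (lowercased for simplicity).
vocab = {"der", "apfel", "baum", "blüht"}
sentence = ["der", "apfelbaum", "blüht"]
encoded = encode(sentence, vocab)
print(encoded)
print(decode(encoded))
```

In this sketch the out-of-vocabulary word "apfelbaum" is represented by in-vocabulary parts plus a marker during training, and the marker lets the output be merged back into one word after decoding.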


DIPF department:
Informationszentrum Bildung

Notes:

Last modified: 11.11.2016