Remus, Steffen; Biemann, Chris:

Three knowledge-free methods for automatic lexical chain extraction

In: Association for Computational Linguistics (Ed.): Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics. Atlanta, Georgia: Association for Computational Linguistics (2013), 989-999

4. Contributions to edited volumes; conference volume/conference paper/proceedings

Computational linguistics, Cohesion, Semantics, Statistical method, Structure, Text

We present three approaches to lexical chaining based on the LDA topic model and evaluate them intrinsically on a manually annotated set of German documents. After motivating the choice of statistical methods for lexical chaining by their adaptability to different languages and subject domains, we describe our new two-level chain annotation scheme, which is rooted in the concept of cohesive harmony. We also propose a new measure for the direct evaluation of lexical chains. Our three LDA-based approaches outperform two knowledge-based state-of-the-art methods for lexical chaining by a large margin, which can be attributed to the limited coverage of the knowledge resource. Subsequent analysis shows that the three methods yield different chaining behavior, which could be exploited in tasks that use lexical chaining as a component within NLP applications. (DIPF/Org.)

