Learning shortest paths for word graphs
In: Atzmueller, Martin; Scholz, Christoph (Hrsg.): The Fourth International Workshop on Mining Ubiquitous and Social Environments
URL des Volltextes:
4. Beiträge in Sammelwerken; Tagungsband/Konferenzbeitrag/Proceedings
The vast amount of information on the Web drives the need for aggregation and summarisation techniques. We study event extrac- tion as a text summarisation task using redundant sentences which is also known as sentence compression. Given a set of sentences describing the same event, we aim at generating a summarisation that is (i) a single sentence, (ii) simply structured and easily understandable, and (iii) minimal in terms of the number of words/tokens. Existing approaches for sentence compression are often based on finding the shortest path in word graphs that is spanned by related input sentences. These approaches, however, deploy manually crafted heuristics for edge weights and lack theoretical justification. In this paper, we cast sentence compression as a structured prediction problem. Edges of the compression graph are represented by features drawn from adjacent nodes so that corresponding weights are learned by a generalised linear model. Decoding is performed in polynomial time by a generalised shortest path algorithm using loss augmented inference. We report on preliminary results on artificial and real world data.