DIPF database of publications

Authors:
Erbs, Nicolai; Gurevych, Iryna; Zesch, Torsten

Title:
Hierarchy identification for automatically generating table-of-contents

Source:
In: Galia Angelova, Kalina Bontcheva, Ruslan Mitkov (eds.): Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP) 2013. Shoumen, Bulgaria: RANLP (2013), 252-260

URL of full text:
http://lml.bas.bg/ranlp2013/docs/RANLP_main.pdf

Language:
English

Document type:
4. Contributions to edited volumes; conference volume/conference paper/proceedings

Keywords:
Algorithm, Analysis, Content, Content analysis, Structure, Text


Abstract (English):
A table-of-contents (TOC) provides a quick reference to a document's content and structure. We present the first study on identifying the hierarchical structure for automatically generating a TOC using only textual features instead of structural hints, e.g., from HTML tags. We create two new datasets to evaluate our approaches for hierarchy identification. We find that our algorithm performs at a level that is sufficient for a fully automated system. For documents without given segment titles, we extend our work by automatically generating segment titles. We make the datasets and our experimental framework publicly available in order to foster future research in TOC generation.

