Deutsches Institut für Internationale Pädagogische Forschung

Authors:
Szarvas, György; Vincze, Veronika; Farkas, Richárd; Móra, György; Gurevych, Iryna

Title:
Cross-genre and cross-domain detection of semantic uncertainty

Source:
In: Computational Linguistics Journal, 38 (2012) 2, 335-367

Full-text URL:
http://www.mitpressjournals.org/doi/pdf/10.1162/COLI_a_00098

Language:
English

Document type:
3a. Articles in peer-reviewed journals; article in special issue

Keywords:
Computational linguistics, Computer-assisted method, Information, Information retrieval, Classification, Model, Natural language system, Semantics, Language analysis, Text analysis, Scientific discipline


Abstract (English):
Uncertainty is an important linguistic phenomenon that is relevant in various Natural Language Processing applications, in diverse genres from medical to community-generated, newswire or scientific discourse, and in domains from science to humanities. The semantic uncertainty of a proposition can be identified in most cases by using a finite dictionary (i.e., lexical cues), and the key steps of uncertainty detection in an application include locating the (genre- and domain-specific) lexical cues, disambiguating them, and linking them with the units of interest for the particular application (e.g., identified events in information extraction). In this study, we focus on the genre and domain differences of the context-dependent semantic uncertainty cue recognition task. We introduce a unified subcategorization of semantic uncertainty, as different domain applications can apply different uncertainty categories. Based on this categorization, we normalized the annotation of three corpora and present results with a state-of-the-art uncertainty cue recognition model for four fine-grained categories of semantic uncertainty. Our results reveal the domain and genre dependence of the problem; nevertheless, we also show that even a distant source domain dataset can contribute to the recognition and disambiguation of uncertainty cues, efficiently reducing the annotation costs needed to cover a new domain. Thus, the unified subcategorization and domain adaptation for training the models offer an efficient solution for cross-domain and cross-genre semantic uncertainty recognition.


DIPF department:
Informationszentrum Bildung

last modified Nov 11, 2016