Search results from the DIPF publication database
Your query: (Keywords: "Synonym")
3 results found
Lexical substitution dataset for German
Record no. 34575
Authors:
Cholakov, Kostadin; Biemann, Chris; Eckle-Kohler, Judith; Gurevych, Iryna
Titel:
Lexical substitution dataset for German
In:
Calzolari, Nicoletta; Choukri, Khalid; Declerck, Thierry; Loftsson, Hrafn; Maegaard, Bente; Mariani, Joseph; Moreno, Asuncion; Odijk, Jan; Piperidis, Stelios (Eds.): Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), Reykjavik: European Language Resources Association, 2014, pp. 1406-1411
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/545_Paper.pdf
Document type:
Contribution to an edited volume; conference proceedings
Language:
English
Keywords:
Computational linguistics; Computer-assisted methods; Data; German; Reference work; Online; Language analysis; Synonym; Text analysis; Web 2.0; Word
Abstract:
This article describes a lexical substitution dataset for German. The whole dataset contains 2,040 sentences from the German Wikipedia, with one target word in each sentence. There are 51 target nouns, 51 adjectives, and 51 verbs randomly selected from 3 frequency groups based on the lemma frequency list of the German WaCKy corpus. 200 sentences have been annotated by 4 professional annotators and the remaining sentences by 1 professional annotator and 5 additional annotators who have been recruited via crowdsourcing. The resulting dataset can be used to evaluate not only lexical substitution systems, but also different sense inventories and word sense disambiguation systems.
DIPF department:
Informationszentrum Bildung
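The abstract above notes that the dataset records substitutes proposed by several annotators per target word, and that it can be used to evaluate lexical substitution systems. A minimal sketch of how such an evaluation could score a system's guesses, in the spirit of the standard LexSub "best" metric; the data layout (a dict mapping each gold substitute to its annotator count) is an assumption for illustration, not the published file format:

```python
# Hedged sketch: scoring a system's guesses for one sentence against
# the annotator-provided gold substitutes. Credit is shared across the
# system's guesses and normalized by the total annotator mass.

def best_score(guesses, gold_counts):
    """guesses: substitutes proposed by the system for one target.
    gold_counts: dict mapping gold substitute -> number of annotators
    who proposed it."""
    if not guesses:
        return 0.0
    total = sum(gold_counts.values())
    hit = sum(gold_counts.get(g, 0) for g in guesses)
    return hit / (len(guesses) * total)

# Example: 4 annotators suggested "Auto" (3x) and "Wagen" (1x).
gold = {"Auto": 3, "Wagen": 1}
print(best_score(["Auto"], gold))           # 0.75
print(best_score(["Auto", "Wagen"], gold))  # (3+1)/(2*4) = 0.5
```

Averaging this score over all annotated sentences gives a single figure of merit for a substitution system on the dataset.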
Supervised all-words lexical substitution using delexicalized features
Record no. 33528
Authors:
Szarvas, György; Biemann, Chris; Gurevych, Iryna
Titel:
Supervised all-words lexical substitution using delexicalized features
In:
Association for Computational Linguistics (Ed.): Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT), Stroudsburg, PA: Association for Computational Linguistics, 2013, pp. 1131-1141
URL:
https://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2013/SzarvasBiemannGurevych_naaclhlt2013.pdf
Document type:
Contribution to an edited volume; conference proceedings
Language:
English
Keywords:
Automation; Computational linguistics; Information retrieval; Method; Model; Sense; Synonym; Text analysis; Thesaurus; Technique; Word
Abstract:
We propose a supervised lexical substitution system that does not use separate classifiers per word and is therefore applicable to any word in the vocabulary. Instead of learning word-specific substitution patterns, a global model for lexical substitution is trained on delexicalized (i.e., non-lexical) features, which makes it possible to exploit the power of supervised methods while generalizing beyond the target words in the training set. This way, our approach remains technically straightforward and provides better performance and similar coverage in comparison to unsupervised approaches. Using features from lexical resources, as well as a variety of features computed from large corpora (n-gram counts, distributional similarity) and a ranking method based on the posterior probabilities obtained from a Maximum Entropy classifier, we improve over the state of the art in the LexSub Best-Precision metric and the Generalized Average Precision measure. The robustness of our approach is demonstrated by evaluating it successfully on two different datasets.
DIPF department:
Informationszentrum Bildung
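The key idea in the abstract above is that the feature vector for a (target, candidate) pair contains no word identities, only word-independent scores, so a single global ranker applies to any target word. A minimal sketch of that idea; the specific features, their values, and the fixed weights (standing in for a trained Maximum Entropy model) are illustrative assumptions, not the paper's actual feature set:

```python
# Hedged sketch of delexicalized ranking: every candidate is described
# only by word-independent scores, and one global model ranks them.
import math

def delex_features(ngram_ratio, dist_sim, in_thesaurus):
    # ngram_ratio: corpus frequency of the context with the candidate
    #   substituted in, relative to the original (illustrative).
    # dist_sim: distributional similarity between target and candidate.
    # in_thesaurus: whether a lexical resource lists the pair as synonyms.
    return [math.log(ngram_ratio + 1e-9), dist_sim, float(in_thesaurus)]

def rank_candidates(candidates, weights=(0.5, 2.0, 1.0)):
    # candidates: dict candidate -> (ngram_ratio, dist_sim, in_thesaurus).
    # A trained Maximum Entropy model would supply the weights; fixed
    # ones stand in for it here.
    def score(feats):
        z = sum(w * f for w, f in zip(weights, feats))
        return 1.0 / (1.0 + math.exp(-z))  # posterior-style probability
    scored = {c: score(delex_features(*v)) for c, v in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)

# Candidates for the target "smart" in some context (made-up scores).
cands = {"bright": (0.8, 0.7, True),
         "intelligent": (0.3, 0.9, True),
         "shiny": (0.5, 0.2, False)}
print(rank_candidates(cands))  # ['bright', 'intelligent', 'shiny']
```

Because no feature names a specific word, the same weights rank candidates for targets never seen during training, which is what lets the supervised model cover the whole vocabulary.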
Uncertainty detection for natural language watermarking
Record no. 34038
Authors:
Szarvas, György; Gurevych, Iryna
Titel:
Uncertainty detection for natural language watermarking
In:
Mitkov, Ruslan; Park, Jong C. (Eds.): Proceedings of the Sixth International Joint Conference on Natural Language Processing (IJCNLP 2013), Nagoya: Asian Federation of Natural Language Processing, 2013, pp. 1188-1194
URL:
https://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2013/IJCNLP_2013_Szarvas.pdf
Document type:
Contribution to an edited volume; conference proceedings
Language:
English
Keywords:
Algorithm; Computational linguistics; Data; Information; Synonym; Text; Modification; Word
Abstract:
In this paper we investigate the application of uncertainty detection to text watermarking, a problem where the aim is to produce individually identifiable copies of a source text via small manipulations of the text (e.g., synonym substitutions). As previous attempts have shown, accurate paraphrasing is challenging in an open-vocabulary setting, so we propose the use of the closed word class of uncertainty cues. We demonstrate that these words are promising for text watermarking, as they can be accurately disambiguated from the non-cue uses of the same words, and their substitution with other cues has marginal impact on the meaning of the text.
DIPF department:
Informationszentrum Bildung
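The abstract above rests on the observation that an uncertainty cue can be swapped for a near-equivalent cue with little change in meaning, so each cue occurrence can carry information identifying a copy. A minimal sketch of that mechanism; the cue inventory and the one-bit-per-occurrence scheme are illustrative assumptions, not the paper's actual encoding:

```python
# Hedged sketch: embed a bit string into a text by choosing, at each
# uncertainty-cue position, which member of an interchangeable cue
# pair to use; extraction reads the choices back.

CUE_PAIRS = {  # cue -> (form for bit 0, form for bit 1)
    "perhaps": ("perhaps", "possibly"),
    "possibly": ("perhaps", "possibly"),
    "likely": ("likely", "probably"),
    "probably": ("likely", "probably"),
}

def embed(tokens, bits):
    """Rewrite cue occurrences so that they spell out `bits`."""
    out, i = [], 0
    for tok in tokens:
        if tok in CUE_PAIRS and i < len(bits):
            out.append(CUE_PAIRS[tok][bits[i]])
            i += 1
        else:
            out.append(tok)
    return out

def extract(tokens):
    """Read the embedded bits back from a marked copy."""
    return [CUE_PAIRS[t].index(t) for t in tokens if t in CUE_PAIRS]

text = "it will perhaps rain and temperatures will likely drop".split()
marked = embed(text, [1, 0])
print(" ".join(marked))  # it will possibly rain and temperatures will likely drop
print(extract(marked))   # [1, 0]
```

Giving each distributed copy a distinct bit string makes a leaked copy traceable; the paper's point is that restricting substitutions to the closed class of uncertainty cues keeps such rewrites accurate, unlike open-vocabulary synonym substitution.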