Search results in the DIPF publication database
Your query:
(Keywords: "Automatisierung")
54 items found
Supervised all-words lexical substitution using delexicalized features
Szarvas, György; Biemann, Chris; Gurevych, Iryna
Contribution to an edited volume
| In: Association for Computational Linguistics (Ed.): Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT) | Stroudsburg, PA: Association for Computational Linguistics | 2013
Source:
Association for Computational Linguistics (Ed.): Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT), Stroudsburg, PA: Association for Computational Linguistics, 2013, pp. 1131-1141
URL:
https://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2013/SzarvasBiemannGurevych_naaclhlt2013.pdf
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Automation; Computational linguistics; Information retrieval; Method; Model; Sense; Synonym; Text analysis; Thesaurus; Procedure; Word
Abstract (English):
We propose a supervised lexical substitution system that does not use separate classifiers per word and is therefore applicable to any word in the vocabulary. Instead of learning word-specific substitution patterns, a global model for lexical substitution is trained on delexicalized (i.e., non-lexical) features, which allows us to exploit the power of supervised methods while being able to generalize beyond target words in the training set. This way, our approach remains technically straightforward and provides better performance and similar coverage in comparison to unsupervised approaches. Using features from lexical resources, as well as a variety of features computed from large corpora (n-gram counts, distributional similarity) and a ranking method based on the posterior probabilities obtained from a Maximum Entropy classifier, we improve over the state of the art in the LexSub Best-Precision metric and the Generalized Average Precision measure. Robustness of our approach is demonstrated by evaluating it successfully on two different datasets.
DIPF department:
Informationszentrum Bildung (Information Center for Education)
UKP-BIU. Similarity and entailment metrics for student response analysis
Zesch, Torsten; Levy, Omer; Gurevych, Iryna; Dagan, Ido
Contribution to an edited volume
| In: Association for Computational Linguistics (Ed.): Proceedings of the 7th International Workshop on Semantic Evaluation (SemEval 2013), in conjunction with the 2nd Joint Conference on Lexical and Computational Semantics (*SEM 2013) | Stroudsburg, PA: Association for Computational Linguistics | 2013
Source:
Association for Computational Linguistics (Ed.): Proceedings of the 7th International Workshop on Semantic Evaluation (SemEval 2013), in conjunction with the 2nd Joint Conference on Lexical and Computational Semantics (*SEM 2013), Stroudsburg, PA: Association for Computational Linguistics, 2013, pp. 285-289
URL:
https://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2013/S13-2048.pdf
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Answer; Automation; Computer-assisted method; Evaluation; Measurement method; Quality; Student achievement test; Semantics; Text analysis
Abstract:
Our system combines text similarity measures with a textual entailment system. In the main task, we focused on the influence of lexicalized versus unlexicalized features, and how they affect performance on unseen questions and domains. We also participated in the pilot partial entailment task, where our system significantly outperforms a strong baseline.
DIPF department:
Informationszentrum Bildung (Information Center for Education)
Acquisition of multiword lexical units for FrameNet
Hartmann, Silvana; Gurevych, Iryna
Miscellaneous documents
| 2013
Publication details:
Berkeley: Språkbanken (the Swedish Language Bank), 2013 (International FrameNet Workshop 2013)
URL:
http://spraakbanken.gu.se/sites/spraakbanken.gu.se/files/fn_mwe_at_fn_ws_130419.pdf
Document type:
5. Working and discussion papers; working and discussion paper (no specific category)
Language:
English
Keywords:
Automation; Computational linguistics; Computer-assisted method; Lexicon; Semantics; Text analysis; Word
Abstract (English):
FrameNet [1] is a well-known resource for modeling the predicate argument structure of words and organizing them in situation-specific frames and semantic roles (i.e., frame elements). Its interesting formalism to represent the semantics of multiword expressions (MWEs) is often overlooked [2]. FrameNet can represent the relation between constituents of MWEs. The following example from [2] illustrates this: storage container and bread container evoke the Container frame. Roles of this frame are the Material of the container, its Contents, Size, or Function. For storage container, storage fills the Function role, while for bread container, bread fills the Contents role (Fig. 1: Incorporated roles). The FrameNet lexicon model provides the option to annotate Function and Contents as an "incorporated role" (ICR) for the respective MWEs. Thus, the implicit relations between the constituents of the MWEs are made explicit. A large FrameNet MWE lexicon can enhance FrameNet-based semantic role labeling (SRL) by a better model for MWEs (see analogous developments integrating MWE detection in parsing [3]). Moreover, the lexicon can be used as an information source for the automatic interpretation of MWEs in applications such as information extraction, question answering, or machine translation, for instance by providing features for noun compound interpretation (NCI) [5]. Finally, it provides a basis for further theoretical investigation of MWE semantics. Unfortunately, the coverage of MWEs in FrameNet 1.5 is low; it contains fewer than 1,000 multi-word entries. This also affects the performance of FrameNet-based SRL [4]. Currently, FrameNet does not make use of its potential to model the relations within MWEs: even though leather jacket does occur in the FrameNet example sentences for the Clothing frame with the desired incorporated role (Material), it does not receive a separate lexical entry.
To close this gap, and to make full use of FrameNet's potential, an automatic process for the acquisition of MWE lexical units and MWE semantics is desired. Such an automatic approach needs to be based on solid theoretical foundations. Therefore, we present an analysis of the current state of MWEs in FrameNet. Then, we focus on the acquisition of MWE semantics, specifically of ICRs, which, to our knowledge, has not been addressed before. We present a new approach to bootstrap the ICRs of MWEs in FrameNet by annotating their paraphrases with semantic roles, for instance container that contains bread for bread container. The semantic dependencies between the verb contains, which evokes the Container frame, and bread, which fills the Contents role, mirror the relations between the constituents in bread container (Fig. 2). Thus, we can extract the incorporated arguments from the explicit role annotations on the paraphrases. Our approach is related to the work on NCI using paraphrases [6], but is not restricted to compounds and is applicable in a multilingual setting. For lexical acquisition of MWEs, previous work on lexical acquisition for FrameNet, for instance using distributional methods [7], can be adapted to MWEs. Our contributions are (i) an analysis of the state of MWEs in FrameNet, and (ii) a preliminary evaluation and discussion of the proposed method for ICR detection on MWEs.
DIPF department:
Informationszentrum Bildung (Information Center for Education)
Detecting and correcting language errors using measures of contextual fitness
Zesch, Torsten
Journal article
| In: TAL Journal | 2012
In:
TAL Journal, 53 (2012) 3, pp. 11-31
URL:
http://www.atala.org/IMG/pdf/Zesch-TAL3-3.pdf
Document type:
3a. Contributions to peer-reviewed journals; contribution to a special issue
Language:
English
Keywords:
Automation; Computational linguistics; Error; Measurement; Reference work; Online; Spelling; Text analysis
Abstract (English):
While detecting simple language errors (e.g. misspellings, number agreement) is nowadays standard functionality in all but the simplest text editors, other more complicated language errors might go unnoticed. Particularly difficult are errors that come in the disguise of a valid word that fits syntactically into the sentence. We use the Wikipedia revision history to extract a dataset with such errors in their context. We show that the new dataset provides a more realistic picture of the performance of contextual fitness measures. The achieved error detection quality is generally sufficient for competent language users who are willing to accept a certain level of false alarms, but might be problematic for non-native writers who accept all suggestions made by the systems. We make the full experimental framework publicly available, which will allow other scientists to reproduce our experiments and to conduct follow-up experiments.
DIPF department:
Informationszentrum Bildung (Information Center for Education)