Search results in the DIPF database of publications

Your query:

(Keywords: "Computerlinguistik" [Computational linguistics])

Twin BERT contextualized sentence embedding space learning and gradient-boosted decision tree […] Gombert, Sebastian Book Chapter | In: Zehe, Albin; Konle, Leonard; Dümpelmann, Lea; Gius, Evelyn; Guhr, Svenja; Hotho, Andreas; Jannidis, Fotis; Kaufmann, Lucas; Krug, Markus; Puppe, Frank; Reiter, Nils; Schreiber, Annekea (Eds.): Shared Task on Scene Segmentation @ KONVENS 2021 (STSS 2021): Proceedings of the Shared Task on Scene Segmentation, co-located with the 17th Conference on Natural Language Processing (KONVENS 2021), Düsseldorf, Germany, September 6th, 2021 | Aachen: RWTH | 2021 42080
Author(s): Gombert, Sebastian
Title: Twin BERT contextualized sentence embedding space learning and gradient-boosted decision tree ensembles for scene segmentation in German literature
In: Zehe, Albin; Konle, Leonard; Dümpelmann, Lea; Gius, Evelyn; Guhr, Svenja; Hotho, Andreas; Jannidis, Fotis; Kaufmann, Lucas; Krug, Markus; Puppe, Frank; Reiter, Nils; Schreiber, Annekea (Eds.): Shared Task on Scene Segmentation @ KONVENS 2021 (STSS 2021): Proceedings of the Shared Task on Scene Segmentation, co-located with the 17th Conference on Natural Language Processing (KONVENS 2021), Düsseldorf, Germany, September 6th, 2021, Aachen: RWTH, 2021 (CEUR Workshop Proceedings, 3001), pp. 42-48
URL: http://ceur-ws.org/Vol-3001/paper5.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language: English
Keywords: Humanities; Digitization; Text; Narrative; Scene; Segmentation; Artificial intelligence; Computational linguistics; Conference paper
Abstract: This paper documents a submission to the shared task on scene segmentation hosted at KONVENS 2021 (Zehe et al., 2021b). The aim of this shared task was to find methods for segmenting narrative texts into different scenes, that is, segments of text where location, time and the constellation of characters stay more or less coherent. The task is formulated as a sentence classification task in which sentences bordering the scenes have to be distinguished from in-scene sentences. The approach presented in this paper consists of two steps. In the first, a twin BERT training setup is used to learn a sentence embedding space in which sentences functioning as scene borders are well separated from in-scene ones. In the second, the sentence embeddings generated by this model are used as feature vectors for a gradient-boosted decision tree ensemble, which makes the final predictions. On the shared task leaderboard, the system ranked second in track 1 and first in track 2. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

Argumentation mining in user-generated web discourse Habernal, Ivan; Gurevych, Iryna Journal Article | In: Computational Linguistics Journal | 2017 36233
Author(s): Habernal, Ivan; Gurevych, Iryna
Title: Argumentation mining in user-generated web discourse
In: Computational Linguistics Journal, 43 (2017) 1, pp. 125-179
DOI: 10.1162/COLI_a_00276
URL: http://www.mitpressjournals.org/doi/abs/10.1162/COLI_a_00276#.WIDIonpp-nU
Publication Type: 3a. Articles in peer-reviewed journals; Article (no special category)
Language: English
Keywords: Argumentation; Automation; Computational linguistics; Data mining; Discourse; Educational science; Information retrieval; Model; Reliability; Social software; Text analysis; World Wide Web 2.0
Abstract: The goal of argumentation mining, an evolving research field in computational linguistics, is to design methods capable of analyzing people's argumentation. In this article, we go beyond the state of the art in several ways. (i) We deal with actual Web data and take up the challenges given by the variety of registers, multiple domains, and unrestricted noisy user-generated Web discourse. (ii) We bridge the gap between normative argumentation theories and argumentation phenomena encountered in actual data by adapting an argumentation model tested in an extensive annotation study. (iii) We create a new gold standard corpus (90k tokens in 340 documents) and experiment with several machine learning methods to identify argument components. We offer the data, source codes, and annotation guidelines to the community under free licenses. Our findings show that argumentation mining in user-generated Web discourse is a feasible but challenging task. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

What is the essence of a claim? Cross-domain claim identification Daxenberger, Johannes; Habernal, Ivan; Stab, Christian; Gurevych, Iryna Book Chapter | In: Association for Computational Linguistics (Ed.): The Conference on Empirical Methods in Natural Language Processing (EMNLP 2017): Proceedings of the conference, September 9-11, 2017, Copenhagen, Denmark | Stroudsburg, PA: Association for Computational Linguistics | 2017 37872
Author(s): Daxenberger, Johannes; Habernal, Ivan; Stab, Christian; Gurevych, Iryna
Title: What is the essence of a claim? Cross-domain claim identification
In: Association for Computational Linguistics (Ed.): The Conference on Empirical Methods in Natural Language Processing (EMNLP 2017): Proceedings of the conference, September 9-11, 2017, Copenhagen, Denmark, Stroudsburg, PA: Association for Computational Linguistics, 2017, pp. 2045-2056
URL: http://www.aclweb.org/anthology/D/D17/D17-1217.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language: English
Keywords: Argumentation; Computational linguistics; Data mining; Qualitative research; Language analysis; Text; Text analysis
Abstract: Argument mining has become a popular research area in NLP. It typically includes the identification of argumentative components, e.g. claims, as the central component of an argument. We perform a qualitative analysis across six different datasets and show that these appear to conceptualize claims quite differently. To learn about the consequences of such different conceptualizations of claim for practical applications, we carried out extensive experiments using state-of-the-art feature-rich and deep learning systems, to identify claims in a cross-domain fashion. While the divergent conceptualization of claims in different datasets is indeed harmful to cross-domain classification, we show that there are shared properties on the lexical level as well as system configurations that can help to overcome these gaps. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

Neural end-to-end learning for computational argumentation mining Eger, Steffen; Daxenberger, Johannes; Gurevych, Iryna Book Chapter | In: Association for Computational Linguistics (Ed.): The 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017): Proceedings of the conference, vol. 1 (long papers), July 30 - August 4, 2017, Vancouver, Canada | Stroudsburg, PA: Association for Computational Linguistics | 2017 37878
Author(s): Eger, Steffen; Daxenberger, Johannes; Gurevych, Iryna
Title: Neural end-to-end learning for computational argumentation mining
In: Association for Computational Linguistics (Ed.): The 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017): Proceedings of the conference, vol. 1 (long papers), July 30 - August 4, 2017, Vancouver, Canada, Stroudsburg, PA: Association for Computational Linguistics, 2017, pp. 11-22
DOI: 10.18653/v1/P17-1002
URL: https://aclanthology.info/pdf/P/P17/P17-1002.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language: English
Keywords: Argumentation; Automation; Computational linguistics; Data mining; Classification; Rhetoric; Semantics; Text analysis
Abstract: We investigate neural techniques for end-to-end computational argumentation mining (AM). We frame AM both as a token-based dependency parsing and as a token-based sequence tagging problem, including a multi-task learning setup. Contrary to models that operate on the argument component level, we find that framing AM as dependency parsing leads to subpar performance results. In contrast, less complex (local) tagging models based on BiLSTMs perform robustly across classification scenarios, being able to catch long-range dependencies inherent to the AM problem. Moreover, we find that jointly learning 'natural' subtasks, in a multi-task learning setup, improves performance. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

EELECTION at SemEval-2017 Task 10. Ensemble of nEural Learners for kEyphrase ClassificaTION Eger, Steffen; Do Dinh, Erik-Lân; Kuznetsov, Ilia; Kiaeeha, Masoud; Gurevych, Iryna Book Chapter | In: Association for Computational Linguistics (Ed.): 11th International Workshop on Semantic Evaluations (SemEval-2017): Proceedings of the workshop, August 3-4, 2017, Vancouver, Canada | Stroudsburg, PA: Association for Computational Linguistics | 2017 37870
Author(s): Eger, Steffen; Do Dinh, Erik-Lân; Kuznetsov, Ilia; Kiaeeha, Masoud; Gurevych, Iryna
Title: EELECTION at SemEval-2017 Task 10. Ensemble of nEural Learners for kEyphrase ClassificaTION
In: Association for Computational Linguistics (Ed.): 11th International Workshop on Semantic Evaluations (SemEval-2017): Proceedings of the workshop, August 3-4, 2017, Vancouver, Canada, Stroudsburg, PA: Association for Computational Linguistics, 2017, pp. 942-946
URL: http://aclweb.org/anthology/S17-2163
Publication Type: 4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language: English
Keywords: Computational linguistics; Classification; Publication; Semantics; Text analysis; Science
Abstract: This paper describes our approach to the SemEval 2017 Task 10: "Extracting Keyphrases and Relations from Scientific Publications", specifically to Subtask (B): "Classification of identified keyphrases". We explored three different deep learning approaches: a character-level convolutional neural network (CNN), a stacked learner with an MLP meta-classifier, and an attention based Bi-LSTM. From these approaches, we created an ensemble of differently hyper-parameterized systems, achieving a micro-F1-score of 0.63 on the test data. Our approach ranks 2nd (score of 1st placed system: 0.64) out of four according to this official score. However, we erroneously trained 2 out of 3 neural nets (the stacker and the CNN) on only roughly 15% of the full data, namely, the original development set. When trained on the full data (training+development), our ensemble has a micro-F1-score of 0.69. Our code is available from https://github.com/UKPlab/semeval2017-scienceie. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

Frame-based data factorizations Mair, Sebastian; Boubekki, Ahcène; Brefeld, Ulf Book Chapter | In: Precup, Doina; Teh, Yee Whye (Eds.): Proceedings of the International Conference on Machine Learning (ICML 2017), 6-11 August 2017, International Convention Centre, Sydney, Australia | Red Hook, NY: Curran | 2017 37658
Author(s): Mair, Sebastian; Boubekki, Ahcène; Brefeld, Ulf
Title: Frame-based data factorizations
In: Precup, Doina; Teh, Yee Whye (Eds.): Proceedings of the International Conference on Machine Learning (ICML 2017), 6-11 August 2017, International Convention Centre, Sydney, Australia, Red Hook, NY: Curran, 2017 (Proceedings of Machine Learning Research, 70), pp. 2305-2313
URL: http://proceedings.mlr.press/v70/mair17a/mair17a.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language: English
Keywords: Algorithm; Automation; Computational linguistics; Data; Data analysis; Method; Procedure
Abstract: Archetypal Analysis is the method of choice to compute interpretable matrix factorizations. Every data point is represented as a convex combination of factors, i.e., points on the boundary of the convex hull of the data. This renders computation inefficient. In this paper, we show that the set of vertices of a convex hull, the so-called frame, can be efficiently computed by a quadratic program. We provide theoretical and empirical results for our proposed approach and make use of the frame to accelerate Archetypal Analysis. The novel method yields similar reconstruction errors as baseline competitors but is much faster to compute. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

SemEval-2017 task 7. Detection and interpretation of English puns Miller, Tristan; Hempelmann, Christian; Gurevych, Iryna Book Chapter | In: Association for Computational Linguistics (Ed.): 11th International Workshop on Semantic Evaluations (SemEval-2017): Proceedings of the workshop, August 3-4, 2017, Vancouver, Canada | Stroudsburg, PA: Association for Computational Linguistics | 2017 37877
Author(s): Miller, Tristan; Hempelmann, Christian; Gurevych, Iryna
Title: SemEval-2017 task 7. Detection and interpretation of English puns
In: Association for Computational Linguistics (Ed.): 11th International Workshop on Semantic Evaluations (SemEval-2017): Proceedings of the workshop, August 3-4, 2017, Vancouver, Canada, Stroudsburg, PA: Association for Computational Linguistics, 2017, pp. 58-68
DOI: 10.18653/v1/S17-2005
URL: http://aclweb.org/anthology/S17-2005
Publication Type: 4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language: English
Keywords: Computational linguistics; Word; Humor; Rhetoric; Semantics; Linguistics; Phonology; Automation; Recognition; Interpretation; System; Evaluation
Abstract: A pun is a form of wordplay in which a word suggests two or more meanings by exploiting polysemy, homonymy, or phonological similarity to another word, for an intended humorous or rhetorical effect. Though a recurrent and expected feature in many discourse types, puns stymie traditional approaches to computational lexical semantics because they violate their one-sense-per-context assumption. This paper describes the first competitive evaluation for the automatic detection, location, and interpretation of puns. We describe the motivation for these tasks, the evaluation methods, and the manually annotated data set. Finally, we present an overview and discussion of the participating systems' methodologies, resources, and results. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

A "Wind of Change" Shaping public opinion of the "Arab Spring" using metaphors Núñez, Alexandra; Gerloff, Malte; Do Dinh, Erik-Lân; Rapp, Andrea; Gehring, Petra; Gurevych, Iryna Book Chapter | In: Alliance of Digital Humanities (Ed.): Digital Humanities 2017: Conference abstracts, McGill University & Université de Montréal, Montréal, Canada, August 8-11, 2017 | Montréal: Alliance of Digital Humanities | 2017 37342
Author(s): Núñez, Alexandra; Gerloff, Malte; Do Dinh, Erik-Lân; Rapp, Andrea; Gehring, Petra; Gurevych, Iryna
Title: A "Wind of Change" Shaping public opinion of the "Arab Spring" using metaphors
In: Alliance of Digital Humanities (Ed.): Digital Humanities 2017: Conference abstracts, McGill University & Université de Montréal, Montréal, Canada, August 8-11, 2017, Montréal: Alliance of Digital Humanities, 2017, pp. 551-553
URL: https://dh2017.adho.org/abstracts/041/041.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language: English
Keywords: Automation; Computational linguistics; Influencing factor; Grammar; Metaphor; Public opinion; Press coverage; Semantics; Text analysis
Abstract: How does mass media affect the way we think about controversial topics such as the "Arab Spring"? What persuasive role do metaphors play especially in opinion pieces? We analyze how the political events of the years 2010-2011 in the Middle East and North Africa Region ("Arab Spring") are categorized and assessed using metaphorical constructions in newspaper opinion pieces. We show ways in which particularly the use of metaphors reveals how the media tried to achieve acceptance for the events based on our cultural models (Quinn and Holland, 1987), which are grounded on our western knowledge. To this end, we constructed a pipeline that automatically detects (and filters) metaphors appearing within certain grammatical constructions, before clustering them by presumed source and target domains (Conceptual Metaphor Theory, Lakoff and Johnson, 1980). The results give us insights into how the "Arab Spring" is metaphorically structured by semantic clusters in opinion pieces. (DIPF/Autor)
DIPF-Departments: Informationszentrum Bildung

Reporting score distributions makes a difference. Performance study of LSTM-networks for sequence […] Reimers, Nils; Gurevych, Iryna Book Chapter | In: Association for Computational Linguistics (Ed.): The Conference on Empirical Methods in Natural Language Processing (EMNLP 2017): Proceedings of the Conference, September 9-11, 2017, Copenhagen, Denmark | Stroudsburg, PA: Association for Computational Linguistics | 2017 37871
Author(s): Reimers, Nils; Gurevych, Iryna
Title: Reporting score distributions makes a difference. Performance study of LSTM-networks for sequence tagging
In: Association for Computational Linguistics (Ed.): The Conference on Empirical Methods in Natural Language Processing (EMNLP 2017): Proceedings of the Conference, September 9-11, 2017, Copenhagen, Denmark, Stroudsburg, PA: Association for Computational Linguistics, 2017, pp. 338-348
URL: http://www.aclweb.org/anthology/D/D17/D17-1035.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language: English
Keywords: Computational linguistics; Speech recognition
Abstract: In this paper we show that reporting a single performance score is insufficient to compare non-deterministic approaches. We demonstrate for common sequence tagging tasks that the seed value for the random number generator can result in statistically significant (p < 10^-4) differences for state-of-the-art systems. For two recent systems for NER, we observe an absolute difference of one percentage point F1-score depending on the selected seed value, making these systems perceived either as state-of-the-art or mediocre. Instead of publishing and reporting single performance scores, we propose to compare score distributions based on multiple executions. Based on the evaluation of 50,000 LSTM-networks for five sequence tagging tasks, we present network architectures that both produce superior performance and are more stable with respect to the remaining hyperparameters. The full experimental results are published in (Reimers and Gurevych, 2017). The implementation of our network is publicly available. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

End-to-end non-factoid question answering with an interactive visualization of neural attention […] Rücklé, Andreas; Gurevych, Iryna Book Chapter | In: Association for Computational Linguistics (Ed.): Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, July 30 - August 4, 2017: System demonstrations | Stroudsburg, PA: Association for Computational Linguistics | 2017 37876
Author(s): Rücklé, Andreas; Gurevych, Iryna
Title: End-to-end non-factoid question answering with an interactive visualization of neural attention weights
In: Association for Computational Linguistics (Ed.): Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, July 30 - August 4, 2017: System demonstrations, Stroudsburg, PA: Association for Computational Linguistics, 2017, pp. 19-24
DOI: 10.18653/v1/P17-4004
URL: https://aclanthology.info/pdf/P/P17/P17-4004.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language: English
Keywords: Computational linguistics; Attention; Networking; Model; Structure; Analysis; Visualization; Research; Question; Answer; User interface
Abstract: Advanced attention mechanisms are an important part of successful neural network approaches for non-factoid answer selection because they allow the models to focus on a few important segments within rather long answer texts. Analyzing attention mechanisms is thus crucial for understanding strengths and weaknesses of particular models. We present an extensible, highly modular service architecture that enables the transformation of neural network models for non-factoid answer selection into fully featured end-to-end question answering systems. The primary objective of our system is to give researchers a way to interactively explore and compare attention-based neural networks for answer selection. Our interactive user interface helps researchers to better understand the capabilities of the different approaches and can aid qualitative analyses. The source code of our system is publicly available. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung