Search results in the DIPF database of publications

Your query:

(Keywords: "Textanalyse")

Journal Article | In: Computer Graphics Forum | 2014 | 34578
Author(s): Oelke, Daniela; Strobelt, Hendrik; Rohrdantz, Christian; Gurevych, Iryna; Deussen, Oliver
Title: Comparative exploration of document collections. A visual analytics approach
In: Computer Graphics Forum, 33 (2014) 3, pp. 201-210
DOI: 10.1111/cgf.12376
Publication Type: 3a. Articles in peer-reviewed journals; Article (no special category)
Language: English
Keywords: Automation; Computer graphics; Computational linguistics; Information system; Method; Modelling; Semantics; Text analysis; Comparison; Visualization
Abstract: We present an analysis and visualization method for computing what distinguishes a given document collection from others. We determine topics that discriminate a subset of collections from the remaining ones by applying probabilistic topic modeling and subsequently approximating the two relevant criteria distinctiveness and characteristicness algorithmically through a set of heuristics. Furthermore, we suggest a novel visualization method called DiTop-View, in which topics are represented by glyphs (topic coins) that are arranged on a 2D plane. Topic coins are designed to encode all information necessary for performing comparative analyses such as the class membership of a topic, its most probable terms and the discriminative relations. We evaluate our topic analysis using statistical measures and a small user experiment and present an expert case study with researchers from political sciences analyzing two real-world datasets. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

Journal Article | In: Journal Phänomenologie - Schwerpunktthema: Metaphern als strenge Wissenschaft | 2014 | 34988
Author(s): Gehring, Petra; Gurevych, Iryna
Title: Suchen als Methode? Zu einigen Problemen digitaler Metapherndetektion
In: Journal Phänomenologie - Schwerpunktthema: Metaphern als strenge Wissenschaft, 21 (2014) 41, pp. 99-109
URL: https://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/sp_gehring_gurevych.pdf
Publication Type: 3b. Articles in other journals; research-oriented
Language: German
Keywords: Algorithm; Computational linguistics; Computer-assisted method; Discourse; Metaphor; Search; Text analysis
Abstract: A large number of disciplines in the humanities, cultural studies and textual studies are currently turning their attention to metaphor. Interest in metaphor has been growing for several decades: alongside linguistic studies, which mostly take as their background the whole of a language system (or, as a model, a cognitive system) or empirical data drawn from everyday language, but which may also take technical languages, discourses or other materials defined in text-linguistic or text-pragmatic terms as their object, there is research in literary and cultural studies that focuses on occurrences of metaphor: this used to be called motif research or topos research. Today one speaks of discourses here as well. Added to this is a field that at most touches on everyday culture and literature: the history of knowledge, science and ideas, whose attention is directed at the theoretical language of research: tropes in which rationalities appear to reshape themselves, tropes in which a "courage" of the mind is "ahead of itself in its images". The history of knowledge as the history of metaphor: today this is represented by the names Reinhart Koselleck ("historical semantics") and Hans Blumenberg ("metaphorology"), which grew out of what is, seen globally, a somewhat special tradition, the German-language conceptual history (Begriffsgeschichte) of the 1970s and 1980s. In parallel, though with a quite different orientation, theories of metaphor are flourishing: for some time now in the philosophy of language and more recently in the so-called philosophy of mind, in psycholinguistics and psychological emotion research, and also in sociology and political science. So it is by no means the case that research under the name Metapher/metaphor addresses the same object everywhere. What can be said, however, is this: metaphor is booming. It is used in very different ways as a key to semantics-oriented, pragmatics-oriented or even media-oriented questions. What is needed in this situation: methods. This brings us to our topic. (DIPF/Author)
DIPF-Departments: Informationszentrum Bildung

Book Chapter | In: Bontcheva, Kalina; Jingbo, Zhu (Eds.): Proceedings of COLING 2014: System demonstrations | Stroudsburg, PA: Association for Computational Linguistics | 2014 | 34721
Author(s): Daxenberger, Johannes; Ferschke, Oliver; Gurevych, Iryna; Zesch, Torsten
Title: DKPro TC. A Java-based framework for supervised learning experiments on textual data
In: Bontcheva, Kalina; Jingbo, Zhu (Eds.): Proceedings of COLING 2014: System demonstrations, Stroudsburg, PA: Association for Computational Linguistics, 2014, pp. 61-66
URL: http://aclweb.org/anthology/P/P14/P14-5011.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Automation; Computational linguistics; Computer program; Data mining; Data processing; Classification; Programming language; Text; Text analysis
Abstract: We present DKPro TC, a framework for supervised learning experiments on textual data. The main goal of DKPro TC is to enable researchers to focus on the actual research task behind the learning problem and let the framework handle the rest. It enables rapid prototyping of experiments by relying on an easy-to-use workflow engine and standardized document preprocessing based on the Apache Unstructured Information Management Architecture (Ferrucci and Lally, 2004). It ships with standard feature extraction modules, while at the same time allowing the user to add customized extractors. The extensive reporting and logging facilities make DKPro TC experiments fully replicable. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

Book Chapter | In: Bontcheva, Kalina; Jingbo, Zhu (Eds.): Proceedings of COLING 2014: System demonstrations | Stroudsburg, PA: Association for Computational Linguistics | 2014 | 34722
Author(s): Erbs, Nicolai; Bispo Santos, Pedro; Gurevych, Iryna; Zesch, Torsten
Title: DKPro keyphrases. Flexible and reusable keyphrase extraction experiments
In: Bontcheva, Kalina; Jingbo, Zhu (Eds.): Proceedings of COLING 2014: System demonstrations, Stroudsburg, PA: Association for Computational Linguistics, 2014, pp. 31-36
URL: http://www.aclweb.org/anthology/P/P14/P14-5006.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Abstract; Automation; Computational linguistics; Subject indexing; Text; Text analysis; Tool
Abstract: DKPro Keyphrases is a keyphrase extraction framework based on UIMA. It offers a wide range of state-of-the-art keyphrase extraction approaches. At the same time, it is a workbench for developing new extraction approaches and evaluating their impact. DKPro Keyphrases is publicly available under an open-source license. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

Book Chapter | In: Bos, Johan; Frank, Anette; Navigli, Roberto (Eds.): Proceedings of the 3rd Joint Conference on Lexical and Computational Semantics (SEM 2014) | Stroudsburg, PA: Association for Computational Linguistics | 2014 | 34975
Author(s): Erbs, Nicolai; Gurevych, Iryna; Zesch, Torsten
Title: Sense and similarity. A study of sense-level similarity measures
In: Bos, Johan; Frank, Anette; Navigli, Roberto (Eds.): Proceedings of the 3rd Joint Conference on Lexical and Computational Semantics (SEM 2014), Stroudsburg, PA: Association for Computational Linguistics, 2014, pp. 30-39
URL: http://www.aclweb.org/anthology/S14-1004
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Ambiguity; Concept; Computational linguistics; Measurement; Semantics; Sense; Text analysis; Word
Abstract: In this paper, we investigate the difference between word and sense similarity measures and present means to convert a state-of-the-art word similarity measure into a sense similarity measure. In order to evaluate the new measure, we create a special sense similarity dataset and re-rate an existing word similarity dataset using two different sense inventories from WordNet and Wikipedia. We discover that word-level measures were not able to differentiate between different senses of one word, while sense-level measures actually increase correlation when shifting to sense similarities. Sense-level similarity measures improve when evaluated with a re-rated sense-aware gold standard, while correlation with word-level similarity measures decreases. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

Book Chapter | In: Nakov, Preslav; Zesch, Torsten (Eds.): Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014) | Stroudsburg, PA: Association for Computational Linguistics | 2014 | 34724
Author(s): Flekova, Lucie; Ferschke, Oliver; Gurevych, Iryna
Title: UKPDIPF. A lexical semantic approach to sentiment polarity prediction in Twitter data
In: Nakov, Preslav; Zesch, Torsten (Eds.): Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), Stroudsburg, PA: Association for Computational Linguistics, 2014, pp. 704-710
URL: http://alt.qcri.org/semeval2014/cdrom/pdf/SemEval2014126.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Expression <Psy>; Computational linguistics; Emotion; Classification; Written language; Semantics; Social software; Text analysis
Abstract: We present a sentiment classification system that participated in the SemEval 2014 shared task on sentiment analysis in Twitter. Our system expands tokens in a tweet with semantically similar expressions using a large novel distributional thesaurus and calculates the semantic relatedness of the expanded tweets to word lists representing positive and negative sentiment. This approach helps to assess the polarity of tweets that do not directly contain polarity cues. Moreover, we incorporate syntactic, lexical and surface sentiment features. On the message level, our system achieved 8th place in terms of macro-averaged F-score among 50 systems, with particularly good performance on the Life-Journal corpus (F1=71.92) and the Twitter sarcasm (F1=54.59) dataset. On the expression level, our system ranked 14th out of 27 systems, based on macro-averaged F-score. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

Book Chapter | In: Association for Computational Linguistics (Ed.): Proceedings of the Student Research Workshop at the 14th Conference of the European Chapter of the Association for Computational Linguistics (ACL) | Stroudsburg, PA: Association for Computational Linguistics | 2014 | 34655
Author(s): Remus, Steffen
Title: Unsupervised relation extraction of in-domain data from focused crawls
In: Association for Computational Linguistics (Ed.): Proceedings of the Student Research Workshop at the 14th Conference of the European Chapter of the Association for Computational Linguistics (ACL), Stroudsburg, PA: Association for Computational Linguistics, 2014, pp. 11-20
URL: http://aclweb.org/anthology//E/E14/E14-3002.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Computational linguistics; Semantics; Text analysis
Abstract: This thesis proposal approaches unsupervised relation extraction from web data, which is collected by crawling only those parts of the web that are from the same domain as a relatively small reference corpus. The first part of this proposal is concerned with the efficient discovery of web documents for a particular domain and in a particular language. We create a combined, focused web crawling system that automatically collects relevant documents and minimizes the amount of irrelevant web content. The collected web data is semantically processed in order to acquire rich in-domain knowledge. Here, we focus on fully unsupervised relation extraction by employing the extended distributional hypothesis. We use distributional similarities between two pairs of nominals based on dependency paths as context and vice versa for identifying relational structure. We apply our system to the domain of educational sciences by focusing primarily on crawling scientific educational publications on the web. We are able to produce promising initial results on relation identification and we will discuss future directions. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

Book Chapter | In: Tsujii, Junichi; Hajic, Jan (Eds.): Proceedings of COLING 2014: Technical papers | Stroudsburg, PA: Association for Computational Linguistics | 2014 | 34976
Author(s): Stab, Christian; Gurevych, Iryna
Title: Annotating argument components and relations in persuasive essays
In: Tsujii, Junichi; Hajic, Jan (Eds.): Proceedings of COLING 2014: Technical papers, Stroudsburg, PA: Association for Computational Linguistics, 2014, pp. 1501-1510
URL: http://anthology.aclweb.org/C/C14/C14-1142.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Argumentation; Automation; Computational linguistics; Discourse; Classification; Model; Quality; Reliability; Text; Text analysis
Abstract: In this paper, we present a novel approach to model arguments, their components and relations in persuasive essays in English. We propose an annotation scheme that includes the annotation of claims and premises as well as support and attack relations for capturing the structure of argumentative discourse. We further conduct a manual annotation study with three annotators on 90 persuasive essays. The obtained inter-rater agreement of αU = 0.72 for argument components and α = 0.81 for argumentative relations indicates that the proposed annotation scheme successfully guides annotators to substantial agreement. The final corpus and the annotation guidelines are freely available to encourage future research in argument recognition. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

Book Chapter | In: Calzolari, Nicoletta; Choukri, Khalid; Declerck, Thierry; Loftsson, Hrafn; Maegaard, Bente; Mariani, Joseph; Moreno, Asuncion; Odijk, Jan; Piperidis, Stelios (Eds.): Proceedings of the 9th International Conference on Language Resources and Evaluations (LREC 2014) | Reykjavik: European Language Resources Association | 2014 | 34575
Author(s): Cholakov, Kostadin; Biemann, Chris; Eckle-Kohler, Judith; Gurevych, Iryna
Title: Lexical substitution dataset for German
In: Calzolari, Nicoletta; Choukri, Khalid; Declerck, Thierry; Loftsson, Hrafn; Maegaard, Bente; Mariani, Joseph; Moreno, Asuncion; Odijk, Jan; Piperidis, Stelios (Eds.): Proceedings of the 9th International Conference on Language Resources and Evaluations (LREC 2014), Reykjavik: European Language Resources Association, 2014, pp. 1406-1411
URL: http://www.lrec-conf.org/proceedings/lrec2014/pdf/545_Paper.pdf
Publication Type: 4. Contributions to collected works; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Computational linguistics; Computer-assisted method; Data; German; Reference work; Online; Language analysis; Synonym; Text analysis; World Wide Web 2.0; Word
Abstract: This article describes a lexical substitution dataset for German. The whole dataset contains 2,040 sentences from the German Wikipedia, with one target word in each sentence. There are 51 target nouns, 51 adjectives, and 51 verbs randomly selected from 3 frequency groups based on the lemma frequency list of the German WaCKy corpus. 200 sentences have been annotated by 4 professional annotators and the remaining sentences by 1 professional annotator and 5 additional annotators who have been recruited via crowdsourcing. The resulting dataset can be used to evaluate not only lexical substitution systems, but also different sense inventories and word sense disambiguation systems.
DIPF-Departments: Informationszentrum Bildung

Book Chapter | In: Association for Computational Linguistics (Ed.): Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. Short Papers | Stroudsburg, PA: Association for Computational Linguistics | 2014 | 34577
Author(s): Daxenberger, Johannes; Gurevych, Iryna
Title: Automatically detecting corresponding edit-turn-pairs in Wikipedia
In: Association for Computational Linguistics (Ed.): Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. Short Papers, Stroudsburg, PA: Association for Computational Linguistics, 2014, pp. 187-192
URL: http://anthology.aclweb.org//P/P14/P14-2031.pdf
Publication Type: 4. Contributions to collected works; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Automation; Computer-assisted method; Information; Reference work; Online; Social software; Text analysis; Knowledge; World Wide Web 2.0
Abstract: In this study, we analyze links between edits in Wikipedia articles and turns from their discussion page. Our motivation is to better understand implicit details about the writing process and knowledge flow in collaboratively created resources. Based on properties of the involved edit and turn, we have defined constraints for corresponding edit-turn-pairs. We manually annotated a corpus of 636 corresponding and non-corresponding edit-turn-pairs. Furthermore, we show how our data can be used to automatically identify corresponding edit-turn-pairs. With the help of supervised machine learning, we achieve an accuracy of .87 for this task.
DIPF-Departments: Informationszentrum Bildung