Search results in the DIPF publications database
Your query:
(Keywords: "Computational linguistics")
110 items found
Authors:
Meyer, Christian M.; Eckle-Kohler, Judith; Gurevych, Iryna
Title:
Semi-automatic detection of cross-lingual marketing blunders based on pragmatic label propagation in Wiktionary
In:
The COLING 2016 Organizing Committee (ed.): Proceedings of the 26th International Conference on Computational Linguistics (COLING), Osaka: The COLING 2016 Organizing Committee, 2016, pp. 2071-2081
URL:
http://www.aclweb.org/anthology/C/C16/C16-1195.pdf
Document type:
4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language:
English
Keywords:
Concept; Computational linguistics; Computer-assisted method; Foreign language; Marketing; Problem
Abstract:
We introduce the task of detecting cross-lingual marketing blunders, which occur if a trade name resembles an inappropriate or negatively connoted word in a target language. To this end, we suggest a formal task definition and a semi-automatic method based on the propagation of pragmatic labels from Wiktionary across sense-disambiguated translations. Our final tool assists users by providing clues for problematic names in any language, which we simulate in two experiments on detecting previously occurred marketing blunders and identifying relevant clues for established international brands. We conclude the paper with a suggested research roadmap for this new task. (DIPF/Orig.)
DIPF department:
Information Center for Education
Authors:
Panchenko, Alexander; Faralli, Stefano; Ruppert, Eugen; Remus, Steffen; Naets, Hubert; Fairon, Cédrick; Ponzetto, Simone Paolo; Biemann, Chris
Title:
TAXI. A taxonomy induction method based on lexico-syntactic patterns, substrings and focused crawling
In:
Association for Computational Linguistics (ed.): Proceedings of the 10th International Workshop on Semantic Evaluation co-located with NAACL 2016, Stroudsburg, PA: Association for Computational Linguistics, 2016, pp. 1320-1327
URL:
http://www.aclweb.org/anthology/S16-1206
Document type:
4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language:
English
Keywords:
Computational linguistics; Taxonomy; Method; Language; English; Dutch; French; Italian; Text; Concept; Structure; Evaluation
Abstract:
We present a system for taxonomy construction that reached first place in all sub-tasks of the SemEval 2016 challenge on Taxonomy Extraction Evaluation. Our simple yet effective approach harvests hypernyms with substring inclusion and Hearst-style lexico-syntactic patterns from domain-specific texts obtained via language-model-based focused crawling. Extracted taxonomies are evaluated on English, Dutch, French and Italian for three domains each (Food, Environment and Science). Evaluations against a gold standard and by human judgment show that our method outperforms more complex and knowledge-rich approaches on most domains and languages. Furthermore, to adapt the method to a new domain or language, only a small amount of manual labour is needed. (DIPF/Orig.)
DIPF department:
Information Center for Education
Authors:
Reimers, Nils; Beyer, Philip; Gurevych, Iryna
Title:
Task-oriented intrinsic evaluation of semantic textual similarity
In:
The COLING 2016 Organizing Committee (ed.): Proceedings of the 26th International Conference on Computational Linguistics (COLING), Osaka: The COLING 2016 Organizing Committee, 2016, pp. 87-96
URL:
https://www.aclweb.org/anthology/C/C16/C16-1009.pdf
Document type:
4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language:
English
Keywords:
Computational linguistics; Evaluation; Correlation; Measurement method; Semantics; System comparison; Text
Abstract:
Semantic Textual Similarity (STS) is a foundational NLP task and can be used in a wide range of tasks. Hundreds of different STS systems exist for determining the STS of two texts; for an NLP system designer, however, it is hard to decide which system is the best one. To answer this question, an intrinsic evaluation of the STS systems is conducted by comparing the output of the system to human judgments on semantic similarity. The comparison is usually done using Pearson correlation. In this work, we show that relying on intrinsic evaluations with Pearson correlation can be misleading. In three common STS-based tasks, we observed that the Pearson correlation was especially ill-suited to detecting the best STS system for the task and that other evaluation measures were much better suited. We define how the validity of an intrinsic evaluation can be assessed and compare different intrinsic evaluation methods. Understanding the properties of the targeted task is crucial, and we propose a framework for conducting the intrinsic evaluation that takes the properties of the targeted task into account. (DIPF/Orig.)
DIPF department:
Information Center for Education
Authors:
Schnober, Carsten; Eger, Steffen; Do Dinh, Erik-Lân; Gurevych, Iryna
Title:
Still not there? Comparing traditional sequence-to-sequence models to encoder-decoder neural networks on monotone string translation tasks
In:
The COLING 2016 Organizing Committee (ed.): Proceedings of the 26th International Conference on Computational Linguistics (COLING), Osaka: The COLING 2016 Organizing Committee, 2016, pp. 1703-1714
URL:
http://aclweb.org/anthology/C16-1160
Document type:
4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language:
English
Keywords:
Computational linguistics; Computer-assisted method; Correction; Spelling; Text
Abstract:
We analyze the performance of encoder-decoder neural models and compare them with well-known established methods. The latter represent different classes of traditional approaches that are applied to the monotone sequence-to-sequence tasks of OCR post-correction, spelling correction, grapheme-to-phoneme conversion, and lemmatization. Such tasks are of practical relevance for various higher-level research fields including digital humanities, automatic text correction, and speech recognition. We investigate how well generic deep-learning approaches adapt to these tasks, and how they perform in comparison with established and more specialized methods, including our own adaptation of pruned CRFs. (DIPF/Orig.)
DIPF department:
Information Center for Education
Authors:
Stab, Christian; Gurevych, Iryna
Title:
Recognizing the absence of opposing arguments in persuasive essays
In:
Association for Computational Linguistics (ed.): Proceedings of the 3rd Workshop on Argument Mining held in conjunction with the 2016 Annual Meeting of the Association for Computational Linguistics (ACL 2016), Stroudsburg, PA: Association for Computational Linguistics, 2016, pp. 113-118
URL:
http://aclweb.org/anthology/W/W16/W16-2813.pdf
Document type:
4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language:
English
Keywords:
Argumentation; Essay; Computational linguistics; Document; Opposition; Classification; Model
Abstract:
In this paper, we introduce an approach for recognizing the absence of opposing arguments in persuasive essays. We model this task as a binary document classification and show that adversative transitions in combination with unigrams and syntactic production rules significantly outperform a challenging heuristic baseline. Our approach yields an accuracy of 75.6% and 84% of human performance in a persuasive essay corpus with various topics. (DIPF/Orig.)
DIPF department:
Information Center for Education
Authors:
Beinborn, Lisa; Zesch, Torsten; Gurevych, Iryna
Title:
Predicting the spelling difficulty of words for language learners
In:
Association for Computational Linguistics (ed.): Proceedings of the 11th workshop on innovative use of NLP for building educational applications held in conjunction with NAACL 2016, Stroudsburg, PA: Association for Computational Linguistics, 2016, pp. 73-83
URL:
https://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/wall/BEA2016_SpellingDifficulty.pdf
Document type:
4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language:
English
Keywords:
Computational linguistics; German; English; Error; Foreign language; Italian; Model; Native language; Phonetics; Psycholinguistics; Spelling
Abstract:
In many language learning scenarios, it is important to anticipate spelling errors. We model the spelling difficulty of words with new features that capture phonetic phenomena and are based on psycholinguistic findings. To train our model, we extract more than 140,000 spelling errors from three learner corpora covering English, German and Italian essays. The evaluation shows that our model can predict spelling difficulty with an accuracy of over 80% and yields a stable quality across corpora and languages. In addition, we provide a thorough error analysis that takes the native language of the learners into account and provides insights into cross-lingual transfer effects. (DIPF/Orig.)
DIPF department:
Information Center for Education
Authors:
Do Dinh, Erik-Lân; Gurevych, Iryna
Title:
Token-level metaphor detection using neural networks
In:
Association for Computational Linguistics (ed.): Proceedings of the fourth workshop on metaphor in NLP held in conjunction with NAACL 2016, Stroudsburg, PA: Association for Computational Linguistics, 2016, pp. 28-33
URL:
https://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2016/2016_DoDinh_NAACL_pages.pdf
Document type:
4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language:
English
Keywords:
Automation; Computational linguistics; Network; Semantics; Text analysis
Abstract:
Automatic metaphor detection usually relies on various features, incorporating, e.g., selectional preference violations or concreteness ratings to detect metaphors in text. These features rely on background corpora, hand-coded rules or additional, manually created resources, all specific to the language the system is being used on. We present a novel approach to metaphor detection using a neural network in combination with word embeddings, a method that has already proven to yield promising results for other natural language processing tasks. We show that forgoing manual feature engineering by solely relying on word embeddings trained on large corpora produces comparable results to other systems, while removing the need for additional resources. (DIPF/Orig.)
DIPF department:
Information Center for Education
Authors:
Habernal, Ivan; Daxenberger, Johannes; Gurevych, Iryna
Title:
Mass collaboration on the web. Textual content analysis by means of natural language processing
In:
Cress, Ulrike; Moskaliuk, Johannes; Jeong, Heisawn (eds.): Mass collaboration and education, Cham: Springer, 2016, pp. 367-390
DOI:
10.1007/978-3-319-13536-6_18
Document type:
4. Contributions to edited volumes; Edited volume (no special category)
Language:
English
Keywords:
Argumentation; Computational linguistics; Data mining; Data; Content analysis; Text; Weblog; Wiki; Knowledge
Abstract:
This chapter describes perspectives for utilizing natural language processing (NLP) to analyze artifacts arising from mass collaboration on the web. In recent years, the amount of user-generated content on the web has grown drastically. This content is typically noisy and un- or at best semi-structured, so that traditional analysis tools cannot properly handle it. To discover linguistic structures in this data, manual analysis is not feasible due to the large quantities of data. In this chapter, we explain and analyze web-based resources of mass collaboration, namely, wikis, web forums, debate platforms, and blog comments. We introduce recent advances and ongoing efforts to analyze textual content in two of these resources with the help of NLP. This includes an approach to discover flows of knowledge in online mass collaboration as well as methods to mine argumentative structures in natural language text. Finally, we outline application scenarios of the previously discussed techniques and resources within the domain of education. (DIPF/Orig.)
DIPF department:
Information Center for Education
Authors:
Habernal, Ivan; Gurevych, Iryna
Title:
What makes a convincing argument? Empirical analysis and detecting attributes of convincingness in Web argumentation
In:
Association for Computational Linguistics (ed.): Proceedings of the 2016 conference on Empirical Methods in Natural Language Processing (EMNLP), Stroudsburg, PA: Association for Computational Linguistics, 2016, pp. 1214-1223
URL:
https://aclweb.org/anthology/D16-1129
Document type:
4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language:
English
Keywords:
Argumentation; Assessment; Computational linguistics; Classification; Quality; Conviction
Abstract:
This article tackles a new challenging task in computational argumentation. Given a pair of arguments on a certain controversial topic, we aim to directly assess qualitative properties of the arguments in order to explain why one argument is more convincing than the other one. We approach this task in a fully empirical manner by annotating 26k explanations written in natural language. These explanations describe the convincingness of arguments in the given argument pair, such as their strengths or flaws. We create a new crowd-sourced corpus containing 9,111 argument pairs, multi-labeled with 17 classes, which was cleaned and curated by employing several strict quality measures. We propose two tasks on this data set, namely (1) predicting the full label distribution and (2) classifying types of flaws in less convincing arguments. Our experiments with feature-rich SVM learners and Bidirectional LSTM neural networks with convolution and attention mechanism reveal that such a novel fine-grained analysis of Web argument convincingness is a very challenging task. We release the new UKPConvArg2 corpus and software under permissive licenses to the research community. (DIPF/Orig.)
DIPF department:
Information Center for Education
Authors:
Habernal, Ivan; Zayed, Omnia; Gurevych, Iryna
Title:
C4Corpus. Multilingual web-size corpus with free license
In:
European Language Resources Association (ed.): Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), Portorož: European Language Resources Association, 2016, pp. 914-922
URL:
https://www.informatik.tu-darmstadt.de/de/forschung/veroeffentlichungen/details/?no_cache=1&tx_bibtex_pi1%5Bpub_id%5D=TUD-CS-2016-0023
Document type:
4. Contributions to edited volumes; Conference volume/conference paper/proceedings
Language:
English
Keywords:
Computational linguistics; Data analysis; Document; Internet; Text; Text analysis; Copyright
Abstract:
Large Web corpora containing full documents with permissive licenses are crucial for many NLP tasks. In this article we present the construction of a 12-million-page Web corpus (over 10 billion tokens) licensed under the Creative Commons license family in 50+ languages that has been extracted from CommonCrawl, the largest publicly available general Web crawl to date with about 2 billion crawled URLs. Our highly scalable Hadoop-based framework is able to process the full CommonCrawl corpus on a 2000+ CPU cluster on the Amazon Elastic Map/Reduce infrastructure. The processing pipeline includes license identification, state-of-the-art boilerplate removal, exact duplicate and near-duplicate document removal, and language detection. The construction of the corpus is highly configurable and fully reproducible, and we provide both the framework (DKPro C4CorpusTools) and the resulting data (C4Corpus) to the research community. (DIPF/Orig.)
DIPF department:
Information Center for Education