Search results from the DIPF publication database
Your query:
(Keywords: "Computerlinguistik")
110 results found
Medical concept embeddings via labeled background corpora
Authors:
Mencía, Eneldo Loza; De Melo, Gerard; Nam, Jinseok
From:
European Language Resources Association (Ed.): Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), Portoroz: European Language Resources Association, 2016, pp. 3629-3636
URL:
http://www.lrec-conf.org/proceedings/lrec2016/pdf/1190_Paper.pdf
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Algorithm; Automation; Computational linguistics; Medicine; Semantics; Language; Text analysis
Abstract:
In recent years, we have seen increasing interest in low-dimensional vector representations of words. Among other things, these facilitate computing word similarity and relatedness scores. The best-known algorithms for producing representations of this sort are the word2vec approaches. In this paper, we investigate a new model to induce such vector spaces for medical concepts, based on a joint objective that exploits not only word co-occurrences but also manually labeled documents, as available from sources such as PubMed. Our extensive experimental analysis shows that our embeddings lead to significantly higher correlations with human similarity and relatedness assessments than previous work. Due to the simplicity and versatility of vector representations, these findings suggest that our resource can easily be used as a drop-in replacement to improve any system relying on medical concept similarity measures. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
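The paper's joint training objective is not reproduced here, but the downstream use it targets — scoring the similarity of medical concepts by comparing their embedding vectors — can be sketched as follows. All vectors and concept names below are hypothetical toy values standing in for embeddings learned from PubMed-style corpora:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy 4-dimensional "concept embeddings" (hypothetical values).
embeddings = {
    "myocardial_infarction": np.array([0.9, 0.1, 0.3, 0.0]),
    "heart_attack":          np.array([0.8, 0.2, 0.4, 0.1]),
    "influenza":             np.array([0.1, 0.9, 0.0, 0.5]),
}

# Related concepts should score higher than unrelated ones.
sim_related = cosine_similarity(embeddings["myocardial_infarction"],
                                embeddings["heart_attack"])
sim_unrelated = cosine_similarity(embeddings["myocardial_infarction"],
                                  embeddings["influenza"])
assert sim_related > sim_unrelated
```

Because the resource reduces to a plain vector lookup plus cosine similarity, it can indeed serve as a drop-in replacement for any similarity measure with this interface.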
Enriching Wikidata with frame semantics
Authors:
Mousselly-Sergieh, Hatem; Gurevych, Iryna
From:
Association for Computational Linguistics (Ed.): Proceedings of the 5th Workshop on Automated Knowledge Base Construction (AKBC) 2016, held in conjunction with NAACL 2016, Stroudsburg, PA: Association for Computational Linguistics, 2016, pp. 29-34
URL:
https://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2016/2016_NAACL_AKBC_HMS.pdf
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Automation; Computational linguistics; Lexicon; Multilingualism; Online; Semantics
Abstract (English):
Wikidata is a large-scale, multilingual, and freely available knowledge base. It contains more than 14 million facts but is still missing linguistic information. In this paper, we aim to bridge this gap by aligning Wikidata with the FrameNet lexicon. We propose an approach based on word embeddings to identify a mapping between Wikidata relations, called properties, and FrameNet frames, and to annotate the arguments of each relation with the semantic roles of the matching frames. Early empirical results show the advantage of our approach compared to other baseline methods. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
Domain-specific corpus expansion with focused webcrawling
Authors:
Remus, Steffen; Biemann, Chris
From:
European Language Resources Association (Ed.): Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), Portoroz: European Language Resources Association, 2016, pp. 3607-3611
URL:
http://www.lrec-conf.org/proceedings/lrec2016/pdf/316_Paper.pdf
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Algorithm; Automation; Education; Computational linguistics; Data mining; Hypertext; Model; Language; Text; Text analysis
Abstract:
This work presents a straightforward method for extending or creating in-domain web corpora by focused webcrawling. The focused webcrawler uses statistical n-gram language models to estimate the relatedness of documents and weblinks, and needs as input only n-grams or plain texts of a predefined domain plus seed URLs as starting points. Two experiments demonstrate that our focused crawler is able to stay focused with respect to both domain and language. The first experiment shows that the crawler stays in the target domain; the second demonstrates that language models trained on focused crawls obtain better perplexity scores on in-domain corpora. We distribute the focused crawler as open-source software. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
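The crawler's actual relevance estimate uses statistical n-gram language models; as a rough stand-in for that idea, the sketch below scores a candidate page by the fraction of its n-grams that appear in a domain profile built from seed texts. All example texts are hypothetical:

```python
def ngrams(text, n=2):
    """Lowercased word n-grams of a text."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def domain_score(text, domain_ngrams, n=2):
    """Fraction of a document's n-grams found in the domain profile.
    A crude stand-in for the paper's n-gram language-model relatedness."""
    grams = ngrams(text, n)
    if not grams:
        return 0.0
    return sum(g in domain_ngrams for g in grams) / len(grams)

# Hypothetical seed texts defining an education-related domain.
domain = set(ngrams("short answer grading in education")) \
       | set(ngrams("language proficiency test difficulty"))

on_topic = domain_score("short answer grading in education research", domain)
off_topic = domain_score("a recipe for chocolate cake with strawberries", domain)
assert on_topic > off_topic
```

In a full crawler, such scores would order the link frontier (e.g., via a priority queue), so that high-scoring pages and their outlinks are fetched first and the crawl stays in domain.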
Crowdsourcing a large dataset of domain-specific context-sensitive semantic verb relations
Authors:
Sukhareva, Maria; Eckle-Kohler, Judith; Habernal, Ivan; Gurevych, Iryna
From:
European Language Resources Association (Ed.): Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), Portoroz: European Language Resources Association, 2016, pp. 2131-2137
URL:
https://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2016/lrec2016_sukhareva.pdf
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Automation; Computational linguistics; Data mining; Classification; Semantics; Text analysis
Abstract (English):
We present a new large dataset of 12,403 context-sensitive verb relations manually annotated via crowdsourcing. These relations capture fine-grained semantic information between verb-centric propositions, such as temporal or entailment relations. We propose a novel semantic verb relation scheme and design a multi-step annotation approach for scaling up the annotations using crowdsourcing. We employ several quality measures and report on agreement scores. The resulting dataset is available under a permissive Creative Commons license. It represents a valuable resource for various applications, such as automatic information consolidation or automatic summarization. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
Using semantic similarity for multi-label zero-shot classification of text documents
Authors:
Veeranna, Sappadla Prateek; Nam, Jinseok; Mencía, Eneldo Loza; Fürnkranz, Johannes
From:
European Symposium on Artificial Neural Networks (Ed.): ESANN 2016 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Bruges (Belgium), 27-29 April 2016, Bruges: European Symposium on Artificial Neural Networks, 2016, pp. 423-428
URL:
https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2016-174.pdf
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Computational linguistics; Classification; Semantics; Text
Abstract (English):
In this paper, we examine a simple approach to zero-shot multi-label text classification, i.e., the problem of predicting multiple, possibly previously unseen labels for a document. In particular, we propose to use a semantic embedding of label and document words and to base the prediction of previously unseen labels on the similarity between the label name and the document words in this embedding. Experiments on three textual datasets across various domains show that even such a simple technique yields considerable performance improvements over a simple uninformed baseline. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
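The core zero-shot idea — scoring a label never seen in training by comparing its name's embedding to the document's word embeddings — can be sketched with toy two-dimensional vectors. All vectors below are hypothetical; a real system would use pretrained word2vec-style embeddings:

```python
import numpy as np

# Hypothetical word vectors (in practice: pretrained embeddings).
vecs = {
    "football": np.array([1.0, 0.0]),
    "goal":     np.array([0.9, 0.1]),
    "match":    np.array([0.8, 0.2]),
    "sports":   np.array([0.95, 0.05]),
    "politics": np.array([0.0, 1.0]),
}

def cos(u, v):
    """Cosine similarity of two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_labels(doc_words, labels):
    """Rank candidate labels by similarity between the label-name vector
    and the averaged document word vectors; no label-specific training."""
    doc_vec = np.mean([vecs[w] for w in doc_words if w in vecs], axis=0)
    return sorted(labels, key=lambda label: cos(vecs[label], doc_vec),
                  reverse=True)

ranked = rank_labels(["football", "goal", "match"], ["sports", "politics"])
assert ranked[0] == "sports"
```

Multi-label prediction then follows by keeping every label whose similarity exceeds a threshold, rather than only the top-ranked one.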
The eras and trends of automatic short answer grading
Authors:
Burrows, Steven; Gurevych, Iryna; Stein, Benno
In:
International Journal of Artificial Intelligence in Education, 25 (2015) 1, pp. 60-117
DOI:
10.1007/s40593-014-0026-8
URL:
http://link.springer.com/article/10.1007/s40593-014-0026-8
Document type:
3a. Articles in peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Automation; Computational linguistics; Computer-assisted methods; Question; Performance assessment; Method; Grading; Technology-based testing; Test item
Abstract:
Automatic short answer grading (ASAG) is the task of assessing short natural language responses to objective questions using computational methods. Research activity in this field has increased enormously of late, with over 80 papers fitting a definition of ASAG. However, past efforts have generally been ad hoc and, until recently, non-comparable, hence the need for a unified view of the whole field. The goal of this paper is to address this aim with a comprehensive review of ASAG research and systems according to history and components. Our historical analysis identifies 35 ASAG systems within 5 temporal themes that mark advancement in methodology or evaluation. In contrast, our component analysis reviews 6 common dimensions from preprocessing to effectiveness. A key conclusion is that an era of evaluation is the newest trend in ASAG research, which is paving the way for the consolidation of the field. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
Analyzing domain suitability of a sentiment lexicon by identifying distributionally bipolar words
Authors:
Flekova, Lucie; Ruppert, Eugen; Preotiuc-Pietro, Daniel
From:
Association for Computational Linguistics (Ed.): 6th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA 2015): Workshop proceedings, 17 September 2015, Lisboa, Portugal, Red Hook, NY: Association for Computational Linguistics, 2015, pp. 77-84
URL:
http://www.emnlp2015.org/proceedings/WASSA/WASSA-2015.pdf#page=89
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Automation; Computational linguistics; Emotion; Communication; Lexicography; Lexicon; Online; Quality; Social software; Text analysis; Thesaurus
Abstract:
Contemporary sentiment analysis approaches rely on lexicon-based methods, mainly because of their simplicity, although the best empirical results can be achieved by more complex techniques. We introduce a method to assess the suitability of generic sentiment lexicons for a given domain, namely to identify frequent bigrams where a polar word switches polarity. Our bigrams are scored using Lexicographer's Mutual Information, leveraging large automatically obtained corpora. Our score matches human perception of polarity, and we demonstrate improvements in classification results using our enhanced context-aware method. Our method improves the assessment of lexicon-based sentiment detection and can further be used to quantify ambiguous words. (DIPF/Orig.)
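Lexicographer's Mutual Information (LMI) weights the pointwise mutual information of a bigram by its frequency, so that frequent, strongly associated bigrams score highest. A minimal sketch with hypothetical corpus counts (the paper's corpora and candidate extraction are not reproduced):

```python
import math

def lmi(count_xy, count_x, count_y, n_total):
    """Lexicographer's Mutual Information for a bigram (x, y):
    the bigram count times its pointwise mutual information,
    PMI = log2(count(x,y) * N / (count(x) * count(y)))."""
    pmi = math.log2((count_xy * n_total) / (count_x * count_y))
    return count_xy * pmi

# Hypothetical counts for a bigram where a "positive" lexicon word
# occurs in a polarity-switching context (e.g., "cold" + domain noun).
score = lmi(count_xy=50, count_x=1000, count_y=200, n_total=1_000_000)
assert score > 0
```

Ranking a polar word's bigrams by LMI in separate positive and negative contexts is one way to surface the distributionally bipolar words the paper targets.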
Linking the thoughts. Analysis of argumentation structures in scientific publications
Authors:
Kirschner, Christian; Eckle-Kohler, Judith; Gurevych, Iryna
From:
Association for Computational Linguistics (Ed.): Proceedings of the 2nd Workshop on Argumentation Mining, held in conjunction with the 2015 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL HLT 2015), Denver, CO: Association for Computational Linguistics, 2015, pp. 1-11
URL:
https://aclweb.org/anthology/W/W15/W15-05.pdf
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Argumentation; Automation; Educational research; Computational linguistics; Data mining; Classification; Text analysis; Publication
Abstract:
This paper presents the results of an annotation study focused on the fine-grained analysis of argumentation structures in scientific publications. Our new annotation scheme specifies four types of binary argumentative relations between sentences, resulting in the representation of arguments as small graph structures. We developed an annotation tool that supports the annotation of such graphs and carried out an annotation study with four annotators on 24 scientific articles from the domain of educational research. For calculating the inter-annotator agreement, we adapted existing measures and developed a novel graph-based agreement measure that reflects the semantic similarity of different annotation graphs. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
Constructive feedback, thinking process and cooperation. Assessing the quality of classroom interaction
Authors:
Sousa, Tahir; Flekova, Lucie; Mieskes, Margot; Gurevych, Iryna
From:
Möller, Sebastian (Ed.): Proceedings of the Interspeech 2015 Conference Dresden, Berlin: Technische Universität, 2015, pp. 2739-2743
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Computational linguistics; Data analysis; Thinking; Germany; Discourse analysis; Feedback; Interaction analysis; Classification; Cooperation; Mathematics instruction; Quality; School class; Switzerland; Semantics; Social interaction; Language analysis; Classroom research; Video
Abstract:
Analyzing and assessing the quality of classroom lessons along a range of quality dimensions is a central topic in educational research, as it enables the development of teacher trainings and interventions to improve lesson quality. We model this assessment as a text classification task, exploiting linguistic features to predict scores on several lesson quality dimensions relevant to educational researchers. Our work draws on a variety of phenomena from real classroom interactions, among them paralinguistic features such as laughter. We used these features to train machine learning models to assess various quality dimensions of school lessons. Our results show that features focusing on discourse and semantics are especially beneficial for this classification task. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
Predicting the difficulty of language proficiency tests
Authors:
Beinborn, Lisa; Zesch, Torsten; Gurevych, Iryna
In:
Transactions of the Association for Computational Linguistics, 2 (2014), pp. 517-529
URL:
http://tacl2013.cs.columbia.edu/ojs/index.php/tacl/article/view/414/88
Document type:
3a. Articles in peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Computational linguistics; Data analysis; Germany; Foreign language; Knowledge; Learning achievement; Prediction; Difficulty; Language proficiency; Language test; Student; Method
Abstract:
Language proficiency tests are used to evaluate and compare the progress of language learners. We present an approach for automatic difficulty prediction of C-tests that performs on par with human experts. On the basis of a detailed analysis of newly collected data, we develop a model for C-test difficulty introducing four dimensions: solution difficulty, candidate ambiguity, inter-gap dependency, and paragraph difficulty. We show that cues from all four dimensions contribute to C-test difficulty. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
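For readers unfamiliar with the test format: a C-test deletes the second half of every second word in a text, and the learner must restore the missing halves. The paper's difficulty model is not reproduced here, but the standard gap construction it operates on can be sketched as follows (the example sentence is hypothetical):

```python
def make_ctest(sentence, start=2, step=2):
    """Build a C-test item: from the `start`-th word on, delete the second
    half of every `step`-th word (the standard C-test gap scheme)."""
    words = sentence.split()
    gapped = []
    for i, word in enumerate(words):
        if i >= start and (i - start) % step == 0 and len(word) > 1:
            keep = (len(word) + 1) // 2  # keep the first half, rounded up
            gapped.append(word[:keep] + "_" * (len(word) - keep))
        else:
            gapped.append(word)
    return " ".join(gapped)

item = make_ctest("Language proficiency tests evaluate the progress of learners")
# → "Language proficiency tes__ evaluate th_ progress o_ learners"
```

Each gap in such an item is a candidate for the paper's four difficulty dimensions, e.g., how many valid completions it admits (candidate ambiguity) and how much it depends on solving neighboring gaps (inter-gap dependency).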