Search results from the DIPF publication database
Your query:
(Keywords: "Computerlinguistik")
110 results found
Medical concept embeddings via labeled background corpora
Authors:
Mencía, Eneldo Loza; De Melo, Gerard; Nam, Jinseok
From:
European Language Resources Association (Ed.): Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), Portoroz: European Language Resources Association, 2016, pp. 3629-3636
URL:
http://www.lrec-conf.org/proceedings/lrec2016/pdf/1190_Paper.pdf
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Algorithm; Automation; Computational linguistics; Medicine; Semantics; Language; Text analysis
Abstract:
In recent years, we have seen increasing interest in low-dimensional vector representations of words. Among other things, these facilitate computing word similarity and relatedness scores. The best-known algorithms for producing representations of this sort are the word2vec approaches. In this paper, we investigate a new model to induce such vector spaces for medical concepts, based on a joint objective that exploits not only word co-occurrences but also manually labeled documents, as available from sources such as PubMed. Our extensive experimental analysis shows that our embeddings lead to significantly higher correlations with human similarity and relatedness assessments than previous work. Due to the simplicity and versatility of vector representations, these findings suggest that our resource can easily be used as a drop-in replacement to improve any system relying on medical concept similarity measures. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
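The paper's joint training objective is not reproduced here, but the downstream use it targets — scoring the similarity of medical concepts by comparing their embedding vectors — can be sketched as follows. All vectors and concept names below are hypothetical toy values standing in for embeddings learned from PubMed-style corpora:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy 4-dimensional "concept embeddings" (hypothetical values).
embeddings = {
    "myocardial_infarction": np.array([0.9, 0.1, 0.3, 0.0]),
    "heart_attack":          np.array([0.8, 0.2, 0.4, 0.1]),
    "influenza":             np.array([0.1, 0.9, 0.0, 0.5]),
}

# Related concepts should score higher than unrelated ones.
sim_related = cosine_similarity(embeddings["myocardial_infarction"],
                                embeddings["heart_attack"])
sim_unrelated = cosine_similarity(embeddings["myocardial_infarction"],
                                  embeddings["influenza"])
assert sim_related > sim_unrelated
```

Because the resource reduces to a plain vector lookup plus cosine similarity, it can indeed serve as a drop-in replacement for any similarity measure with this interface.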
Enriching Wikidata with frame semantics
Authors:
Mousselly-Sergieh, Hatem; Gurevych, Iryna
From:
Association for Computational Linguistics (Ed.): Proceedings of the 5th Workshop on Automated Knowledge Base Construction (AKBC) 2016, held in conjunction with NAACL 2016, Stroudsburg, PA: Association for Computational Linguistics, 2016, pp. 29-34
URL:
https://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2016/2016_NAACL_AKBC_HMS.pdf
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Automation; Computational linguistics; Lexicon; Multilingualism; Online; Semantics
Abstract (English):
Wikidata is a large-scale, multilingual, and freely available knowledge base. It contains more than 14 million facts but is still missing linguistic information. In this paper, we aim to bridge this gap by aligning Wikidata with the FrameNet lexicon. We propose an approach based on word embeddings to identify a mapping between Wikidata relations, called properties, and FrameNet frames, and to annotate the arguments of each relation with the semantic roles of the matching frames. Early empirical results show the advantage of our approach compared to other baseline methods. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
Domain-specific corpus expansion with focused webcrawling
Authors:
Remus, Steffen; Biemann, Chris
From:
European Language Resources Association (Ed.): Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), Portoroz: European Language Resources Association, 2016, pp. 3607-3611
URL:
http://www.lrec-conf.org/proceedings/lrec2016/pdf/316_Paper.pdf
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Algorithm; Automation; Education; Computational linguistics; Data mining; Hypertext; Model; Language; Text; Text analysis
Abstract:
This work presents a straightforward method for extending or creating in-domain web corpora by focused webcrawling. The focused webcrawler uses statistical n-gram language models to estimate the relatedness of documents and weblinks, and needs as input only n-grams or plain texts of a predefined domain plus seed URLs as starting points. Two experiments demonstrate that our focused crawler is able to stay focused with respect to both domain and language. The first experiment shows that the crawler stays in the target domain; the second demonstrates that language models trained on focused crawls obtain better perplexity scores on in-domain corpora. We distribute the focused crawler as open-source software. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
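The crawler's actual relevance estimate uses statistical n-gram language models; as a rough stand-in for that idea, the sketch below scores a candidate page by the fraction of its n-grams that appear in a domain profile built from seed texts. All example texts are hypothetical:

```python
def ngrams(text, n=2):
    """Lowercased word n-grams of a text."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def domain_score(text, domain_ngrams, n=2):
    """Fraction of a document's n-grams found in the domain profile.
    A crude stand-in for the paper's n-gram language-model relatedness."""
    grams = ngrams(text, n)
    if not grams:
        return 0.0
    return sum(g in domain_ngrams for g in grams) / len(grams)

# Hypothetical seed texts defining an education-related domain.
domain = set(ngrams("short answer grading in education")) \
       | set(ngrams("language proficiency test difficulty"))

on_topic = domain_score("short answer grading in education research", domain)
off_topic = domain_score("a recipe for chocolate cake with strawberries", domain)
assert on_topic > off_topic
```

In a full crawler, such scores would order the link frontier (e.g., via a priority queue), so that high-scoring pages and their outlinks are fetched first and the crawl stays in domain.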
Crowdsourcing a large dataset of domain-specific context-sensitive semantic verb relations
Authors:
Sukhareva, Maria; Eckle-Kohler, Judith; Habernal, Ivan; Gurevych, Iryna
From:
European Language Resources Association (Ed.): Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), Portoroz: European Language Resources Association, 2016, pp. 2131-2137
URL:
https://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2016/lrec2016_sukhareva.pdf
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Automation; Computational linguistics; Data mining; Classification; Semantics; Text analysis
Abstract (English):
We present a new large dataset of 12,403 context-sensitive verb relations manually annotated via crowdsourcing. These relations capture fine-grained semantic information between verb-centric propositions, such as temporal or entailment relations. We propose a novel semantic verb relation scheme and design a multi-step annotation approach for scaling up the annotations using crowdsourcing. We employ several quality measures and report on agreement scores. The resulting dataset is available under a permissive Creative Commons license. It represents a valuable resource for various applications, such as automatic information consolidation or automatic summarization. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
Using semantic similarity for multi-label zero-shot classification of text documents
Authors:
Veeranna, Sappadla Prateek; Nam, Jinseok; Mencía, Eneldo Loza; Fürnkranz, Johannes
From:
European Symposium on Artificial Neural Networks (Ed.): ESANN 2016 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Bruges (Belgium), 27-29 April 2016, Bruges: European Symposium on Artificial Neural Networks, 2016, pp. 423-428
URL:
https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2016-174.pdf
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Computational linguistics; Classification; Semantics; Text
Abstract (English):
In this paper, we examine a simple approach to zero-shot multi-label text classification, i.e., the problem of predicting multiple, possibly previously unseen labels for a document. In particular, we propose to use a semantic embedding of label and document words and to base the prediction of previously unseen labels on the similarity between the label name and the document words in this embedding. Experiments on three textual datasets across various domains show that even such a simple technique yields considerable performance improvements over a simple uninformed baseline. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
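The core zero-shot idea — scoring a label never seen in training by comparing its name's embedding to the document's word embeddings — can be sketched with toy two-dimensional vectors. All vectors below are hypothetical; a real system would use pretrained word2vec-style embeddings:

```python
import numpy as np

# Hypothetical word vectors (in practice: pretrained embeddings).
vecs = {
    "football": np.array([1.0, 0.0]),
    "goal":     np.array([0.9, 0.1]),
    "match":    np.array([0.8, 0.2]),
    "sports":   np.array([0.95, 0.05]),
    "politics": np.array([0.0, 1.0]),
}

def cos(u, v):
    """Cosine similarity of two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_labels(doc_words, labels):
    """Rank candidate labels by similarity between the label-name vector
    and the averaged document word vectors; no label-specific training."""
    doc_vec = np.mean([vecs[w] for w in doc_words if w in vecs], axis=0)
    return sorted(labels, key=lambda label: cos(vecs[label], doc_vec),
                  reverse=True)

ranked = rank_labels(["football", "goal", "match"], ["sports", "politics"])
assert ranked[0] == "sports"
```

Multi-label prediction then follows by keeping every label whose similarity exceeds a threshold, rather than only the top-ranked one.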
The eras and trends of automatic short answer grading
Authors:
Burrows, Steven; Gurevych, Iryna; Stein, Benno
In:
International Journal of Artificial Intelligence in Education, 25 (2015) 1, pp. 60-117
DOI:
10.1007/s40593-014-0026-8
URL:
http://link.springer.com/article/10.1007/s40593-014-0026-8
Document type:
3a. Articles in peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Automation; Computational linguistics; Computer-assisted methods; Question; Performance assessment; Method; Grading; Technology-based testing; Test item
Abstract:
Automatic short answer grading (ASAG) is the task of assessing short natural language responses to objective questions using computational methods. Research activity in this field has increased enormously of late, with over 80 papers fitting a definition of ASAG. However, past efforts have generally been ad hoc and, until recently, non-comparable, hence the need for a unified view of the whole field. The goal of this paper is to address this aim with a comprehensive review of ASAG research and systems according to history and components. Our historical analysis identifies 35 ASAG systems within 5 temporal themes that mark advancement in methodology or evaluation. In contrast, our component analysis reviews 6 common dimensions from preprocessing to effectiveness. A key conclusion is that an era of evaluation is the newest trend in ASAG research, which is paving the way for the consolidation of the field. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
Analyzing domain suitability of a sentiment lexicon by identifying distributionally bipolar words
Authors:
Flekova, Lucie; Ruppert, Eugen; Preotiuc-Pietro, Daniel
From:
Association for Computational Linguistics (Ed.): 6th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA 2015): Workshop proceedings, 17 September 2015, Lisboa, Portugal, Red Hook, NY: Association for Computational Linguistics, 2015, pp. 77-84
URL:
http://www.emnlp2015.org/proceedings/WASSA/WASSA-2015.pdf#page=89
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Automation; Computational linguistics; Emotion; Communication; Lexicography; Lexicon; Online; Quality; Social software; Text analysis; Thesaurus
Abstract:
Contemporary sentiment analysis approaches rely on lexicon-based methods, mainly because of their simplicity, although the best empirical results can be achieved by more complex techniques. We introduce a method to assess the suitability of generic sentiment lexicons for a given domain, namely to identify frequent bigrams where a polar word switches polarity. Our bigrams are scored using Lexicographer's Mutual Information, leveraging large automatically obtained corpora. Our score matches human perception of polarity, and we demonstrate improvements in classification results using our enhanced context-aware method. Our method improves the assessment of lexicon-based sentiment detection and can further be used to quantify ambiguous words. (DIPF/Orig.)
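Lexicographer's Mutual Information (LMI) weights the pointwise mutual information of a bigram by its frequency, so that frequent, strongly associated bigrams score highest. A minimal sketch with hypothetical corpus counts (the paper's corpora and candidate extraction are not reproduced):

```python
import math

def lmi(count_xy, count_x, count_y, n_total):
    """Lexicographer's Mutual Information for a bigram (x, y):
    the bigram count times its pointwise mutual information,
    PMI = log2(count(x,y) * N / (count(x) * count(y)))."""
    pmi = math.log2((count_xy * n_total) / (count_x * count_y))
    return count_xy * pmi

# Hypothetical counts for a bigram where a "positive" lexicon word
# occurs in a polarity-switching context (e.g., "cold" + domain noun).
score = lmi(count_xy=50, count_x=1000, count_y=200, n_total=1_000_000)
assert score > 0
```

Ranking a polar word's bigrams by LMI in separate positive and negative contexts is one way to surface the distributionally bipolar words the paper targets.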
Linking the thoughts. Analysis of argumentation structures in scientific publications
Authors:
Kirschner, Christian; Eckle-Kohler, Judith; Gurevych, Iryna
From:
Association for Computational Linguistics (Ed.): Proceedings of the 2nd Workshop on Argumentation Mining, held in conjunction with the 2015 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL HLT 2015), Denver, CO: Association for Computational Linguistics, 2015, pp. 1-11
URL:
https://aclweb.org/anthology/W/W15/W15-05.pdf
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Argumentation; Automation; Educational research; Computational linguistics; Data mining; Classification; Text analysis; Publication
Abstract:
This paper presents the results of an annotation study focused on the fine-grained analysis of argumentation structures in scientific publications. Our new annotation scheme specifies four types of binary argumentative relations between sentences, resulting in the representation of arguments as small graph structures. We developed an annotation tool that supports the annotation of such graphs and carried out an annotation study with four annotators on 24 scientific articles from the domain of educational research. For calculating the inter-annotator agreement, we adapted existing measures and developed a novel graph-based agreement measure that reflects the semantic similarity of different annotation graphs. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
Constructive feedback, thinking process and cooperation. Assessing the quality of classroom interaction
Authors:
Sousa, Tahir; Flekova, Lucie; Mieskes, Margot; Gurevych, Iryna
From:
Möller, Sebastian (Ed.): Proceedings of the Interspeech 2015 Conference Dresden, Berlin: Technische Universität, 2015, pp. 2739-2743
Document type:
4. Contributions to edited volumes; conference proceedings
Language:
English
Keywords:
Computational linguistics; Data analysis; Thinking; Germany; Discourse analysis; Feedback; Interaction analysis; Classification; Cooperation; Mathematics instruction; Quality; School class; Switzerland; Semantics; Social interaction; Language analysis; Classroom research; Video
Abstract:
Analyzing and assessing the quality of classroom lessons along a range of quality dimensions is a central topic in educational research, as it enables the development of teacher trainings and interventions to improve lesson quality. We model this assessment as a text classification task, exploiting linguistic features to predict scores on several lesson quality dimensions relevant to educational researchers. Our work draws on a variety of phenomena from real classroom interactions, among them paralinguistic features such as laughter. We used these features to train machine learning models to assess various quality dimensions of school lessons. Our results show that features focusing on discourse and semantics are especially beneficial for this classification task. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
Predicting the difficulty of language proficiency tests
Authors:
Beinborn, Lisa; Zesch, Torsten; Gurevych, Iryna
In:
Transactions of the Association for Computational Linguistics, 2 (2014), pp. 517-529
URL:
http://tacl2013.cs.columbia.edu/ojs/index.php/tacl/article/view/414/88
Document type:
3a. Articles in peer-reviewed journals; article (no special category)
Language:
English
Keywords:
Computational linguistics; Data analysis; Germany; Foreign language; Knowledge; Learning achievement; Prediction; Difficulty; Language proficiency; Language test; Student; Method
Abstract:
Language proficiency tests are used to evaluate and compare the progress of language learners. We present an approach for automatic difficulty prediction of C-tests that performs on par with human experts. On the basis of a detailed analysis of newly collected data, we develop a model for C-test difficulty introducing four dimensions: solution difficulty, candidate ambiguity, inter-gap dependency, and paragraph difficulty. We show that cues from all four dimensions contribute to C-test difficulty. (DIPF/Orig.)
DIPF department:
Informationszentrum Bildung
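For readers unfamiliar with the test format: a C-test deletes the second half of every second word in a text, and the learner must restore the missing halves. The paper's difficulty model is not reproduced here, but the standard gap construction it operates on can be sketched as follows (the example sentence is hypothetical):

```python
def make_ctest(sentence, start=2, step=2):
    """Build a C-test item: from the `start`-th word on, delete the second
    half of every `step`-th word (the standard C-test gap scheme)."""
    words = sentence.split()
    gapped = []
    for i, word in enumerate(words):
        if i >= start and (i - start) % step == 0 and len(word) > 1:
            keep = (len(word) + 1) // 2  # keep the first half, rounded up
            gapped.append(word[:keep] + "_" * (len(word) - keep))
        else:
            gapped.append(word)
    return " ".join(gapped)

item = make_ctest("Language proficiency tests evaluate the progress of learners")
# → "Language proficiency tes__ evaluate th_ progress o_ learners"
```

Each gap in such an item is a candidate for the paper's four difficulty dimensions, e.g., how many valid completions it admits (candidate ambiguity) and how much it depends on solving neighboring gaps (inter-gap dependency).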