Search results in the DIPF database of publications

Your query:

(Keywords: "Computational linguistics")

The Open Linguistics Working Group
Chiarcos, Christian; Hellmann, Sebastian; Nordhoff, Sebastian; Moran, Steven; Littauer, Richard; […]
Book Chapter | From: Calzolari, Nicoletta (Ed.): Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC) | Istanbul: European Language Resources Association | 2012 | 32694
Endnote:
Author(s): Chiarcos, Christian; Hellmann, Sebastian; Nordhoff, Sebastian; Moran, Steven; Littauer, Richard; Eckle-Kohler, Judith; Gurevych, Iryna; Hartmann, Silvana; Matuschek, Michael; Meyer, Christian M.
Title: The Open Linguistics Working Group
In: Calzolari, Nicoletta (Ed.): Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC), Istanbul: European Language Resources Association, 2012, pp. 3603-3610
URL: http://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2012/LREC2012owlgCameraReady.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Working group; Exchange; Computational linguistics; Data; Database; Linguistics; Open access; Text
Abstract (english): This paper describes the Open Linguistics Working Group (OWLG) of the Open Knowledge Foundation (OKFN). The OWLG is an initiative concerned with linguistic data by scholars from diverse fields, including linguistics, NLP, and information science. The primary goal of the working group is to promote the idea of open linguistic resources, to develop means for their representation and to encourage the exchange of ideas across different disciplines. This paper summarizes the progress of the working group, goals that have been identified, problems that we are going to address, and recent activities and ongoing developments. Here, we put particular emphasis on the development of a Linked Open Data (sub-)cloud of linguistic resources that is currently being pursued by several OWLG members.
DIPF-Departments: Informationszentrum Bildung

CSniper - Annotation-by-query for non-canonical constructions in large corpora
Eckart de Castilho, Richard; Bartsch, Sabine; Gurevych, Iryna
Book Chapter | From: Association for Computational Linguistics (Ed.): Proceedings of the 50th Meeting of the Association for Computational Linguistics (ACL), Jeju, Republic of Korea, 8-14 July 2012 | Stroudsburg, PA: Association for Computational Linguistics | 2012 | 32869
Endnote:
Author(s): Eckart de Castilho, Richard; Bartsch, Sabine; Gurevych, Iryna
Title: CSniper - Annotation-by-query for non-canonical constructions in large corpora
In: Association for Computational Linguistics (Ed.): Proceedings of the 50th Meeting of the Association for Computational Linguistics (ACL), Jeju, Republic of Korea, 8-14 July 2012, Stroudsburg, PA: Association for Computational Linguistics, 2012, pp. 85-90
URL: https://aclanthology.org/P12-3.pdf#page=97
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Computational linguistics
Abstract: We present CSNIPER (Corpus Sniper), a tool that implements (i) a web-based multiuser scenario for identifying and annotating non-canonical grammatical constructions in large corpora based on linguistic queries and (ii) evaluation of annotation quality by measuring inter-rater agreement. This annotation-by-query approach efficiently harnesses expert knowledge to identify instances of linguistic phenomena that are hard to identify by means of existing automatic annotation tools.
DIPF-Departments: Informationszentrum Bildung

Standardizing Lexical-Semantic Resources - Fleshing out the abstract standard LMF
Eckle-Kohler, Judith; Gurevych, Iryna
Book Chapter | From: Jancsary, Jeremy (Ed.): Proceedings of the Workshop "SFLR 2012: Workshop on Standards for Language Resources" held in conjunction with the 11th Conference on Natural Language Processing (KONVENS 2012) | Vienna, Austria: Österreichische Gesellschaft für Artificial Intelligence | 2012 | 33095
Endnote:
Author(s): Eckle-Kohler, Judith; Gurevych, Iryna
Title: Standardizing Lexical-Semantic Resources - Fleshing out the abstract standard LMF
In: Jancsary, Jeremy (Ed.): Proceedings of the Workshop "SFLR 2012: Workshop on Standards for Language Resources" held in conjunction with the 11th Conference on Natural Language Processing (KONVENS 2012), Vienna, Austria: Österreichische Gesellschaft für Artificial Intelligence, 2012, pp. 496-505
URL: http://www.oegai.at/konvens2012/proceedings/75_ecklekohler12w/75_ecklekohler12w.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Computational linguistics
Abstract (english): This paper describes the application of the Lexical Markup Framework (LMF) for standardizing lexical-semantic resources in the context of NLP. More specifically, we highlight the question how lexical-semantic resources can be made semantically interoperable by means of LMF and ISOCat. The LMF model UBY-LMF, an instantiation of LMF specifically for NLP, serves as an example to illustrate the path towards semantic interoperability of lexical resources.
DIPF-Departments: Informationszentrum Bildung

Subcat-LMF: Fleshing out a standardized format for subcategorization frame interoperability
Eckle-Kohler, Judith; Gurevych, Iryna
Book Chapter | From: The Association for Computational Linguistics (Ed.): Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012) | Avignon: Association for Computational Linguistics | 2012 | 32697
Endnote:
Author(s): Eckle-Kohler, Judith; Gurevych, Iryna
Title: Subcat-LMF: Fleshing out a standardized format for subcategorization frame interoperability
In: The Association for Computational Linguistics (Ed.): Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012), Avignon: Association for Computational Linguistics, 2012, pp. 550-560
URL: http://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2012/EACL2012subcatLMFcamera.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Computational linguistics; German; English; Lexicon; Language comparison
Abstract (english): This paper describes Subcat-LMF, an ISO-LMF compliant lexicon representation format featuring a uniform representation of subcategorization frames (SCFs) for the two languages English and German. Subcat-LMF is able to represent SCFs at a very fine-grained level. We utilized Subcat-LMF to standardize lexicons with large-scale SCF information: the English VerbNet and two German lexicons, i.e., a subset of IMSlex and GermaNet verbs. To evaluate our LMF-model, we performed a cross-lingual comparison of SCF coverage and overlap for the standardized versions of the English and German lexicons. The Subcat-LMF DTD, the conversion tools and the standardized versions of VerbNet and IMSlex subset are publicly available.
DIPF-Departments: Informationszentrum Bildung

UBY-LMF - A uniform model for standardizing heterogeneous lexical-semantic resources in ISO-LMF
Eckle-Kohler, Judith; Gurevych, Iryna; Hartmann, Silvana; Matuschek, Michael; Meyer, Christian M.
Book Chapter | From: Calzolari, Nicoletta (Ed.): Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC) | Istanbul: European Language Resources Association | 2012 | 32693
Endnote:
Author(s): Eckle-Kohler, Judith; Gurevych, Iryna; Hartmann, Silvana; Matuschek, Michael; Meyer, Christian M.
Title: UBY-LMF - A uniform model for standardizing heterogeneous lexical-semantic resources in ISO-LMF
In: Calzolari, Nicoletta (Ed.): Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC), Istanbul: European Language Resources Association, 2012, pp. 275-282
URL: http://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2012/LREC2012_ubyLMFcamera-Ready.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Computational linguistics; Computer-assisted method; German; English; Information; Lexicon; Multilingualism; Model; Ontology; Semantic Web; Software technology; Social software; Language analysis; Standard
Abstract (english): We present UBY-LMF, an LMF-based model for large-scale, heterogeneous multilingual lexical-semantic resources (LSRs). UBY-LMF allows the standardization of LSRs down to a fine-grained level of lexical information by employing a large number of Data Categories from ISOCat. We evaluate UBY-LMF by converting nine LSRs in two languages to the corresponding format: the English WordNet, Wiktionary, Wikipedia, OmegaWiki, FrameNet and VerbNet and the German Wikipedia, Wiktionary and GermaNet. The resulting LSR, UBY (Gurevych et al., 2012), holds interoperable versions of all nine resources which can be queried by an easy to use public Java API. UBY-LMF covers a wide range of information types from expert-constructed and collaboratively constructed resources for English and German, also including links between different resources at the word sense level. It is designed to accommodate further resources and languages as well as automatically mined lexical-semantic knowledge.
DIPF-Departments: Informationszentrum Bildung

UKP-UBC entity linking at TAC-KBP
Erbs, Nicolai; Agirre, Eneko; Soroa, Aitor; Barrena, Ander; Etxebarria, Ugaitz; Gurevych, Iryna; […]
Book Chapter | From: Kay, Martin; Boitet, Christian (Eds.): Proceedings of the Fifth Text Analysis Conference (TAC 2012), November 5-6, 2012, National Institute of Standards and Technology, Gaithersburg, Maryland, USA | Gaithersburg, MD: National Institute of Standards and Technology (NIST) | 2012 | 33144
Endnote:
Author(s): Erbs, Nicolai; Agirre, Eneko; Soroa, Aitor; Barrena, Ander; Etxebarria, Ugaitz; Gurevych, Iryna; Zesch, Torsten
Title: UKP-UBC entity linking at TAC-KBP
In: Kay, Martin; Boitet, Christian (Eds.): Proceedings of the Fifth Text Analysis Conference (TAC 2012), November 5-6, 2012, National Institute of Standards and Technology, Gaithersburg, Maryland, USA, Gaithersburg, MD: National Institute of Standards and Technology (NIST), 2012, p. 6
URL: https://tac.nist.gov/publications/2012/participant.papers/UKP_UBC.proceedings.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Computational linguistics
Abstract: This paper describes our system for the entity linking task at TAC KBP 2012. We developed a supervised system using dictionary-based, similarity-based, and graph-based features. As a global feature, we apply Personalized PageRank with Wikipedia to weight the list of entity candidates. We use two Wikipedia versions with different timestamps to enrich the knowledge base and develop an algorithm for mapping between the two Wikipedia versions. We observed a large drop in system performance when moving from training data to test data. Our error analysis showed that the guidelines for mention annotation were not followed by annotators. An additional mention detection component should improve performance to the expected level. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

Behind the article: Recognizing dialog acts in Wikipedia talk pages
Ferschke, Oliver; Gurevych, Iryna; Chebotar, Yevgen
Book Chapter | From: The Association for Computational Linguistics (ACL) (Ed.): Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012) | Avignon: Association for Computational Linguistics | 2012 | 32695
Endnote:
Author(s): Ferschke, Oliver; Gurevych, Iryna; Chebotar, Yevgen
Title: Behind the article: Recognizing dialog acts in Wikipedia talk pages
In: The Association for Computational Linguistics (ACL) (Ed.): Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012), Avignon: Association for Computational Linguistics, 2012, pp. 777-786
URL: http://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2012/EACL_2012_OF.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Computational linguistics; Dialogue; Language analysis; Language; Text
Abstract (english): In this paper, we propose an annotation schema for the discourse analysis of Wikipedia Talk pages aimed at the coordination efforts for article improvement. We apply the annotation schema to a corpus of 100 Talk pages from the Simple English Wikipedia and make the resulting dataset freely available for download. Furthermore, we perform automatic dialog act classification on Wikipedia discussions and achieve an average F1-score of 0.82 with our classification pipeline.
DIPF-Departments: Informationszentrum Bildung

Uby - a large-scale unified lexical-semantic resource based on LMF
Gurevych, Iryna; Eckle-Kohler, Judith; Hartmann, Silvana; Matuschek, Michael; Meyer, Christian M.; […]
Book Chapter | From: Association for Computational Linguistics (Ed.): Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012) | Avignon: Association for Computational Linguistics | 2012 | 32696
Endnote:
Author(s): Gurevych, Iryna; Eckle-Kohler, Judith; Hartmann, Silvana; Matuschek, Michael; Meyer, Christian M.; Wirth, Christian
Title: Uby - a large-scale unified lexical-semantic resource based on LMF
In: Association for Computational Linguistics (Ed.): Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012), Avignon: Association for Computational Linguistics, 2012, pp. 580-590
URL: http://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2012/uby_eacl2012_cameraready.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Computational linguistics; Computer-assisted method; German; English; Information; Lexicon; Multilingualism; Model; Ontology; Semantic Web; Software technology; Social software; Language analysis; Standard
Abstract (english): We present UBY, a large-scale lexical semantic resource combining a wide range of information from expert-constructed and collaboratively constructed resources for English and German. It currently contains nine resources in two languages: English WordNet, Wiktionary, Wikipedia, FrameNet and VerbNet, German Wikipedia, Wiktionary and GermaNet, and multilingual OmegaWiki modeled according to the LMF standard. For FrameNet, VerbNet and all collaboratively constructed resources, this is done for the first time. Our LMF model captures lexical information at a fine-grained level by employing a large number of Data Categories from ISOCat and is designed to be directly extensible by new languages and resources. All resources in UBY can be accessed with an easy to use publicly available API.
DIPF-Departments: Informationszentrum Bildung

OntoWiktionary - constructing an ontology from the collaborative online dictionary Wiktionary
Meyer, Christian M.; Gurevych, Iryna
Book Chapter | From: Pazienza, Maria Teresa; Stellato, Armando (Eds.): Semi-automatic ontology development: Processes and resources | Hershey, PA: IGI Global | 2012 | 33100
Endnote:
Author(s): Meyer, Christian M.; Gurevych, Iryna
Title: OntoWiktionary - constructing an ontology from the collaborative online dictionary Wiktionary
In: Pazienza, Maria Teresa; Stellato, Armando (Eds.): Semi-automatic ontology development: Processes and resources, Hershey, PA: IGI Global, 2012, pp. 131-161
URL: http://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2012/igi-saod2011-meyer-ontowiktionary.pdf
Publication Type: 4. Contributions to edited volumes; Edited volume (no special category)
Language: English
Keywords: Computational linguistics; Computer-assisted method; Evaluation; Conception; Lexicon; Multilingualism; Ontology; Social software; World Wide Web 2.0
Abstract: The semi-automatic development of ontologies is an important field of research, since existing ontologies often suffer from their small size, unaffordable construction cost, and limited quality of ontology learning systems. The main objective of this chapter is to introduce Wiktionary, which is a collaborative online dictionary encoding information about words, word senses, and relations between them, as a resource for ontology construction. The authors find that a Wiktionary-based ontology can exceed the size of, for example, OpenCyc and OntoWordNet. One particular advantage of Wiktionary is its multilingual nature, which allows the construction of ontologies for different languages. Additionally, its collaborative construction approach means that novel concepts and domain-specific knowledge are quick to appear in the dictionary. For constructing their ontology OntoWiktionary, the authors present a two-step approach that involves (1) harvesting structured knowledge from Wiktionary and (2) ontologizing this knowledge (i.e., the formation of ontological concepts and relationships from the harvested knowledge). They evaluate their approach based on human judgments and find their new ontology to be of overall good quality. To encourage further research in this field, the authors make the final OntoWiktionary publicly available and suggest integrating this novel resource with the linked data cloud as well as other existing ontology projects.
DIPF-Departments: Informationszentrum Bildung

Wiktionary - a new rival for expert-built lexicons? Exploring the possibilities of collaborative […]
Meyer, Christian M.; Gurevych, Iryna
Book Chapter | From: Granger, Sylviane; Paquot, Magali (Eds.): Electronic lexicography | Oxford: Oxford University Press | 2012 | 33098
Endnote:
Author(s): Meyer, Christian M.; Gurevych, Iryna
Title: Wiktionary - a new rival for expert-built lexicons? Exploring the possibilities of collaborative lexicography
In: Granger, Sylviane; Paquot, Magali (Eds.): Electronic lexicography, Oxford: Oxford University Press, 2012, pp. 259-291
URL: http://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2011/oup-elex2012-meyer-wiktionary.pdf
Publication Type: 4. Contributions to edited volumes; Edited volume (no special category)
Language: English
Keywords: Analysis; Computational linguistics; Community; Lexicography; Lexicon; Multilingualism; Online; Social software; Comparison; World Wide Web 2.0
Abstract (english): With the rise of the Web 2.0, collaboratively constructed language resources are rivalling expert-built lexicons. The collaborative construction process of these resources is driven by what is called the "Wisdom of Crowds" phenomenon, which offers very promising research opportunities in the context of electronic lexicography. The vast number and broad diversity of authors yield, for instance, quickly growing and constantly updated resources. While expert-built lexicons have been extensively studied in the past, there is yet a gap in researching collaboratively constructed lexicons. We therefore provide a comprehensive description of Wiktionary - a freely available, collaborative online lexicon. We study the variety of encoded lexical, semantic, and cross-lingual knowledge of three different language editions of Wiktionary and compare the coverage of terms, lexemes, word senses, domains, and registers to multiple expert-built lexicons. We conclude our work by discussing several findings and pointing out Wiktionary's future directions and impact on lexicography.
DIPF-Departments: Informationszentrum Bildung