Search results in the DIPF database of publications

Your query:

(Keywords: "Automation")

Semi-supervised neural networks for nested named entity recognition
Nam, Jinseok
Book Chapter | In: Faaß, Gertrud; Ruppenhofer, Josef (eds.): Workshop proceedings of the 12th edition of the KONVENS Conference | Hildesheim: Universitätsverlag Hildesheim | 2014 | 35048
Endnote:
Author(s): Nam, Jinseok
Title: Semi-supervised neural networks for nested named entity recognition
In: Faaß, Gertrud; Ruppenhofer, Josef (eds.): Workshop proceedings of the 12th edition of the KONVENS Conference, Hildesheim: Universitätsverlag Hildesheim, 2014, pp. 144-148
URL: http://www.uni-hildesheim.de/konvens2014/data/konvens2014-workshop-proceedings.pdf
Publication Type: 4. Contributions to edited volumes; conference volume/conference paper/proceedings
Language: English
Keywords: Algorithm; Automation; Computational linguistics; Data; Indexing; Learning; Network; Text
Abstract (English): In this paper, we investigate a semi-supervised learning approach based on neural networks for nested named entity recognition on the GermEval 2014 dataset. The dataset consists of triples of a word, a named entity associated with that word at the first level, and one at the second level. Additionally, the tag distribution is highly skewed; that is, certain types of tags occur too rarely. Hence, we present a unified neural network architecture to deal with named entities at both levels simultaneously and to improve generalization performance on the classes that have a small number of labelled examples. (DIPF/Author)
DIPF-Departments: Informationszentrum Bildung

Knowledge discovery in scientific literature
Nam, Jinseok; Kirschner, Christian; Ma, Zheng; Erbs, Nicolai; Neumann, Susanne; Oelke, Daniela; […]
Book Chapter | In: Ruppenhofer, Josef; Faaß, Gertrud (eds.): Proceedings of the 12th edition of the KONVENS Conference | Hildesheim: Universitätsverlag Hildesheim | 2014 | 34993
Endnote:
Author(s): Nam, Jinseok; Kirschner, Christian; Ma, Zheng; Erbs, Nicolai; Neumann, Susanne; Oelke, Daniela; Remus, Steffen; Biemann, Chris; Eckle-Kohler, Judith; Fürnkranz, Johannes; Gurevych, Iryna; Rittberger, Marc; Weihe, Karsten
Title: Knowledge discovery in scientific literature
In: Ruppenhofer, Josef; Faaß, Gertrud (eds.): Proceedings of the 12th edition of the KONVENS Conference, Hildesheim: Universitätsverlag Hildesheim, 2014, pp. 66-76
URN: urn:nbn:de:0111-dipfdocs-182323
URL: http://www.dipfdocs.de/volltexte/2020/18232/pdf/Knowledge_Discovery_in_Scientific_Literature_A.pdf
Publication Type: 4. Contributions to edited volumes; conference volume/conference paper/proceedings
Language: English
Keywords: Argumentation; Automation; Educational research; Electronic library; Indexing; Information retrieval; Classification; Semantics; Structure; Text; Knowledge; Scientific literature
Abstract: Digital libraries allow us to organize a vast number of publications in a structured way and to extract information of interest to users. In order to support customized use of digital libraries, we develop novel methods and techniques in the Knowledge Discovery in Scientific Literature (KDSL) research program of our graduate school. It comprises several sub-projects to handle specific problems in their own fields. The sub-projects are tightly connected by sharing expertise to arrive at an integrated system. To make consistent progress towards enriching digital libraries to aid users by automatic search and analysis engines, all methods developed in the program are applied to the same set of freely available scientific articles. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

GermEval-2014. Nested named entity recognition with neural networks
Reimers, Nils; Eckle-Kohler, Judith; Schnober, Carsten; Kim, Jungi; Gurevych, Iryna
Book Chapter | In: Faaß, Gertrud; Ruppenhofer, Josef (eds.): Workshop Proceedings of the 12th edition of the KONVENS Conference | Hildesheim: Universitätsverlag Hildesheim | 2014 | 34989
Endnote:
Author(s): Reimers, Nils; Eckle-Kohler, Judith; Schnober, Carsten; Kim, Jungi; Gurevych, Iryna
Title: GermEval-2014. Nested named entity recognition with neural networks
In: Faaß, Gertrud; Ruppenhofer, Josef (eds.): Workshop Proceedings of the 12th edition of the KONVENS Conference, Hildesheim: Universitätsverlag Hildesheim, 2014, pp. 117-120
URL: http://www.uni-hildesheim.de/konvens2014/data/konvens2014-workshop-proceedings.pdf
Publication Type: 4. Contributions to edited volumes; conference volume/conference paper/proceedings
Language: English
Keywords: Automation; Computational linguistics; Data; Evaluation; Information; Model; Network; Language analysis; Text analysis; Knowledge
Abstract: Collobert et al. (2011) showed that deep neural network architectures achieve state-of-the-art performance in many fundamental NLP tasks, including Named Entity Recognition (NER). However, results were only reported for English. This paper reports on experiments for German Named Entity Recognition, using the data from the GermEval 2014 shared task on NER. Our system achieves an F1-measure of 75.09% according to the official metric. (DIPF/Orig.)
DIPF-Departments: Informationszentrum Bildung

The people's web meets NLP. Collaboratively Constructed Language Resources
Gurevych, Iryna; Kim, Jungi (eds.)
Compilation Book | Dordrecht: Springer | 2013 | 32811
Endnote:
Editor(s): Gurevych, Iryna; Kim, Jungi
Title: The people's web meets NLP. Collaboratively Constructed Language Resources
Published: Dordrecht: Springer, 2013 (Theory and applications of natural language processing)
DOI: 10.1007/978-3-642-35085-6
URL: https://link.springer.com/book/10.1007/978-3-642-35085-6
Publication Type: 2. Editorship; edited volume (no special category)
Language: English
Keywords: Automation; Computational linguistics; Computer game; Data mining; Research; Community; Indexing; Cooperation; Multilingualism; Methodology; Reference work; Ontology; Writing; Semantic Web; Social software; Language analysis; Language; Text analysis; Text processing; Knowledge; World Wide Web 2.0
Abstract (English): The application of collective intelligence in the domain of language yielded collaboratively constructed language resources (CCLRs) that can be used in a variety of ways. For example, Wikipedia, Wiktionary, and other language resources constructed through crowdsourcing such as Games with a Purpose and Mechanical Turk have been used in many ways in NLP. Researchers started using such resources to substitute for or supplement conventional lexical semantic resources such as WordNet or linguistically annotated corpora in different NLP tasks. Another research direction is to utilize NLP techniques to enhance the collaboration process and its outcome. Overall, the emergence of CCLRs has generated new challenges to the research field that are to be addressed in the present book. As the research field of CCLRs matures, it has become necessary to summarize a set of results to advance and focus the further research effort.
DIPF-Departments: Informationszentrum Bildung

Multilingual knowledge in aligned Wiktionary and OmegaWiki for translation applications
Matuschek, Michael; Meyer, Christian M.; Gurevych, Iryna
Journal Article | In: Translation: Corpora, Computation, Cognition (TC3) | 2013 | 33808
Endnote:
Author(s): Matuschek, Michael; Meyer, Christian M.; Gurevych, Iryna
Title: Multilingual knowledge in aligned Wiktionary and OmegaWiki for translation applications
In: Translation: Corpora, Computation, Cognition (TC3), 3 (2013) 1, pp. 87-118
URL: https://www.blogs.uni-mainz.de/fb06-tc3/files/2015/11/20-149-1-PB.pdf
Publication Type: 3a. Articles in peer-reviewed journals; article (no special category)
Language: English
Keywords: Automation; Computational linguistics; Computer-assisted method; Multilingualism; Source; Semantics; Translation; World Wide Web 2.0; Vocabulary
Abstract: Multilingual lexical-semantic resources play an important role in translation applications. However, multilingual resources with sufficient quality and coverage are rare as the effort of manually constructing such a resource is substantial. In recent years, the emergence of Web 2.0 has opened new possibilities for constructing large-scale lexical-semantic resources. We identified Wiktionary and OmegaWiki as two important multilingual initiatives where a community of users ("crowd") collaboratively edits and refines the lexical information. They seem especially appropriate in the multilingual domain as users from all languages and cultures can easily contribute. However, despite their advantages such as open access and coverage of multiple languages, these resources have hardly been systematically investigated and utilized until now. Therefore, the goals of our contribution are threefold: (1) We analyze how these resources emerged and characterize their content and structure; (2) We propose an alignment at the word sense level to exploit the complementary information contained in both resources for increased coverage; (3) We describe a mapping of the resources to a standardized, unified model (UBY-LMF) thus creating a large freely available multilingual resource designed for easy integration into applications such as machine translation or computer-aided translation environments.
DIPF-Departments: Informationszentrum Bildung

Bringing order to digital libraries. From keyphrase extraction to index term assignment
Erbs, Nicolai; Gurevych, Iryna; Rittberger, Marc
Journal Article | In: D-lib magazine | 2013 | 33962
Endnote:
Author(s): Erbs, Nicolai; Gurevych, Iryna; Rittberger, Marc
Title: Bringing order to digital libraries. From keyphrase extraction to index term assignment
In: D-lib magazine, 19 (2013) 9
DOI: 10.1045/september2013-erbs
URL: http://www.dlib.org/dlib/september13/erbs/09erbs.html
Publication Type: 3b. Articles in other journals; research-oriented
Language: English
Keywords: Automation; Education; Computational linguistics; Document; Electronic library; Indexing; Classification; Full text
Abstract: Collections of topically related documents held by digital libraries are valuable resources for users; however, as collections grow, it becomes more difficult to search them for specific information. Structure needs to be introduced to facilitate searching. Assigning index terms is helpful, but it is a tedious task even for professional indexers, requiring knowledge about the collection in general, and the document in particular. Automatic index term assignment (ITA) is considered to be a great improvement. In this paper we present a hybrid approach to index term assignment, using a combination of keyphrase extraction and multi-label classification. Keyphrase extraction efficiently assigns infrequently used index terms, while multi-label classification assigns frequently used index terms. We compare results to other state-of-the-art approaches for related tasks. The assigned index terms allow for a clustering of the document collection. Using hybrid and individual approaches, we evaluate a dataset consisting of German educational documents that was created by professional indexers, and is the first one with German data that allows estimating performance of ITA on languages other than English.
DIPF-Departments: Informationszentrum Bildung

Automatically classifying edit categories in Wikipedia revisions
Daxenberger, Johannes; Gurevych, Iryna
Book Chapter | In: Yarowsky, David; Baldwin, Timothy; Korhonen, Anna; Livescu, Karen; Bethard, Steven (eds.): Conference on Empirical Methods in Natural Language Processing (EMNLP 2013) | Stroudsburg, PA: Association for Computational Linguistics | 2013 | 34053
Endnote:
Author(s): Daxenberger, Johannes; Gurevych, Iryna
Title: Automatically classifying edit categories in Wikipedia revisions
In: Yarowsky, David; Baldwin, Timothy; Korhonen, Anna; Livescu, Karen; Bethard, Steven (eds.): Conference on Empirical Methods in Natural Language Processing (EMNLP 2013), Stroudsburg, PA: Association for Computational Linguistics, 2013, pp. 578-589
URL: https://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2013/EMNLP2013_DaxenbergerGurevych.pdf
Publication Type: 4. Contributions to edited volumes; conference volume/conference paper/proceedings
Language: English
Keywords: Automation; Computational linguistics; Computer-assisted method; Evaluation; Correction; Reference work; Quality; Taxonomy; Text analysis; World Wide Web 2.0
Abstract: In this paper, we analyze a novel set of features for the task of automatic edit category classification. Edit category classification assigns categories such as spelling error correction, paraphrase or vandalism to edits in a document. Our features are based on differences between two versions of a document including meta data, textual and language properties and markup. In a supervised machine learning experiment, we achieve a micro-averaged F1 score of .62 on a corpus of edits from the English Wikipedia. In this corpus, each edit has been multi-labeled according to a 21-category taxonomy. A model trained on the same data achieves state-of-the-art performance on the related task of fluency edit classification. We apply pattern mining to automatically labeled edits in the revision histories of different Wikipedia articles. Our results suggest that high-quality articles show a higher degree of homogeneity with respect to their collaboration patterns as compared to random articles.
DIPF-Departments: Informationszentrum Bildung

Automatically assigning research methods to journal articles in the domain of social sciences
Eckle-Kohler, Judith; Nghiem, Tri-Duc; Gurevych, Iryna
Book Chapter | In: Grove, Andrew (ed.): Proceedings of the 76th Annual Meeting of the Association for Information Science and Technology | Silver Spring, MD, USA: Association for Information Science and Technology | 2013 | 34042
Endnote:
Author(s): Eckle-Kohler, Judith; Nghiem, Tri-Duc; Gurevych, Iryna
Title: Automatically assigning research methods to journal articles in the domain of social sciences
In: Grove, Andrew (ed.): Proceedings of the 76th Annual Meeting of the Association for Information Science and Technology, Silver Spring, MD, USA: Association for Information Science and Technology, 2013, pp. 1-8
URL: http://www.asis.org/asist2013/proceedings/submissions/papers/45paper.pdf
Publication Type: 4. Contributions to edited volumes; conference volume/conference paper/proceedings
Language: English
Keywords: Automation; Computational linguistics; Computer-assisted method; Indexing; Metadata; Method; Social sciences
Abstract: We investigate the automatic assignment of research methods to journal articles from the domain of Social Sciences. We employ Computer Science and Computational Linguistics methodology to perform this automatic assignment of metadata. The multi-label classification system we present uses only abstracts and titles of journal articles as input. Our best system is able to assign the important research methods empirical and quantitative empirical with F-scores of 0.67 and 0.68. These research methods are in the focus of many recent manual analyses of publications databases. Our classification approach could be applied to automatically analyze large publications databases and databases of bibliographic references according to the use of empirical and quantitative empirical methods.
DIPF-Departments: Informationszentrum Bildung

UBY - A large-scale lexical-semantic resource [Abstract]
Gurevych, Iryna; Eckle-Kohler, Judith; Hartmann, Silvana; Matuschek, Michael; Meyer, Christian M.; […]
Book Chapter | In: Theune, M.; Nijholt, A. (eds.): Book of abstracts of the 23rd Meeting of Computational Linguistics in the Netherlands (CLIN 2013) | Enschede: Universiteit Twente | 2013 | 33317
Endnote:
Author(s): Gurevych, Iryna; Eckle-Kohler, Judith; Hartmann, Silvana; Matuschek, Michael; Meyer, Christian M.; Nghiem, Tri-Duc
Title: UBY - A large-scale lexical-semantic resource [Abstract]
In: Theune, M.; Nijholt, A. (eds.): Book of abstracts of the 23rd Meeting of Computational Linguistics in the Netherlands (CLIN 2013), Enschede: Universiteit Twente, 2013, p. 81
URL: http://hmi.ewi.utwente.nl/clin2013-dir/bookofabstracts.pdf
Publication Type: 4. Contributions to edited volumes; conference volume/conference paper/proceedings
Language: English
Keywords: Automation; Computational linguistics; Computer-assisted method; German; English; Lexicon
Abstract: We present UBY, a large-scale lexical-semantic resource combining a wide range of information from expert-constructed and collaboratively created resources for English and German. It currently contains nine resources in two languages: English WordNet, Wiktionary, Wikipedia, FrameNet and VerbNet, German Wikipedia, Wiktionary, and GermaNet, and the multilingual OmegaWiki. The main contributions of our work can be summarised as follows. First, we define a standardised format for modelling the heterogeneous information coming from the various lexical-semantic resources (LSRs) and languages included in UBY. For this purpose, we employ the ISO standard Lexical Markup Framework and Data Categories selected from ISOCat. In this way, all types of information provided by the LSRs in UBY are easily accessible on a fine-grained level. Further, this standardised format facilitates the extension of UBY with new languages and resources. This is different from previous efforts in combining LSRs which usually targeted particular applications and thus focused on aligning specific types of information only. Second, UBY contains nine pairwise sense alignments between resources. Through these alignments, we provide access to the complementary information for a word sense in different resources. For example, if one looks up a particular verb sense in UBY, one has simultaneous access to the sense in WordNet and to the corresponding sense in FrameNet. Third, UBY is freely available and we have developed an easy-to-use Java API which provides unified access to all types of information contained in UBY. This facilitates the utilization of UBY for a variety of NLP tasks.
DIPF-Departments: Informationszentrum Bildung

UKP at CrossLink2. CJK-to-English Subtasks
Kim, Jungi; Gurevych, Iryna
Book Chapter | In: Kando, Noriko; Kishida, Kazuaki (eds.): Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies | Tokyo: NTCIR | 2013 | 34044
Endnote:
Author(s): Kim, Jungi; Gurevych, Iryna
Title: UKP at CrossLink2. CJK-to-English Subtasks
In: Kando, Noriko; Kishida, Kazuaki (eds.): Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, Tokyo: NTCIR, 2013, pp. 57-61
URL: http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings10/pdf/NTCIR/CrossLink-2/05-NTCIR10-CROSSLINK2-KimJ.pdf
Publication Type: 4. Contributions to edited volumes; conference volume/conference paper/proceedings
Language: English
Keywords: Automation; Computational linguistics; Computer-assisted method; Information retrieval; Multilingualism; Reference work; Online; Language analysis
Abstract: This paper describes UKP's participation in the cross-lingual link discovery task at NTCIR-10 (CrossLink2). The task addressed in our work is to find valid anchor texts from a Chinese, Japanese, and Korean (CJK) Wikipedia page and retrieve the corresponding target Wiki pages in the English language. The CrossLink framework was developed based on our previous CrossLink system, which works in the opposite direction of the language pairs, i.e. it discovers anchor texts from English Wikipedia pages and their corresponding targets in CJK languages. The framework consists of anchor selection, anchor ranking, anchor translation, and target discovery sub-modules. Each sub-module in the framework has been shown to work well both in monolingual settings and for English-to-CJK language pairs. We seek to find out whether the approach that worked very well for English to CJK would still work for CJK to English. We use the same experimental settings that were used in our previous participation, and our experimental runs show that the CJK-to-English CrossLink task is a much harder task when using the same resources as the English-to-CJK one.
DIPF-Departments: Informationszentrum Bildung