Search results in the DIPF database of publications

Your query:

(Keywords: "Algorithm")

Dijkstra-WSA: A graph-based approach to word sense alignment
Journal Article | In: Transactions of the Association for Computational Linguistics (TACL) | 2013 | 33524
Author(s): Matuschek, Michael; Gurevych, Iryna
Title: Dijkstra-WSA: A graph-based approach to word sense alignment
In: Transactions of the Association for Computational Linguistics (TACL), 1 (2013), pp. 151-164
URL: http://www.transacl.org/wp-content/uploads/2013/05/paper151.pdf
Publication Type: 3a. Articles in peer-reviewed journals; Article (no special category)
Language: English
Keywords: Algorithm; Computational linguistics; Computer-assisted method; Evaluation; Method; Semantics; Text analysis
Abstract (English): In this paper, we present Dijkstra-WSA, a novel graph-based algorithm for word sense alignment. We evaluate it on four different pairs of lexical-semantic resources with different characteristics (WordNet-OmegaWiki, WordNet-Wiktionary, GermaNet-Wiktionary and WordNet-Wikipedia) and show that it achieves competitive performance on 3 out of 4 datasets. Dijkstra-WSA outperforms the state of the art on every dataset if it is combined with a back-off based on gloss similarity. We also demonstrate that Dijkstra-WSA is not only flexibly applicable to different resources but also highly parameterizable to optimize for precision or recall.
DIPF-Departments: Informationszentrum Bildung

Hierarchy identification for automatically generating table-of-contents
Book Chapter | In: Galia Angelova, Kalina Bontcheva, Ruslan Mitkov (eds.): Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP) 2013 | Shoumen; Bulgaria: RANLP | 2013 | 33860
Author(s): Erbs, Nicolai; Gurevych, Iryna; Zesch, Torsten
Title: Hierarchy identification for automatically generating table-of-contents
In: Galia Angelova, Kalina Bontcheva, Ruslan Mitkov (eds.): Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP) 2013, Shoumen; Bulgaria: RANLP, 2013, pp. 252-260
URL: http://lml.bas.bg/ranlp2013/docs/RANLP_main.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Algorithm; Analysis; Content; Content analysis; Structure; Text
Abstract (English): A table-of-contents (TOC) provides a quick reference to a document's content and structure. We present the first study on identifying the hierarchical structure for automatically generating a TOC using only textual features instead of structural hints, e.g. from HTML tags. We create two new datasets to evaluate our approaches for hierarchy identification. We find that our algorithm performs on a level that is sufficient for a fully automated system. For documents without given segment titles, we extend our work by automatically generating segment titles. We make the datasets and our experimental framework publicly available in order to foster future research in TOC generation.
DIPF-Departments: Informationszentrum Bildung

The impact of topic bias on quality flaw prediction in Wikipedia
Book Chapter | In: Association for Computational Linguistics (ed.): 51st Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference System Demonstrations | Stroudsburg; PA: Association for Computational Linguistics | 2013 | 33527
Author(s): Ferschke, Oliver; Gurevych, Iryna; Rittberger, Marc
Title: The impact of topic bias on quality flaw prediction in Wikipedia
In: Association for Computational Linguistics (ed.): 51st Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference System Demonstrations, Stroudsburg; PA: Association for Computational Linguistics, 2013, pp. 721-730
URN: urn:nbn:de:0111-dipfdocs-184570
URL: http://www.dipfdocs.de/volltexte/2020/18457/pdf/The_impact_of_topic_bias_on_quality_flaw_prediction_in_Wikipedia_A.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Algorithm; Computer-assisted method; Evaluation; Reference work; Online; Quality; Quality assurance; Reliability; Social software; Standard; World wide web 2.0
Abstract: With the increasing amount of user generated reference texts on the web, automatic quality assessment has become a key challenge. However, only a small amount of annotated data is available for training quality assessment systems. Wikipedia contains a large number of texts annotated with cleanup templates which identify quality flaws. We show that the distribution of these labels is topically biased, since they cannot be applied freely to any arbitrary article. We argue that it is necessary to consider the topical restrictions of each label in order to avoid a sampling bias that results in a skewed classifier and overly optimistic evaluation results. We factor out the topic bias by extracting reliable training instances from the revision history which have a topic distribution similar to the labeled articles. This approach better reflects the situation a classifier would face in a real-life application.
DIPF-Departments: Informationszentrum Bildung

Reporting differentiated literacy results in PISA by using multidimensional adaptive testing
Book Chapter | In: Prenzel, Manfred; Kobarg, Mareike; Schöps, Katrin; Rönnebeck, Silke (eds.): Research on PISA: Research outcomes of the PISA Research Conference 2009 | Dordrecht: Springer | 2013 | 33709
Author(s): Frey, Andreas; Kröhne, Ulf
Title: Reporting differentiated literacy results in PISA by using multidimensional adaptive testing
In: Prenzel, Manfred; Kobarg, Mareike; Schöps, Katrin; Rönnebeck, Silke (eds.): Research on PISA: Research outcomes of the PISA Research Conference 2009, Dordrecht: Springer, 2013, pp. 103-120
Publication Type: 4. Contributions to edited volumes; Edited volume (no special category)
Language: English
Keywords: Adaptive testing; Algorithm; Germany; Item analysis; Item response theory; Longitudinal study; Achievement measurement; Reading competence; Mathematical competence; Scientific competence; PISA <Programme for International Student Assessment>; Reliability; Student achievement; Technology-based testing; Test item; Test scoring; Test construction; Test theory
Abstract: Multidimensional adaptive testing (MAT) allows for substantial increases in measurement efficiency. It was examined whether this capability can be used to report reliable results for all 10 subdimensions of students' literacy in reading, mathematics and science considered in PISA. The responses of N=14,624 students who participated in the PISA assessments of the years 2000, 2003 and 2006 in Germany were used to simulate unrestricted MAT, MAT with the multidimensional maximum priority index method (MMPI), and MAT with MMPI taking typical restrictions of the PISA assessments (treatment of link items, treatment of open items, grouping of items to units) into account. For MAT with MMPI the reliability coefficients for all subdimensions were larger than .80, as opposed to sequential testing based on the booklet design of PISA 2006. These advantages slightly lessened with the incorporation of PISA-typical restrictions. The findings demonstrate that MAT with MMPI can successfully be used for subdimensional reporting in PISA.
DIPF-Departments: Bildungsqualität und Evaluation

Finding similar movements in positional data streams
Book Chapter | In: ECML/PKDD (ed.): Proceedings of the ECML/PKDD Workshop on Machine Learning and Data Mining for Sports Analytics (ECML/PKDD 2013) | Prague: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases | 2013 | 34024
Author(s): Haase, Jens; Brefeld, Ulf
Title: Finding similar movements in positional data streams
In: ECML/PKDD (ed.): Proceedings of the ECML/PKDD Workshop on Machine Learning and Data Mining for Sports Analytics (ECML/PKDD 2013), Prague: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2013, pp. 1-9
URL: http://www.kma.informatik.tu-darmstadt.de/fileadmin/user_upload/Group_KMA/kma_publications/paper_01.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Algorithm; Computer game; Computer-assisted method; Data; Data analysis; Evaluation; Computer science
Abstract: In this paper, we study the problem of efficiently finding similar movements in positional data streams, given a query trajectory. Our approach is based on a translation-, rotation-, and scale-invariant representation of movements. Near-neighbours given a query trajectory are then efficiently computed using dynamic time warping and locality sensitive hashing. Empirically, we show the efficiency and accuracy of our approach on positional data streams recorded from a real soccer game.
DIPF-Departments: Informationszentrum Bildung
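
Illustrative note on the record above: the abstract names dynamic time warping (DTW) as one building block for comparing trajectories. The following is only a minimal sketch of textbook DTW on raw 2-D positions. It is not the authors' implementation. It leaves out the invariant movement representation and the locality sensitive hashing step, and the function name `dtw_distance` is a hypothetical choice for this example.

```python
from math import hypot, inf


def dtw_distance(a, b):
    """Textbook dynamic time warping between two sequences of (x, y) points.

    Assumption for illustration: the local cost is the Euclidean distance
    between raw positions (no translation/rotation/scale normalisation).
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because DTW may align one point with several points of the other sequence, a trajectory and a slower copy of it (with a repeated position) get distance zero, which is the property that makes it useful for comparing movements of different speed.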

Uncertainty detection for natural language watermarking
Book Chapter | In: Mitkov, Ruslan; Park, Jong C. (eds.): Proceedings of the Sixth International Joint Conference on Natural Language Processing (IJCNLP 2013) | Nagoya: Asian Federation of Natural Language Processing | 2013 | 34038
Author(s): Szarvas, György; Gurevych, Iryna
Title: Uncertainty detection for natural language watermarking
In: Mitkov, Ruslan; Park, Jong C. (eds.): Proceedings of the Sixth International Joint Conference on Natural Language Processing (IJCNLP 2013), Nagoya: Asian Federation of Natural Language Processing, 2013, pp. 1188-1194
URL: https://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2013/IJCNLP_2013_Szarvas.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Algorithm; Computational linguistics; Data; Information; Synonym; Text; Change; Word
Abstract: In this paper we investigate the application of uncertainty detection to text watermarking, a problem where the aim is to produce individually identifiable copies of a source text via small manipulations to the text (e.g. synonym substitutions). As previous attempts showed, accurate paraphrasing is challenging in an open vocabulary setting, so we propose the use of the closed word class of uncertainty cues. We demonstrate that these words are promising for text watermarking as they can be accurately disambiguated (from the noncue uses of the same words) and their substitution with other cues has marginal impact on the meaning of the text.
DIPF-Departments: Informationszentrum Bildung

Learning shortest paths for word graphs
Book Chapter | In: Atzmueller, Martin; Scholz, Christoph (eds.): The Fourth International Workshop on Mining Ubiquitous and Social Environments: MUSE' 13. September 23, 2013 (ECML/PKDD 2013) | Prague: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases | 2013 | 34023
Author(s): Tzouridis, Emmanouil; Brefeld, Ulf
Title: Learning shortest paths for word graphs
In: Atzmueller, Martin; Scholz, Christoph (eds.): The Fourth International Workshop on Mining Ubiquitous and Social Environments: MUSE' 13. September 23, 2013 (ECML/PKDD 2013), Prague: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2013, pp. 45-57
URL: http://www.kde.cs.uni-kassel.de/ws/muse2013/proceedings.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Algorithm; Graphical representation; Computer science; Learning; Sentence; Structure; Word
Abstract: The vast amount of information on the Web drives the need for aggregation and summarisation techniques. We study event extraction as a text summarisation task using redundant sentences, which is also known as sentence compression. Given a set of sentences describing the same event, we aim at generating a summarisation that is (i) a single sentence, (ii) simply structured and easily understandable, and (iii) minimal in terms of the number of words/tokens. Existing approaches for sentence compression are often based on finding the shortest path in word graphs that are spanned by related input sentences. These approaches, however, deploy manually crafted heuristics for edge weights and lack theoretical justification. In this paper, we cast sentence compression as a structured prediction problem. Edges of the compression graph are represented by features drawn from adjacent nodes so that corresponding weights are learned by a generalised linear model. Decoding is performed in polynomial time by a generalised shortest path algorithm using loss augmented inference. We report on preliminary results on artificial and real world data.
DIPF-Departments: Informationszentrum Bildung

Learning shortest paths in word graphs
Book Chapter | In: Henrich, Andreas; Sperker, Hans-Christian (eds.): LWA 2013: Lernen, Wissen & Adaptivität. Workshop Proceedings, Bamberg, 7.-9. Oktober 2013 | Bamberg: KDML | 2013 | 34026
Author(s): Tzouridis, Emmanouil; Brefeld, Ulf
Title: Learning shortest paths in word graphs
In: Henrich, Andreas; Sperker, Hans-Christian (eds.): LWA 2013: Lernen, Wissen & Adaptivität. Workshop Proceedings, Bamberg, 7.-9. Oktober 2013, Bamberg: KDML, 2013, pp. 113-116
URL: http://www.minf.uni-bamberg.de/lwa2013/proceedings/proceedings_lwa1013.pdf
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Algorithm; Graphical representation; Computer science; Learning; Mapping; Sentence; Structure; Word
Abstract: In this paper we briefly sketch our work on text summarisation using compression graphs. The task is described as follows: Given a set of related sentences describing the same event, we aim at generating a single sentence that is simply structured, easily understandable, and minimal in terms of the number of words/tokens. Traditionally, sentence compression deals with finding the shortest path in word graphs in an unsupervised setting. The major drawback of this approach is the use of manually crafted heuristics for edge weights. By contrast, we cast sentence compression as a structured prediction problem. Edges of the compression graph are represented by features drawn from adjacent nodes so that corresponding weights are learned by a generalised linear model. Decoding is performed in polynomial time by a generalised shortest path algorithm using loss augmented inference. We report on preliminary results on artificial and real world data.
DIPF-Departments: Informationszentrum Bildung
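
Illustrative note on the two word-graph records above: both abstracts describe the traditional unsupervised baseline as finding a shortest path in a word graph whose edge weights come from a hand-crafted heuristic. The sketch below shows that baseline only, not the learned model the authors propose. The inverse-frequency edge weight, the whitespace tokenisation, the merging of nodes by lowercased word form, and the names `build_word_graph` and `compress` are all assumptions made for this example.

```python
import heapq
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical sentence-boundary markers


def build_word_graph(sentences):
    """Merge sentences into one directed graph; nodes are lowercased tokens."""
    counts = defaultdict(int)
    for sentence in sentences:
        tokens = [START] + sentence.lower().split() + [END]
        for u, v in zip(tokens, tokens[1:]):
            counts[(u, v)] += 1
    graph = defaultdict(dict)
    for (u, v), c in counts.items():
        # Assumed heuristic: transitions seen in many sentences are cheap.
        graph[u][v] = 1.0 / c
    return graph


def compress(sentences):
    """Return the tokens on the cheapest START-to-END path (Dijkstra)."""
    graph = build_word_graph(sentences)
    dist = {START: 0.0}
    prev = {}
    heap = [(0.0, START)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == END:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if END not in dist:
        return None
    path = [END]
    while path[-1] != START:
        path.append(prev[path[-1]])
    return path[::-1][1:-1]  # drop the boundary markers
```

Calling `compress` on a few sentences that share a common core returns the token sequence most strongly supported by all of them. Replacing the `1.0 / c` heuristic with weights predicted from edge features is the step the abstracts describe as learning.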

Insights from classifying visual concepts with Multiple Kernel Learning
Journal Article | In: PLoS ONE | 2012 | 33575
Author(s): Binder, Alexander; Nakajima, Shinichi; Kloft, Marius; Müller, Christina; Samek, Wojciech; Brefeld, Ulf; Müller, Klaus-Robert; Kawanabe, Motoaki
Title: Insights from classifying visual concepts with Multiple Kernel Learning
In: PLoS ONE, 7 (2012) 8, p. e38897
DOI: 10.1371/journal.pone.0038897
URL: http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0038897
Publication Type: 3a. Articles in peer-reviewed journals; Article (no special category)
Language: English
Keywords: Algorithm; Image; Computer; Data; Experimental study; Classification; Learning; Pattern recognition; Object
Abstract: Combining information from various image features has become a standard technique in concept recognition tasks. However, the optimal way of fusing the resulting kernel functions is usually unknown in practical applications. Multiple kernel learning (MKL) techniques allow one to determine an optimal linear combination of such similarity matrices. Classical approaches to MKL promote sparse mixtures. Unfortunately, 1-norm regularized MKL variants are often observed to be outperformed by an unweighted sum kernel. The main contributions of this paper are the following: we apply a recently developed non-sparse MKL variant to state-of-the-art concept recognition tasks from the application domain of computer vision. We provide insights on benefits and limits of non-sparse MKL and compare it against its direct competitors, the sum-kernel SVM and sparse MKL. We report empirical results for the PASCAL VOC 2009 Classification and ImageCLEF2010 Photo Annotation challenge data sets. Data sets (kernel matrices) as well as further information are available at http://doc.ml.tu-berlin.de/image_mkl/ (accessed 2012 Jun 25).
DIPF-Departments: Informationszentrum Bildung

Discriminative clustering for market segmentation
Book Chapter | In: Association for Computational Linguistics (ACL) (ed.): Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2012 | New York: Association for Computing Machinery | 2012 | 33576
Author(s): Haider, Peter; Chiarandini, Luca; Brefeld, Ulf
Title: Discriminative clustering for market segmentation
In: Association for Computational Linguistics (ACL) (ed.): Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2012, New York: Association for Computing Machinery, 2012, pp. 417-425
DOI: 10.1145/2339530.2339600
URL: http://dl.acm.org/citation.cfm?id=2339530.2339600&coll=DL&dl=GUIDE&CFID=343017233&CFTOKEN=79756621
Publication Type: 4. Contributions to edited volumes; Conference volume/Conference paper/Proceedings
Language: English
Keywords: Algorithm; Computer-assisted method; Data analysis; Evaluation; Interaction; Internet; Log file; Market economy; User behaviour; Forecast; Search engine
Abstract: We study discriminative clustering for market segmentation tasks. The underlying problem setting resembles discriminative clustering; however, existing approaches focus on the prediction of univariate cluster labels. By contrast, market segments encode complex (future) behavior of the individuals which cannot be represented by a single variable. In this paper, we generalize discriminative clustering to structured and complex output variables that can be represented as graphical models. We devise two novel methods to jointly learn the classifier and the clustering using alternating optimization and collapsed inference, respectively. The two approaches jointly learn a discriminative segmentation of the input space and a generative output prediction model for each segment. We evaluate our methods on segmenting user navigation sequences from Yahoo! News. The proposed collapsed algorithm is observed to outperform baseline approaches such as mixture of experts. We showcase exemplary projections of the resulting segments to display the interpretability of the solutions.
DIPF-Departments: Informationszentrum Bildung