Wikiditions are wiki-based online editions of text corpora and their associated lexica. Each token of a text is lemmatized, tagged, and linked to a syntactic word in a lexicon that is itself part of the Wikidition. In addition, several similarity measures are implemented that provide links to similar texts, sentences, lemmas, and syntactic words.
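The linking of tokens to lexicon entries described above can be illustrated with a small data model. This is only a sketch under stated assumptions: the class names (SyntacticWord, Lexicon, Token), the link method, and the hand-annotated sample are hypothetical and do not reflect Wikidition's actual schema or tooling. It shows that repeated occurrences of the same word resolve to one shared lexicon entry.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SyntacticWord:
    """Hypothetical lexicon entry: a wordform with its lemma and tag."""
    lemma: str
    wordform: str
    pos: str  # part-of-speech tag


@dataclass
class Lexicon:
    """Hypothetical lexicon that stores each syntactic word exactly once."""
    entries: dict = field(default_factory=dict)

    def link(self, surface: str, lemma: str, pos: str) -> SyntacticWord:
        # Normalize case so that "Der" and "der" share one entry.
        key = (lemma, surface.lower(), pos)
        if key not in self.entries:
            self.entries[key] = SyntacticWord(lemma, surface.lower(), pos)
        return self.entries[key]


@dataclass
class Token:
    """A token of a text, linked to its lexicon entry."""
    surface: str
    entry: SyntacticWord


# Assumed, hand-annotated input: (surface form, lemma, POS tag).
annotated = [
    ("Der", "der", "ART"),
    ("Prozess", "Prozess", "NN"),
    ("der", "der", "ART"),
]

lexicon = Lexicon()
tokens = [Token(s, lexicon.link(s, lemma, pos)) for s, lemma, pos in annotated]

for token in tokens:
    print(token.surface, "->", token.entry.lemma, token.entry.pos)
```

Because both occurrences of the article point to the same entry object, following the link from either token leads to the same lexicon page, which is the kind of navigation the description above implies.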

Name               Language  Texts  Publication
Capitularies Wiki  Latin     307    [1]
Kafka Wiki         German    2      [2]
[1] A. Mehler, R. Gleim, T. vor der Brück, W. Hemati, T. Uslu, and S. Eger, “Wikidition: Automatic Lexiconization and Linkification of Text Corpora,” Information Technology, pp. 70-79, 2016.
  Abstract: We introduce a new text technology, called Wikidition, which automatically generates large scale editions of corpora of natural language texts. Wikidition combines a wide range of text mining tools for automatically linking lexical, sentential and textual units. This includes the extraction of corpus-specific lexica down to the level of syntactic words and their grammatical categories. To this end, we introduce a novel measure of text reuse and exemplify Wikidition by means of the capitularies, that is, a corpus of Medieval Latin
[2] A. Mehler, B. Wagner, and R. Gleim, “Wikidition: Towards A Multi-layer Network Model of Intertextuality,” in Proceedings of DH 2016, 12-16 July, 2016.
  Abstract: The paper presents Wikidition, a novel text mining tool for generating online editions of text corpora. It explores lexical, sentential and textual relations to span multi-layer networks (linkification) that allow for browsing syntagmatic and paradigmatic relations among the constituents of its input texts. In this way, relations of text reuse can be explored together with lexical relations within the same literary memory information system. Beyond that, Wikidition contains a module for automatic lexiconisation to extract author specific vocabularies. Based on linkification and lexiconisation, Wikidition does not only allow for traversing input corpora on different (lexical, sentential and textual) levels. Rather, its readers can also study the vocabulary of authors on several levels of resolution including superlemmas, lemmas, syntactic words and wordforms. We exemplify Wikidition by a range of literary texts and evaluate it by means of the apparatus of quantitative network analysis.