Linguistic corpus research software at the Leibniz-Institute for the German Language (IDS)
DOI:
https://doi.org/10.14279/eceasst.v85.2692
Keywords:
Corpus Linguistics, Language Resources, User Interface Design, Legacy Software
Abstract
Empirical linguistic research requires access to richly annotated and metadata-enhanced language corpora. This paper presents the ongoing development of corpus search and analysis platforms at the Leibniz-Institute for the German Language (IDS), which provide access to DeReKo, the world’s largest collection of contemporary written German corpora, and the Archive for Spoken German (AGD), among others. We describe our platforms, with a particular focus on improving, extending, and evaluating their user interfaces. The challenges addressed include legal constraints, the handling of large and heterogeneous datasets, reproducibility, and, above all, accessibility and usability standards for a diverse scientific audience from the humanities. This work contributes to the broader effort of advancing research infrastructure in linguistics and offers insights into sustainable and user-friendly corpus technology design.
License
Copyright (c) 2025 Nils Diewald, Franck Bodmer, Peter M. Fischer, Elena Frick, Marc Kupietz, Mark-Christoph Müller, Helge Stallkamp, Uyen-Nhu Tran

This work is licensed under a Creative Commons Attribution 4.0 International License.
