Corpus of Early Modern Icelandic

The project uses Icelandic language models in Transkribus, which were developed with support from the Infrastructure Fund.

Preparations for the project took place in the spring of 2024, and work on the database began in September of the same year. The project is expected to be completed by the spring of 2025, at which point the database will be made publicly accessible.

The project has three main objectives:

  1. To digitise and make accessible a vast collection of manuscripts, documents, and printed materials from the period 1540–1850 through optical character recognition (OCR).
  2. To linguistically annotate this extensive body of text and use it as the foundation for a planned historical language corpus.
  3. To use the linguistic annotation carried out during corpus development to improve access to a significant part of Iceland’s cultural heritage that has so far remained largely inaccessible.

This project was supported by the Infrastructure Fund.
