Backend repository to store and index corpora with metadata and versions.


  • Store texts of a corpus in a uniform domain model:
    • Keep track of file versions
    • Link all file types to the same source document
    • Store metadata about documents, files and versions
  • Use a REST API to create, read, update and delete texts
  • Search files using stock and custom Elasticsearch indices
  • Explore the API with Concordion and Swagger


Schematic overview of the Text Repository: a) search in corpus; b+c) manage file types, versions and metadata of corpus; d+e) build custom indices that are automatically kept in sync with corpus; f+g) store all data in a unified database model.


Prerequisites: docker-compose.
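
Before starting, it can help to confirm that the required command-line tools are on your PATH. The following is a minimal sketch, not part of the Text Repository itself: the helper name `missing_tools` is hypothetical, and the tool list (git, curl, docker-compose) is an assumption taken from the prerequisite above and the commands used in this README.

```shell
#!/bin/sh
# Print the name of every tool passed as an argument that is not on PATH.
# `missing_tools` is a hypothetical helper, not shipped with the Text Repository.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed tool list: docker-compose is the stated prerequisite;
# git and curl are used by the setup commands in this README.
missing=$(missing_tools git curl docker-compose)

if [ -n "$missing" ]; then
  # Left unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```

Note that recent Docker installations may provide Compose as `docker compose` (a plugin) rather than as a separate `docker-compose` binary, in which case this check reports `docker-compose` as missing even though Compose is available.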

To run the Text Repository locally, run the following commands in a new, empty directory (the clone targets the current directory, so it must be empty):

git clone https://github.com/knaw-huc/textrepo .
cd examples/production
curl -o scripts/wait-for-it.sh https://raw.githubusercontent.com/vishnubob/wait-for-it/master/wait-for-it.sh
chmod +x scripts/wait-for-it.sh

Read more on basic usage


