Stefan Weil edited this page Oct 19, 2021 · 19 revisions

eScriptorium for UB Mannheim

UB Mannheim tests eScriptorium in its digitisation and OCR workflow. The test installation is available online.

Installation

A test installation was set up on the servers ocr-01 and ub-blade-10.

Preconditions

  • tested with Debian bullseye

Podman

Preconditions

  • using Podman 3.3.1 from Debian bookworm instead of Docker
  • sufficient free disk space for /var/lib/containers
  • sufficient free disk space for /var/tmp (19 GiB is not enough)
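The disk space preconditions can be checked before starting a build. The following is a minimal sketch, not a tested part of this installation: the helper name free_gib is made up for illustration, and applying the 19 GiB observation to both directories is an assumption (the source only reports it for /var/tmp, and the actual minimum is not known).

```shell
# Print the free space, in whole GiB, of the filesystem that holds a path.
free_gib() {
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# 19 GiB in /var/tmp was observed to be too little; the real minimum is unknown.
KNOWN_TOO_SMALL_GIB=19

for dir in /var/lib/containers /var/tmp; do
    if [ ! -d "$dir" ]; then
        echo "$dir: does not exist (yet)"
        continue
    fi
    free=$(free_gib "$dir")
    if [ "$free" -le "$KNOWN_TOO_SMALL_GIB" ]; then
        echo "$dir: only ${free} GiB free, likely not enough"
    else
        echo "$dir: ${free} GiB free"
    fi
done
```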

Running with docker-compose (root)

This did not work and was not investigated further.

sudo apt install podman
sudo systemctl start podman
python3 -m venv ~/venv
source ~/venv/bin/activate
pip install docker-compose

docker-compose up -d --build

Running with podman-compose (no root)

This appears to work. It originally required a patch for podman-compose; that patch has since been accepted upstream, so it is no longer necessary to apply it locally.

podman-compose does not pull unqualified container images from docker.io by default, but the docker-compose.yml for eScriptorium contains several such entries. Therefore either change these entries to fully qualified ones or add the line unqualified-search-registries = ['docker.io'] to /etc/containers/registries.conf.
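To find the entries that need attention, a small helper can list the image references in a compose file that do not start with a registry host. This is only a sketch: the function name list_unqualified_images is hypothetical, and the heuristic (a first path component containing a dot or a colon, or equal to localhost, counts as a registry) is an assumption that may not cover every naming scheme.

```shell
# List image references in a compose file that lack a registry host.
# $1: path to the compose file
list_unqualified_images() {
    sed -n 's/^[[:space:]]*image:[[:space:]]*//p' "$1" | tr -d "\"'" |
    while read -r image; do
        first=${image%%/*}
        case "$image" in
            */*)
                case "$first" in
                    # First component looks like a registry host: qualified.
                    *.*|*:*|localhost) continue ;;
                esac
                ;;
        esac
        echo "$image"
    done
}

# Only run the check if a compose file is present in the current directory.
if [ -f docker-compose.yml ]; then
    list_unqualified_images docker-compose.yml
fi
```

Each name printed this way either needs a docker.io/ prefix (docker.io/library/ for official images) or relies on the unqualified-search-registries setting.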

python3 -m venv ~/venv
source ~/venv/bin/activate

# The stable podman-compose from PyPI fails.
# See https://github.com/containers/podman-compose/issues/235.
pip install podman-compose
podman-compose up -d --build

# The suggested newer version of podman-compose works,
# but requires a recent version of podman (>= 3.3.0).
pip install https://github.com/containers/podman-compose/archive/devel.tar.gz
podman-compose up -d --build
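Because the development version of podman-compose needs Podman 3.3.0 or newer, it can help to check the installed version first. The sketch below assumes GNU sort (for the -V option) and that podman --version prints the version number as its third word; the helper name version_ge is made up for illustration.

```shell
# Succeed if version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

REQUIRED=3.3.0

if command -v podman >/dev/null 2>&1; then
    # "podman --version" prints a line like "podman version 3.3.1".
    installed=$(podman --version | awk '{ print $3 }')
    if version_ge "$installed" "$REQUIRED"; then
        echo "podman $installed is recent enough"
    else
        echo "podman $installed is older than $REQUIRED" >&2
    fi
else
    echo "podman is not installed" >&2
fi
```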

Open issues

The installation with Podman works, but it was not possible to run it behind a web proxy under a non-root URL path.

Full installation

This is the current installation, which runs at https://ocr-bw.bib.uni-mannheim.de/escriptorium/.

It is based on the official instructions for a full installation.

Preconditions

The Python modules used by eScriptorium require Python 3.7, which is not provided by Debian bullseye. Therefore it is necessary to build your own Python 3.7 and use that for the installation.
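One possible way to build Python 3.7 from source is sketched below. It is not the procedure documented for this installation: the patch release 3.7.12, the install prefix, the list of development packages and the function name build_python37 are all assumptions chosen for illustration.

```shell
# Assumptions: the exact 3.7 release and the install prefix are illustrative.
PY_VERSION=3.7.12
PY_PREFIX="$HOME/.local"
PY_URL="https://www.python.org/ftp/python/${PY_VERSION}/Python-${PY_VERSION}.tgz"

# Download, build and install Python 3.7 below $PY_PREFIX.
# The development packages enable common modules such as ssl, zlib and sqlite3.
# "make altinstall" installs python3.7 without replacing the system python3.
build_python37() {
    sudo apt install build-essential wget libssl-dev zlib1g-dev libffi-dev \
        libbz2-dev libreadline-dev libsqlite3-dev liblzma-dev &&
    wget "$PY_URL" &&
    tar xzf "Python-${PY_VERSION}.tgz" &&
    (
        cd "Python-${PY_VERSION}" &&
        ./configure --prefix="$PY_PREFIX" &&
        make -j"$(nproc)" &&
        make altinstall
    )
}

# Run the build explicitly when ready:
# build_python37 && "$PY_PREFIX/bin/python3.7" --version
```

With this prefix, python3.7 is only found by the later venv command if $HOME/.local/bin is on the PATH.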

Installation

git clone git@gitlab.inria.fr:scripta/escriptorium.git
cd escriptorium

python3.7 -m venv venv3.7
source venv3.7/bin/activate
pip install --upgrade pip setuptools
pip install -r app/requirements.txt
pip install -r app/requirements-dev.txt

export DJANGO_SETTINGS_MODULE=escriptorium.local_settings

Open issues

  • Sending e-mails with the current full installation does not work (no e-mail is sent).
  • Running a full installation with Apache2 and WSGI does not work because Debian bullseye provides a libapache2-mod-wsgi-py3 based on Python 3.9 instead of the required 3.7.

Closed issues

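The following error was caused by a column line_offset that should not have existed in the database table core_document. Removing that column fixed the issue. The message comes from a PostgreSQL server with a German locale; in English it reads: ERROR: null value in column "line_offset" of relation "core_document" violates not-null constraint, DETAIL: Failing row contains (...).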

psycopg2.IntegrityError: FEHLER:  NULL-Wert in Spalte »line_offset« von Relation »core_document« verletzt Not-Null-Constraint
DETAIL:  Fehlgeschlagene Zeile enthält (1, Max Mustermann, 0, 2021-10-19 11:15:05.445587+02, 2021-10-19 11:15:05.445608+02, 1, null, ltr, 86, 2, null).
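Removing the column can be done with a single SQL statement. The sketch below only prints that statement; the helper name drop_column_sql and the database name escriptorium in the commented psql call are assumptions, and the table name core_document is taken from the error message. Back up the database before applying such a change.

```shell
# Print an SQL statement that drops column $2 from table $1.
drop_column_sql() {
    printf 'ALTER TABLE %s DROP COLUMN %s;\n' "$1" "$2"
}

drop_column_sql core_document line_offset

# To apply it, pipe the statement into psql (the database name is an assumption):
# drop_column_sql core_document line_offset | psql escriptorium
```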

Links