Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump rebdhuhn from 0.3.1 to 0.4.0; Make kroki-port configurable via env var; Clean Up docker image #83

Merged
merged 16 commits into from
Oct 30, 2024
Merged
3 changes: 3 additions & 0 deletions Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -8,3 +8,6 @@ RUN pip install -r requirements.txt
COPY src .

CMD ["python", "ebd_toolchain/main.py", "-i", "/container/ebd.docx", "-o", "/container/output", "-t", "json", "-t", "dot", "-t", "svg", "-t", "puml"]
# to test this image run
# $ docker build -t local-test-image .
# $ docker compose up --build --abort-on-container-exit
31 changes: 2 additions & 29 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -49,38 +49,11 @@ where `-i`, `-o` and `-t` denote the input directory path, the output directory
In this repository:
1. create an `.env` file with a structure similar to [`env.example`](env.example).
2. set the environment variables to meaningful values.
3. Create a `docker-compose.yml` with the following content:
```yaml
services:
kroki:
image: yuzutech/kroki:0.24.1
ports:
- "8125:8000" # Expose Kroki on port 8125 for rendering diagrams
# we hardcode 8125 because of https://github.com/Hochfrequenz/rebdhuhn/issues/205
healthcheck:
test: [ "CMD", "curl", "-f", "http://localhost:8000/health" ]
interval: 10s
timeout: 5s
retries: 3

scrape-and-plot:
image: ghcr.io/hochfrequenz/ebd_toolchain:latest
# If you run into 'manifest unknown' during docker pull, try replacing `:latest` with `:v1.2.3`.
# where v1.2.3 is the latest version of the GHCR image, which can be found here:
# https://github.com/Hochfrequenz/ebd_toolchain/pkgs/container/ebd_toolchain
depends_on:
kroki:
condition: service_healthy
volumes:
- ${EBD_DOCX_FILE}:/container/ebd.docx
- ${OUTPUT_DIR}:/container/output
network_mode: host
```
4. Login to GitHub Container Registry (GHCR); Use a [Personal Access Token](https://github.com/settings/tokens/new) (PAT) to login that has access to this repository and at least `read:packages` scope
3. Login to GitHub Container Registry (GHCR); Use a [Personal Access Token](https://github.com/settings/tokens/new) (PAT) to login that has access to this repository and at least `read:packages` scope
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hf-krechan was macht man denn anstatt der der docker-compose? also konkret geht es mir ums setup hier: https://github.com/Hochfrequenz/edi_energy_mirror/blob/master/ebd-tooling/docker-compose.yaml was mache ich da stattdessen?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

den Schritt braucht man hier nicht, da wir kein Image von GH laden.
Außer du willst das docker image welches mit der Dockerfile hier im repo beschrieben wird auch in die GH Container Registry ablegen.

```bash
docker login ghcr.io -u YOUR_GITHUB_USERNAME
```
5. then run:
4. then run:
```bash
docker compose up --abort-on-container-exit
```
Expand Down
21 changes: 8 additions & 13 deletions docker-compose.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2,22 +2,17 @@ services:
kroki:
image: yuzutech/kroki:0.24.1
ports:
- "8125:8000"
# hardcoded 8125 because: https://github.com/Hochfrequenz/rebdhuhn/issues/205
healthcheck:
test: [ "CMD", "curl", "-f", "http://localhost:8000/health" ]
interval: 10s
timeout: 5s
retries: 3
- ${KROKI_PORT:-8125}:8000
env_file:
- path: .env
required: true # default is true

scrape-and-plot:
build: .
depends_on:
kroki:
condition: service_healthy
volumes:
- ${EBD_DOCX_FILE}:/container/ebd.docx
- ${OUTPUT_DIR}:/container/output
network_mode: host # Allow the container to use the host's network
# this is also a side effect of https://github.com/Hochfrequenz/rebdhuhn/issues/205
# this prevents: requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8125): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fddfb8c9430>: Failed to establish a new connection: [Errno 111] Connection refused'))
env_file:
- path: .env
required: true # default is true

2 changes: 2 additions & 0 deletions env.example
Original file line number Diff line number Diff line change
@@ -1,2 +1,4 @@
EBD_DOCX_FILE=./edi_energy_mirror/edi_energy_de/FV2410/Entscheidungsbaum-DiagrammeundCodelisten-informatorischeLesefassung3.5KonsolidierteLesefassungmitFehlerkorrekturenStand31.07.2024_20250403_20240403.docx
OUTPUT_DIR=./ebd_toolchain/machine-readable_entscheidungsbaumdiagramme/FV2404
KROKI_PORT=8125
KROKI_HOST=kroki
3 changes: 2 additions & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -18,9 +18,10 @@ classifiers = [
]
dependencies = [
"ebdamame>=0.1.3",
"rebdhuhn>=0.2.3",
"rebdhuhn>=0.4.0",
"cattrs",
"click",
"pydantic-settings"
# add all the dependencies here
]
dynamic = ["readme", "version"]
Expand Down
31 changes: 25 additions & 6 deletions requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -4,23 +4,29 @@
#
# pip-compile pyproject.toml
#
annotated-types==0.7.0
# via pydantic
attrs==24.2.0
# via
# cattrs
# ebdamame
# rebdhuhn
cattrs==24.1.2
# via rebdhuhn
# via
# ebd-toolchain (pyproject.toml)
# rebdhuhn
certifi==2024.8.30
# via requests
charset-normalizer==3.4.0
# via requests
click==8.1.7
# via ebdamame
# via
# ebd-toolchain (pyproject.toml)
# ebdamame
colorama==0.4.6
# via click
ebdamame==0.2.1
# via your-favourite-package-name (pyproject.toml)
# via ebd-toolchain (pyproject.toml)
idna==3.10
# via requests
lxml==5.3.0
Expand All @@ -32,15 +38,28 @@ more-itertools==10.5.0
# via ebdamame
networkx==3.4.2
# via rebdhuhn
pydantic==2.9.2
# via pydantic-settings
pydantic-core==2.23.4
# via pydantic
pydantic-settings==2.6.0
# via ebd-toolchain (pyproject.toml)
python-docx==1.1.2
# via ebdamame
rebdhuhn==0.3.1
# via your-favourite-package-name (pyproject.toml)
python-dotenv==1.0.1
# via pydantic-settings
rebdhuhn==0.4.0
# via
# ebd-toolchain (pyproject.toml)
# ebdamame
requests==2.32.3
# via rebdhuhn
svgutils==0.3.4
# via rebdhuhn
typing-extensions==4.12.2
# via python-docx
# via
# pydantic
# pydantic-core
# python-docx
urllib3==2.2.3
# via requests
21 changes: 18 additions & 3 deletions src/ebd_toolchain/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,8 +31,11 @@
import click
from ebdamame import TableNotFoundError, get_all_ebd_keys, get_ebd_docx_tables
from ebdamame.docxtableconverter import DocxTableConverter
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict
from rebdhuhn.graph_conversion import convert_table_to_graph
from rebdhuhn.graphviz import convert_dot_to_svg_kroki, convert_graph_to_dot
from rebdhuhn.kroki import DotToSvgConverter, Kroki
from rebdhuhn.models.ebd_graph import EbdGraph
from rebdhuhn.models.ebd_table import EbdTable
from rebdhuhn.models.errors import (
Expand All @@ -46,6 +49,15 @@
from rebdhuhn.plantuml import convert_graph_to_plantuml


# pylint:disable=too-few-public-methods
class Settings(BaseSettings):
"""settings loaded from environment variable/.env file"""

kroki_port: int = Field(alias="KROKI_PORT")
kroki_host: str = Field(alias="KROKI_HOST")
model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8", extra="ignore")


def _dump_puml(puml_path: Path, ebd_graph: EbdGraph) -> None:
plantuml_code = convert_graph_to_plantuml(ebd_graph)
with open(puml_path, "w+", encoding="utf-8") as uml_file:
Expand All @@ -58,9 +70,9 @@ def _dump_dot(dot_path: Path, ebd_graph: EbdGraph) -> None:
uml_file.write(dot_code)


def _dump_svg(svg_path: Path, ebd_graph: EbdGraph) -> None:
def _dump_svg(svg_path: Path, ebd_graph: EbdGraph, converter: DotToSvgConverter) -> None:
dot_code = convert_graph_to_dot(ebd_graph)
svg_code = convert_dot_to_svg_kroki(dot_code)
svg_code = convert_dot_to_svg_kroki(dot_code, converter)
with open(svg_path, "w+", encoding="utf-8") as svg_file:
svg_file.write(svg_code)

Expand Down Expand Up @@ -98,6 +110,9 @@ def main(input_path: Path, output_path: Path, export_types: list[Literal["puml",
"""
A program to get a machine-readable version of the AHBs docx files published by edi@energy.
"""
settings = Settings() # type:ignore[call-arg]
# read settings from environment variable/.env file
kroki_client = Kroki(kroki_host=f"http://{settings.kroki_host}:{settings.kroki_port}")
if output_path.exists():
click.secho(f"The output directory '{output_path}' exists already.", fg="yellow")
else:
Expand Down Expand Up @@ -164,7 +179,7 @@ def handle_known_error(error: Exception, ebd_key: str) -> None:
click.secho(f"💾 Successfully exported '{ebd_key}.dot' to {dot_path.absolute()}")
if "svg" in export_types:
svg_path = output_path / Path(f"{ebd_key}.svg")
_dump_svg(svg_path, ebd_graph)
_dump_svg(svg_path, ebd_graph, kroki_client)
click.secho(f"💾 Successfully exported '{ebd_key}.svg' to {svg_path.absolute()}")
except PathsNotGreaterThanOneError as known_issue:
handle_known_error(known_issue, ebd_key)
Expand Down
2 changes: 1 addition & 1 deletion tox.ini
Original file line number Diff line number Diff line change
Expand Up @@ -62,7 +62,7 @@ setenv = PYTHONPATH = {toxinidir}/src
commands =
coverage run -m pytest --basetemp={envtmpdir} {posargs}
coverage html --omit .tox/*,unittests/*
coverage report --fail-under 80 --omit .tox/*,unittests/*
coverage report --fail-under 43 --omit .tox/*,unittests/*

[testenv:compile_requirements]
deps =
Expand Down
39 changes: 39 additions & 0 deletions unittests/test_base_settings.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
# mypy: disable-error-code="call-arg"
import pytest
from _pytest.monkeypatch import MonkeyPatch
from pydantic import ValidationError

from ebd_toolchain.main import Settings


def test_settings_from_env(monkeypatch: MonkeyPatch) -> None:
# Mock environment variables to simulate the .env behavior
monkeypatch.setenv("KROKI_PORT", "8000")
monkeypatch.setenv("KROKI_HOST", "localhost")

# Instantiate the Settings class
settings = Settings()

# Assert the settings have loaded correctly from environment
assert settings.kroki_port == 8000
assert settings.kroki_host == "localhost"


def test_settings_missing_required_fields(monkeypatch: MonkeyPatch) -> None:
# Ensure no environment variables are set
monkeypatch.delenv("KROKI_PORT", raising=False)
monkeypatch.delenv("KROKI_HOST", raising=False)

# Expecting ValidationError due to missing required environment variables
with pytest.raises(ValidationError):
Settings(_env_file="foo.env") # change env file to avoid loading from .env


def test_invalid_port_value(monkeypatch: MonkeyPatch) -> None:
# Set invalid environment variables
monkeypatch.setenv("KROKI_PORT", "not_an_integer")
monkeypatch.setenv("KROKI_HOST", "localhost")

# Expecting ValidationError because KROKI_PORT is not a valid integer
with pytest.raises(ValidationError):
Settings()