Skip to content

Receives the data from the rd-commondata-api and loads into the database

License

Notifications You must be signed in to change notification settings

hmcts/rd-commondata-dataload

Repository files navigation

rd-commondata-dataload

Reference data commondata dataload

RDCD is batch application and RDCD batch is scheduled with Kubernetes which runs once in a day per cluster.

RDCD consume data files from an external source, transform that data into the destination format and load the data into RDCD database.

RDCD Uses Data-Ingestion-Library (https://github.com/hmcts/data-ingestion-lib) and configure camel routes with data-ingestion-lib & RDCD custom configuration files & crates camel integration framework which reads CSV files & Stores the CSV files data in RDCD Database.

Library should be included with build.gradle like follows compile group: 'uk.gov.hmcts.reform', name: 'data-ingestion-lib', version: '0.5.2.4' And release versions library can be found in bintray (https://bintray.com/hmcts/hmcts-maven/data-ingestion-lib)

Consumption of files from a SFTP server

The files received from SFTP server are encrypted using GPG encryption (which complies with OpenPGP standards).

An internal SFTP server (behind a F5 Load balancer) will poll the files at periodic intervals from the external SFTP server. It will forward the files onto the untrusted network that Palo Alto is listening on.

The Palo Alto untrusted interfaces will form the Palo Alto backend pool, used by requests matched by the path-based rule.

The files are decrypted and then scanned and if everything is ok then trusted traffic is sent to a configured endpoint, in this case an Azure Blob Storage account.

Data Transformation and Load - This is achieved through a K8S scheduler and Apache Camel.

Kubernetes scheduler triggers Apache Camel routes which process files stored in Azure blob storage and persists it RDCD database.

Plugins

The Common-data-load contains the following plugins:

Building and deploying the application

Building the application The project uses Gradle as a build tool. It already contains ./gradlew wrapper script, so there's no need to install gradle.

Environment Vars

If running locally for development or testing you will need to add (Application.yml and application-camel-routes-common.yaml) and set the following environment variables

  • ACCOUNT_NAME:
  • ACCOUNT_KEY:
  • CONTAINER_NAME: jud-ref-data

Running the application

Please Make sure you are connected to the VPN before running the Application. (https://portal.platform.hmcts.net/vdesk/webtop.eui?webtop=/Common/webtop_full&webtop_type=webtop_full)

Create the image of the application by executing the following command:

  ./gradlew assemble

Create docker image:

  docker-compose build

Run the distribution (created in `build/install/rd-judicial-data-load directory) by executing the following command:

  docker-compose up

After, you can start the application from the current source files using Gradle as follows:

./gradlew clean bootRun

This will start the API container exposing the application's port (set to 8100 in this template app).

In order to test if the application is up, you can call its health endpoint:

  curl http://localhost:8100/health

To build the project execute the following command: ./gradlew build

The application exposes health endpoint (http://localhost:8100/health) and metrics endpoint (http://localhost:8100/metrics).