From 90c2c98b9c4ca456686e71ce73cc126710844152 Mon Sep 17 00:00:00 2001
From: Schalk
Date: Wed, 12 Jun 2024 11:58:27 +0200
Subject: [PATCH] Update README.

---
 README.md | 95 +++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 89 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index 89808b4..5464f54 100644
--- a/README.md
+++ b/README.md
@@ -12,19 +12,104 @@
 
 
 
-A DS team repository for shared data ingestion utilities.
+A DS team repository for shared data ingestion utilities.
+
+## setup
+
+### 1. Ensure the necessary global environment variables are set in your Python environment.
+
+- For aaq:
+
+```
+AAQ_API_KEY=""
+AAQ_API_BASE_URL=""
+
+```
+
+- For content_repo:
+
+```
+CONTENT_REPO_API_KEY=""
+CONTENT_REPO_BASE_URL=""
+
+```
+
+- For flow_results:
+
+```
+FLOW_RESULTS_API_KEY=""
+FLOW_RESULTS_API_BASE_URL=""
+
+```
+
+- For rapidpro:
+
+```
+RAPIDPRO_API_KEY=""
+RAPIDPRO_API_BASE_URL=""
+
+```
+- For survey:
+
+```
+SURVEY_API_KEY=""
+SURVEY_API_BASE_URL=""
+
+```
+
+- For turn: (Original Data Export API)
+
+```
+TURN_API_KEY=""
+TURN_API_BASE_URL=""
+
+```
+
+If you want to use the s3 utilities (that allow you to read and write specific parquet files amongst other things), the following variables should be set:
+
+```
+S3_KEY=""
+S3_SECRET=""
+
+```
+
+### 2. Install the `rdw-ingestion-tools` package
+
+There are 2 ways of doing this.
+
+- Versioned install from github:
+
+`rdw-ingestion-tools` is public!
+
+```
+pip3 install git+https://github.com/praekeltfoundation/rdw-ingestion-tools@v0.3.4
+```
+
+- From clone (with [poetry](https://python-poetry.org/docs/)). This is recommended:
+
+```
+git clone git@github.com:praekeltfoundation/rdw-ingestion-tools.git
+
+poetry install
+
+```
 
 ## usage
 
-To interact with an API.
+For more examples on how to interact with particular API endpoints, see the `examples` file. These
+contain examples for each supported third party service and the endpoint associated with each.
+
+For instance, to get flows from the Flow Results Specification API, the example is as follows:
 
 ```
 from api.flow_results import pyFlows
 
-pyFlows.flows.get_ids()
+flows = pyFlows.flows.get_flows()
+
+print(flows.keys())
 ```
 
-To access some of the s3 utilities used in ingestion.
+To access some of the s3 utilities used in ingestion.
 
 ```
 import os
@@ -39,5 +124,3 @@ pyS3.s3.get_filenames(bucket=bucket, prefix=prefix)
 ## to-do
 
 - Add tests - yes, I am a bad developer for not having any yet.
-- Add release pipeline when tests are done.
-- Write a decent README.
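
The setup step added in this patch lists the environment variables each service expects but gives no way to confirm they are set. A minimal sketch of a pre-flight check follows, assuming an unset or empty variable counts as missing. The variable names are copied from the README text above; the `REQUIRED_ENV_VARS` mapping and the `missing_env_vars` helper are hypothetical and not part of `rdw-ingestion-tools`.

```python
import os

# Variable names are copied from the README's setup section. The mapping
# and the helper below are illustrative assumptions, not package API.
REQUIRED_ENV_VARS = {
    "aaq": ("AAQ_API_KEY", "AAQ_API_BASE_URL"),
    "content_repo": ("CONTENT_REPO_API_KEY", "CONTENT_REPO_BASE_URL"),
    "flow_results": ("FLOW_RESULTS_API_KEY", "FLOW_RESULTS_API_BASE_URL"),
    "rapidpro": ("RAPIDPRO_API_KEY", "RAPIDPRO_API_BASE_URL"),
    "survey": ("SURVEY_API_KEY", "SURVEY_API_BASE_URL"),
    "turn": ("TURN_API_KEY", "TURN_API_BASE_URL"),
    "s3": ("S3_KEY", "S3_SECRET"),
}


def missing_env_vars(service, environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS[service] if not env.get(name)]


if __name__ == "__main__":
    # Report the status of every service listed in the README.
    for service in REQUIRED_ENV_VARS:
        missing = missing_env_vars(service)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{service}: {status}")
```

Running the script before calling any of the API wrappers reports, per service, which variables still need a value. Passing a plain dict as `environ` makes the helper easy to exercise without touching the real environment.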