Update README.
Schalk1e committed Jun 12, 2024
1 parent 76e3ce8 commit 90c2c98
Showing 1 changed file with 89 additions and 6 deletions.
95 changes: 89 additions & 6 deletions README.md
</div>


A DS team repository for shared data ingestion utilities.

## setup

### 1. Ensure the necessary global environment variables are set in your Python environment.

- For aaq:

```
AAQ_API_KEY="<secret>"
AAQ_API_BASE_URL="<url>"
```

- For content_repo:

```
CONTENT_REPO_API_KEY="<secret>"
CONTENT_REPO_BASE_URL="<url>"
```

- For flow_results:

```
FLOW_RESULTS_API_KEY="<secret>"
FLOW_RESULTS_API_BASE_URL="<url>"
```

- For rapidpro:

```
RAPIDPRO_API_KEY="<secret>"
RAPIDPRO_API_BASE_URL="<url>"
```

- For survey:

```
SURVEY_API_KEY="<secret>"
SURVEY_API_BASE_URL="<url>"
```

- For turn (original Data Export API):

```
TURN_API_KEY="<secret>"
TURN_API_BASE_URL="<url>"
```

If you want to use the S3 utilities (which let you read and write specific Parquet files, among other things), set the following variables:

```
S3_KEY="<key>"
S3_SECRET="<secret>"
```
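
Before using a given service, it can help to confirm that its variables are actually set. The following is a minimal sketch, not part of `rdw-ingestion-tools`: the variable names are copied from the lists above, while `REQUIRED_VARS` and `missing_env_vars` are hypothetical names introduced only for illustration.

```python
import os

# Variable names are taken from the lists above. The grouping and the helper
# below are illustrative and are not provided by rdw-ingestion-tools.
REQUIRED_VARS = {
    "aaq": ["AAQ_API_KEY", "AAQ_API_BASE_URL"],
    "content_repo": ["CONTENT_REPO_API_KEY", "CONTENT_REPO_BASE_URL"],
    "flow_results": ["FLOW_RESULTS_API_KEY", "FLOW_RESULTS_API_BASE_URL"],
    "rapidpro": ["RAPIDPRO_API_KEY", "RAPIDPRO_API_BASE_URL"],
    "survey": ["SURVEY_API_KEY", "SURVEY_API_BASE_URL"],
    "turn": ["TURN_API_KEY", "TURN_API_BASE_URL"],
    "s3": ["S3_KEY", "S3_SECRET"],
}


def missing_env_vars(service, environ=None):
    """Return the variables for `service` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS[service] if not env.get(name)]


if __name__ == "__main__":
    # Report the status of every service against the real environment.
    for service in REQUIRED_VARS:
        missing = missing_env_vars(service)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{service}: {status}")
```

Passing a plain dictionary as `environ` makes the check easy to try without touching the real environment.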

### 2. Install the `rdw-ingestion-tools` package

There are two ways to do this.

- Versioned install from GitHub:

`rdw-ingestion-tools` is public!

```
pip3 install git+https://github.com/praekeltfoundation/rdw-ingestion-tools@v0.3.4
```

- From a clone, with [poetry](https://python-poetry.org/docs/) (recommended):

```
git clone git@github.com:praekeltfoundation/rdw-ingestion-tools.git
cd rdw-ingestion-tools
poetry install
```
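
After either install route, a quick check that the package can be found may save some debugging. This is a small sketch under one assumption: the top-level module is named `api`, as the import in this README's usage example suggests. `is_importable` is a hypothetical helper, not something the package provides.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "api" is assumed from the usage example in this README; adjust the
    # name if the package layout differs.
    print("api importable:", is_importable("api"))
```

With a poetry install, run the check inside the project environment, for example through `poetry run python`.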

## usage

For examples of how to interact with particular API endpoints, see the `examples` file, which covers each supported third-party service and its associated endpoints.

For instance, to get flows from the Flow Results Specification API:

```
from api.flow_results import pyFlows
pyFlows.flows.get_ids()
flows = pyFlows.flows.get_flows()
print(flows.keys())
```

To access some of the S3 utilities used in ingestion:

```
import os
# ... (lines collapsed in the diff view)
pyS3.s3.get_filenames(bucket=bucket, prefix=prefix)
```

## to-do

- Add tests - yes, I am a bad developer for not having any yet.
- Add release pipeline when tests are done.
- Write a decent README.
