From ca20d9f33b4e164c0bbdc9fe1dd182acd5426694 Mon Sep 17 00:00:00 2001
From: Chris Riccomini
Date: Thu, 5 Oct 2023 11:18:48 -0700
Subject: [PATCH] Add Gateway docs to README

I also fixed the docs for the CLI, gateway, and API.
---
 README.md | 146 +++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 101 insertions(+), 45 deletions(-)

diff --git a/README.md b/README.md
index c7e6abb..514e2e2 100644
--- a/README.md
+++ b/README.md
@@ -1,12 +1,12 @@
- recap + recap
 ## What is Recap?

 Recap reads and writes schemas from web services, databases, and schema registries in a standard format.

-You can use Recap to build data contract tools, schema transpilers, compatibility checkers, data catalogs, schema registries, metadata caches, and a lot more.
+⭐️ _If you like this project, please give it a star! It helps the project get more visibility._

 ## Table of Contents

@@ -16,6 +16,7 @@ You can use Recap to build data contract tools, schema transpilers, compatibilit
 * [Usage](#usage)
   * [CLI](#cli)
   * [Gateway](#gateway)
+  * [Registry](#registry)
   * [API](#api)
   * [Docker](#docker)
 * [Schema](#schema)
@@ -55,33 +56,12 @@ See `pyproject.toml` for a list of optional dependencies.

 ### CLI

-Recap comes with a command line interface that can list and read schemas.
+Recap comes with a command line interface that can list and read schemas from external systems.

-Configure Recap to connect to one or more of your systems:
+List the children of a URL:

 ```bash
-recap add my_pg postgresql://user:pass@host:port/dbname
-```
-
-List the paths in your system:
-
-```bash
-recap ls my_pg
-```
-
-```json
-[
-  "postgres",
-  "template0",
-  "template1",
-  "testdb"
-]
-```
-
-Recap models Postgres paths as `system/database/schema/table`. Keep drilling down:
-
-```bash
-recap ls my_pg/testdb
+recap ls postgresql://user:pass@host:port/testdb
 ```

 ```json
@@ -93,10 +73,10 @@ recap ls my_pg/testdb
 ]
 ```

-Now we have a path to a testdb's public schemas:
+Keep drilling down:

 ```bash
-recap ls my_pg/testdb/public
+recap ls postgresql://user:pass@host:port/testdb/public
 ```

 ```json
@@ -105,10 +85,10 @@ recap ls my_pg/testdb/public
 ]
 ```

-Read the schema:
+Read the schema for the `test_types` table as a Recap struct:

 ```bash
-recap schema my_pg/testdb/public/test_types
+recap schema postgresql://user:pass@host:port/testdb/public/test_types
 ```

 ```json
@@ -128,39 +108,95 @@ recap schema my_pg/testdb/public/test_types

 Recap comes with a stateless HTTP/JSON gateway that can list and read schemas.

-Configure Recap to connect to one or more of your systems:
+Start the server at [http://localhost:8000](http://localhost:8000):
+
+```bash
+recap serve
+```
+
+List the schemas in a PostgreSQL database:
+
+```bash
+curl http://localhost:8000/gateway/ls/postgresql://user:pass@host:port/testdb
+```
+
+```json
+["pg_toast","pg_catalog","public","information_schema"]
+```
+
+And read a schema:

 ```bash
-recap add my_pg postgresql://user:pass@host:port/dbname
+curl http://localhost:8000/gateway/schema/postgresql://user:pass@host:port/testdb/public/test_types
+```
+
+```json
+{"type":"struct","fields":[{"type":"int64","name":"test_bigint","optional":true}]}
 ```

+The gateway fetches schemas from external systems in realtime and returns them as Recap schemas.
+
+An OpenAPI schema is available at [http://localhost:8000/docs](http://localhost:8000/docs).
+
+### Registry
+
+You can store schemas in Recap's schema registry.
+
 Start the server at [http://localhost:8000](http://localhost:8000):

 ```bash
 recap serve
 ```

-List the schemas in your system:
+Put a schema in the registry:
+
+```bash
+curl -X POST \
+    -H "Content-Type: application/x-recap+json" \
+    -d '{"type":"struct","fields":[{"type":"int64","name":"test_bigint","optional":true}]}' \
+    http://localhost:8000/registry/some_schema
+```
+
+Get the schema (and version) from the registry:

 ```bash
-$ curl http://localhost:8000/ls/my_pg
+curl http://localhost:8000/registry/some_schema
 ```

 ```json
-["postgres","template0","template1","testdb"]
+[{"type":"struct","fields":[{"type":"int64","name":"test_bigint","optional":true}]},1]
 ```

-And read a schema:
+Put a new version of the schema in the registry:
+
+```bash
+curl -X POST \
+    -H "Content-Type: application/x-recap+json" \
+    -d '{"type":"struct","fields":[{"type":"int32","name":"test_int","optional":true}]}' \
+    http://localhost:8000/registry/some_schema
+```
+
+List schema versions:

 ```bash
-curl http://localhost:8000/schema/my_pg/testdb/public/test_types
+curl http://localhost:8000/registry/some_schema/versions
 ```

 ```json
-{"type":"struct","fields":[{"type":"int64","name":"test_bigint","optional":true}]}
+[1,2]
 ```

-The gateway fetches schemas from external systems in realtime and returns them as Recap schemas.
+Get a specific version of the schema:
+
+```bash
+curl http://localhost:8000/registry/some_schema/versions/1
+```
+
+```json
+[{"type":"struct","fields":[{"type":"int64","name":"test_bigint","optional":true}]},1]
+```
+
+The registry uses [fsspec](https://filesystem-spec.readthedocs.io/en/latest/) to store schemas in a variety of filesystems like S3, GCS, ABS, and the local filesystem. See the [registry](https://recap.build/docs/registry/) docs for more details.

 An OpenAPI schema is available at [http://localhost:8000/docs](http://localhost:8000/docs).

@@ -176,8 +212,8 @@ Read a schema from PostgreSQL:
 ```python
 from recap.clients import create_client

-client = create_client("postgresql://user:pass@host:port/dbname")
-struct = client.get_schema("testdb", "public", "test_types")
+with create_client("postgresql://user:pass@host:port/testdb") as c:
+    c.schema("testdb", "public", "test_types")
 ```

 Convert the schema to Avro, Protobuf, and JSON schemas:
@@ -185,11 +221,11 @@ Convert the schema to Avro, Protobuf, and JSON schemas:
 ```python
 from recap.converters.avro import AvroConverter
 from recap.converters.protobuf import ProtobufConverter
-from recap.converters.json_schema import JsonSchemaConverter
+from recap.converters.json_schema import JSONSchemaConverter

 avro_schema = AvroConverter().from_recap(struct)
 protobuf_schema = ProtobufConverter().from_recap(struct)
-json_schema = JsonSchemaConverter().from_recap(struct)
+json_schema = JSONSchemaConverter().from_recap(struct)
 ```

 Transpile schemas from one format to another:
@@ -213,14 +249,34 @@ struct = JSONSchemaConverter().to_recap(json_schema)
 avro_schema = AvroConverter().from_recap(struct)
 ```

+Store schemas in Recap's schema registry:
+
+```python
+from recap.storage.registry import RegistryStorage
+from recap.types import StructType, IntType
+
+storage = RegistryStorage("file:///tmp/recap-registry-storage")
+version = storage.put(
+    "postgresql://localhost:5432/testdb/public/test_table",
+    StructType(fields=[IntType(32)])
+)
+storage.get("postgresql://localhost:5432/testdb/public/test_table")
+
+# Get all versions of a schema
+versions = storage.versions("postgresql://localhost:5432/testdb/public/test_table")
+
+# List all schemas in the registry
+schemas = storage.ls()
+```
+
 ### Docker

-Recap's gateway is also available as a Docker image:
+Recap's gateway and registry are also available as a Docker image:

 ```bash
 docker run \
     -p 8000:8000 \
-    -e "RECAP_SYSTEMS__PG=postgresql://user:pass@localhost:5432/testdb" \
+    -e 'RECAP_URLS=["postgresql://user:pass@localhost:5432/testdb"]' \
     ghcr.io/recap-build/recap:latest
 ```
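
Reviewer note: the curl examples added in this patch imply a small set of URL shapes for the gateway (`/gateway/ls/...`, `/gateway/schema/...`) and the registry (`/registry/<name>`, `/registry/<name>/versions/<n>`). The sketch below shows how a client might build those URLs in Python. It assumes only what the curl examples show; the helper names (`gateway_url`, `registry_url`, `fetch_json`) are hypothetical and not part of Recap, and the system URL is appended unencoded because that is how the README's curl commands pass it.

```python
import json
from urllib.request import urlopen

# Assumption: `recap serve` listens on the address shown in the README.
BASE_URL = "http://localhost:8000"


def gateway_url(action, system_url, base_url=BASE_URL):
    """Build <base>/gateway/<action>/<system_url>, mirroring the curl examples."""
    if action not in ("ls", "schema"):
        raise ValueError(f"unsupported gateway action: {action!r}")
    # The system URL is appended as-is, as in the README's curl commands.
    return f"{base_url.rstrip('/')}/gateway/{action}/{system_url}"


def registry_url(name, version=None, base_url=BASE_URL):
    """Build <base>/registry/<name>, optionally pinned to /versions/<version>."""
    url = f"{base_url.rstrip('/')}/registry/{name}"
    if version is not None:
        url += f"/versions/{version}"
    return url


def fetch_json(url):
    """GET a URL and decode the JSON body (hypothetical helper; needs a running server)."""
    with urlopen(url) as response:
        return json.load(response)


# Example usage against a running server (not executed here):
#   fetch_json(gateway_url("ls", "postgresql://user:pass@host:port/testdb"))
#   fetch_json(registry_url("some_schema", version=1))
```

Keeping URL construction separate from the HTTP call makes the route shapes easy to check without a server. Credentials containing reserved characters would likely need percent-encoding, which this sketch does not attempt.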