A webapp to explore Citi Bike system data. Composed of four services:
- `pipeline`: a data pipeline that extracts historical data from the `tripdata` bucket, transforms the data into Parquet, and uploads it to S3
- `clickhouse`: a ClickHouse server that reads the Parquet data and further normalizes it for querying
- `map`: a Next.js app that exposes an interface to explore the data (including serverless API routes)
- `mapdata`: provisions AWS infrastructure and Lambda implementations for fetching static data that `map` depends on

There are two compose files: `docker-compose.etl.yml` and `docker-compose.local.yml`. The former runs all three services, while the latter skips the pipeline (as well as ClickHouse initialization, if already initialized).
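
As a rough sketch of how one might choose between the two compose files, the snippet below maps a mode name to the corresponding file. The `compose_file` helper and the `etl`/`local` mode names are hypothetical and not part of this repository; the commented invocation assumes Docker Compose v2 (`docker compose`) is installed.

```shell
# Hypothetical helper (not part of this repo): maps a mode name to one of
# the two compose files named above.
compose_file() {
  case "$1" in
    etl)   echo "docker-compose.etl.yml" ;;    # full run, including the pipeline
    local) echo "docker-compose.local.yml" ;;  # skips the pipeline
    *)     echo "usage: compose_file etl|local" >&2; return 1 ;;
  esac
}

# Example invocation (assumes Docker Compose v2; left commented out here):
# docker compose -f "$(compose_file etl)" up
```

Use the `etl` mode for a first run, when the Parquet data still has to be produced, and the `local` mode afterwards to avoid repeating the pipeline step.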