llm-api


A fast CPU-based API for OpenChat 3.6, hosted on Hugging Face Spaces. It uses CTranslate2 as its inference engine to speed up generation.

Usage

Send a POST request to the endpoint with cURL, as in the following.

curl -N 'https://winstxnhdw-llm-api.hf.space/api/v1/generate' \
     -H 'Content-Type: application/json' \
     -d \
     '{
         "instruction": "What is the capital of Japan?"
     }'
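
The same request can be sketched in Python with only the standard library. This is a minimal illustration, not part of the project: the helper names `build_request` and `generate` are hypothetical, and it assumes the endpoint streams its response as UTF-8 text (suggested by the `-N` flag above, but not confirmed here).

```python
import json
import urllib.request
from typing import Iterator

# Endpoint taken from the cURL example above.
API_URL = "https://winstxnhdw-llm-api.hf.space/api/v1/generate"


def build_request(instruction: str, url: str = API_URL) -> urllib.request.Request:
    """Build the POST request that mirrors the cURL call (hypothetical helper)."""
    payload = json.dumps({"instruction": instruction}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(instruction: str, url: str = API_URL, timeout: float = 60.0) -> Iterator[str]:
    """Yield the response line by line as it arrives.

    Assumption: the server streams UTF-8 text; the exact wire format is not
    documented in this README.
    """
    request = build_request(instruction, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        for raw_line in response:
            yield raw_line.decode("utf-8", errors="replace")


# Example usage (performs a real network call):
#   for chunk in generate("What is the capital of Japan?"):
#       print(chunk, end="", flush=True)
```

Keeping request construction separate from the network call makes the payload easy to inspect without hitting the hosted Space.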

Development

First, install the required dependencies for your editor with the following.

poetry install

Then, build and start the server locally with the following. Once it is running, the Swagger UI is available at localhost:7860/api/docs.

docker build -f Dockerfile.build -t llm-api .
docker run --rm -e APP_PORT=7860 -p 7860:7860 llm-api
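
To confirm that the container has finished starting, a small polling script can help. The sketch below is an illustration only: `docs_url`, `is_up`, and `wait_until_ready` are hypothetical helpers, and it assumes the Swagger UI answers with HTTP 200 at the address mentioned above once the server is ready.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def docs_url(port: int = 7860, host: str = "localhost") -> str:
    """Return the Swagger UI address for a locally running server."""
    return f"http://{host}:{port}/api/docs"


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200 (assumed readiness signal)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error: not ready yet.
        return False


def wait_until_ready(
    url: str,
    attempts: int = 30,
    delay: float = 1.0,
    probe: Callable[[str], bool] = is_up,
) -> bool:
    """Poll the URL until the probe succeeds or the attempts run out."""
    for _ in range(attempts):
        if probe(url):
            return True
        time.sleep(delay)
    return False


# Example usage after `docker run`:
#   if wait_until_ready(docs_url()):
#       print("Swagger UI is ready at", docs_url())
```

The `probe` parameter is injectable so the retry logic can be exercised without a running server.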