Commit

Document how to send batched inputs (huggingface#222)
Co-authored-by: OlivierDehaene <Olivier.dehaene@gmail.com>
osanseviero and OlivierDehaene authored Apr 2, 2024
1 parent a50bb0a commit 9bd6428
Showing 1 changed file with 23 additions and 3 deletions.
26 changes: 23 additions & 3 deletions docs/source/en/quick_tour.md
@@ -39,12 +39,12 @@ docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingf

<Tip>

-Here we pass a `revision=refs/pr/5`, because the `safetensors` variant of this model is currently in a pull request.
+Here we pass a `revision=refs/pr/5` because the `safetensors` variant of this model is currently in a pull request.
We also recommend sharing a volume with the Docker container (`volume=$PWD/data`) to avoid downloading weights every run.

</Tip>

-Once you have deployed a model you can use the `embed` endpoint by sending requests:
+Once you have deployed a model, you can use the `embed` endpoint by sending requests:

```bash
curl 127.0.0.1:8080/embed \
@@ -72,7 +72,7 @@ volume=$PWD/data
docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.2 --model-id $model --revision $revision
```

-Once you have deployed a model you can use the `rerank` endpoint to rank the similarity between a query and a list
+Once you have deployed a model, you can use the `rerank` endpoint to rank the similarity between a query and a list
of texts:

```bash
@@ -101,3 +101,23 @@ curl 127.0.0.1:8080/predict \
-d '{"inputs":"I like you."}' \
-H 'Content-Type: application/json'
```

## Batching

You can send multiple inputs in a single request as a batch. For example, for embeddings:

```bash
curl 127.0.0.1:8080/embed \
-X POST \
-d '{"inputs":["Today is a nice day", "I like you"]}' \
-H 'Content-Type: application/json'
```

And for Sequence Classification:

```bash
curl 127.0.0.1:8080/predict \
-X POST \
-d '{"inputs":[["I like you."], ["I hate pineapples"]]}' \
-H 'Content-Type: application/json'
```
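
When the texts come from shell variables rather than a hand-written string, the batched body has to be assembled with correct quoting. The following is a minimal sketch of one way to do that, not part of text-embeddings-inference itself: `build_batch_payload` and `embed_batch` are hypothetical helper names, and the escaping assumes the texts contain no newlines or control characters.

```bash
# Hypothetical helper (not part of text-embeddings-inference): build a
# {"inputs":[...]} JSON body from the function's arguments.
# Assumption: texts contain no newlines or control characters; only
# backslashes and double quotes are escaped here.
build_batch_payload() (
  payload='{"inputs":['
  sep=''
  for text in "$@"; do
    # Escape backslashes first, then double quotes.
    escaped=$(printf '%s' "$text" | sed 's/\\/\\\\/g; s/"/\\"/g')
    payload="${payload}${sep}\"${escaped}\""
    sep=','
  done
  printf '%s\n' "${payload}]}"
)

# Hypothetical wrapper: POST the batch to the `embed` endpoint shown above.
embed_batch() {
  curl 127.0.0.1:8080/embed \
    -X POST \
    -d "$(build_batch_payload "$@")" \
    -H 'Content-Type: application/json'
}
```

With a server running as in the earlier examples, `embed_batch "Today is a nice day" "I like you"` would send a body equivalent to the hand-written embeddings batch above. For arbitrary text, a real JSON encoder is a safer choice than this hand-rolled escaping.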
