Documentation
Hugoch committed Sep 12, 2024
1 parent 3ed1515 commit f280ead
Showing 3 changed files with 24 additions and 2 deletions.
2 changes: 2 additions & 0 deletions Dockerfile
@@ -3,5 +3,7 @@ WORKDIR /usr/src/text-generation-inference-benchmark
COPY . .
RUN cargo install --path .
FROM debian:bullseye-slim
RUN mkdir -p /opt/text-generation-inference-benchmark/results
WORKDIR /opt/text-generation-inference-benchmark
COPY --from=builder /usr/local/cargo/bin/text-generation-inference-benchmark /usr/local/bin/text-generation-inference-benchmark
CMD ["text-generation-inference-benchmark"]
22 changes: 21 additions & 1 deletion README.md
@@ -1,13 +1,33 @@
-# text-generation-inference-benchmark
+# Text Generation Inference benchmarking tool

A lightweight benchmarking tool for inference servers.
Benchmarks using constant arrival rate or constant virtual user count.

![ui.png](assets%2Fui.png)

## TODO

- [ ] Check results
- [ ] Allow for multiturn prompts for speculation
- [ ] Push results to Optimum benchmark backend
- [ ] Script to generate plots from results

## Running a benchmark

```
# start a TGI/vLLM server somewhere, then run benchmark...
# ... we mount results to the current directory
$ docker run \
--rm \
-it \
--net host \
-v $(pwd):/opt/text-generation-inference-benchmark/results \
registry.internal.huggingface.tech/api-inference/text-generation-inference-benchmark:latest \
text-generation-inference-benchmark \
--tokenizer-name "Qwen/Qwen2-7B" \
--max-vus 800 \
--url http://localhost:8080 \
--warmup 20s
```

Results will be saved in `results.json` in the current directory.
2 changes: 1 addition & 1 deletion src/lib.rs
@@ -110,7 +110,7 @@ pub async fn run(url: String,
         Ok(results) => {
             info!("Throughput is {requests_throughput} req/s",requests_throughput = results.get_results()[0].successful_request_rate().unwrap());
             let report = benchmark.get_report();
-            let path = "results.json".to_string();
+            let path = "results/results.json".to_string();
             BenchmarkReportWriter::json(report, &path).await.unwrap();
         },
         Err(e) => {
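A note on the new path: inside the image, `results/` exists because the Dockerfile now runs `mkdir -p`. When the binary is run outside the container, that directory may be missing. If the writer does not create it itself, the `.unwrap()` on the write would presumably panic. A minimal sketch of one way to guard against this, assuming a hypothetical helper `ensure_parent_dir` (not part of this commit) and using `fs::write` as a stand-in for `BenchmarkReportWriter::json`:

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical helper (not part of the commit): create the parent
/// directory of `path` if needed, then return the path for writing.
fn ensure_parent_dir(path: &str) -> io::Result<PathBuf> {
    let path = PathBuf::from(path);
    if let Some(parent) = path.parent() {
        // For a bare file name, `parent()` is an empty path: nothing to create.
        if !parent.as_os_str().is_empty() {
            fs::create_dir_all(parent)?;
        }
    }
    Ok(path)
}

fn main() -> io::Result<()> {
    // Same relative path as in the commit.
    let path = ensure_parent_dir("results/results.json")?;
    // Stand-in for `BenchmarkReportWriter::json(report, &path)`.
    fs::write(&path, "{}")?;
    println!("wrote {}", path.display());
    Ok(())
}
```

`fs::create_dir_all` succeeds when the directory already exists, so the Docker case (where `results/` is a mounted volume) would be unaffected.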