0.6.0
This version renames the main application to `dllama`. From now on, to run the root node or a worker, you need to compile `dllama` and run the `dllama` application.
```
make dllama
./dllama inference --model ... --tokenizer ...
```
This version also introduces an early-stage HTTP API compatible with the OpenAI API (only the `/v1/chat/completions` endpoint). You can find how to run the API here. A big shout out to @DifferentialityDevelopment for implementing this feature. #39