
# How to run a local LLM with a chat UI in four easy steps

This is the repo for the article *Set up a local LLM on CPU with chat UI in 15 minutes*.

The process consists of four simple steps (a command-line sketch follows the list):

  1. Select a model on Hugging Face, e.g. `RJuro/munin-neuralbeagle-7b`
  2. Quantize the model by running `quantize.py`
  3. Wrap the quantized model in an Ollama image
  4. Build and run a Docker container that serves the chat GUI, e.g. Chatbot Ollama
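
A minimal end-to-end sketch of the four steps. It assumes `quantize.py` takes no arguments and writes a GGUF file named `munin-neuralbeagle-7b.Q4_K_M.gguf` (both are assumptions; check the script for the actual interface and output name), and it uses the Chatbot Ollama repo for the GUI:

```bash
# Steps 1-2: quantize the selected model. quantize.py comes from this repo;
# whether it takes arguments is an assumption -- check the script itself.
python quantize.py

# Step 3: wrap the quantized GGUF in an Ollama model. The Modelfile below is
# a minimal example, and the .gguf filename is an assumption.
cat > Modelfile <<'EOF'
FROM ./munin-neuralbeagle-7b.Q4_K_M.gguf
EOF
ollama create munin -f Modelfile

# Step 4: build and run the Chatbot Ollama GUI. OLLAMA_HOST tells the UI
# where the Ollama server is listening.
git clone https://github.com/ivanfioravanti/chatbot-ollama.git
cd chatbot-ollama
docker build -t chatbot-ollama .
docker run -p 3000:3000 -e OLLAMA_HOST="http://host.docker.internal:11434" chatbot-ollama
```

Once the container is up, the chat UI should be reachable at http://localhost:3000. The `OLLAMA_HOST` value above assumes Docker Desktop, where `host.docker.internal` resolves to the host machine running the Ollama server; adjust it for your setup.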