From e90bf8bc6eb408b4722883925d521b731b902680 Mon Sep 17 00:00:00 2001
From: turboderp <11859846+turboderp@users.noreply.github.com>
Date: Fri, 16 Aug 2024 21:00:20 +0200
Subject: [PATCH] Update README.md

---
 README.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/README.md b/README.md
index 3d6b7ec6..4294f206 100644
--- a/README.md
+++ b/README.md
@@ -2,6 +2,12 @@
 
 ExLlamaV2 is an inference library for running local LLMs on modern consumer GPUs.
 
+The official and recommended backend server for ExLlamaV2 is [TabbyAPI](https://github.com/theroyallab/tabbyAPI/),
+which provides an OpenAI-compatible API for local or remote inference, with extended features like HF model
+downloading, embedding model support, and HF Jinja2 chat template support.
+
+See the [wiki](https://github.com/theroyallab/tabbyAPI/wiki/1.-Getting-Started) for help getting started.
+
 ## New in v0.1.0+:
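
Because TabbyAPI exposes an OpenAI-compatible API, any standard OpenAI client can talk to an ExLlamaV2 model it is serving. A minimal sketch using the official `openai` Python package follows; the base URL, API key, and model name are assumptions to replace with the values from your own TabbyAPI configuration (see the wiki linked above).

```python
# Minimal sketch: chatting with a local TabbyAPI server through its
# OpenAI-compatible endpoint. The base URL, API key, and model name
# below are assumptions -- substitute the values from your own
# TabbyAPI config.
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:5000/v1",  # assumed host/port of the local TabbyAPI server
    api_key="YOUR_TABBY_API_KEY",         # placeholder; use the key from your TabbyAPI config
)

response = client.chat.completions.create(
    model="my-exl2-model",  # hypothetical name of the model loaded in TabbyAPI
    messages=[{"role": "user", "content": "Give me a one-line summary of ExLlamaV2."}],
)
print(response.choices[0].message.content)
```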