Azure bicep script to deploy and host TabbyML in the cloud using Azure Cognitive Services Account models.


Welcome to TabbyML-AzDeploy

Introduction

This repository contains Azure Bicep scripts to deploy TabbyML on Microsoft Azure. The goal is to make it easier to run this amazing open-source project in a scalable, cloud-native environment.

Acknowledgment

🚀 Huge thanks to the original developers of TabbyML! 🚀

TabbyML is an incredible open-source project, and this repository exists solely to provide an Azure deployment method. All credit for the core application goes to the TabbyML team. If you find this useful, please consider supporting or contributing to TabbyML directly!


Deployment Instructions

Prerequisites

Before deploying, ensure you have:

- An active Azure subscription and a resource group to deploy into
- The Azure CLI (az) installed
- Git
- Docker Desktop (used later to generate the config.toml and SQLite files locally)
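A quick way to sanity-check these prerequisites from a terminal (these commands only print versions and make no changes):

```shell
# Verify the tools used in the steps below are installed and on PATH.
az version        # Azure CLI
git --version     # Git
docker --version  # Docker Desktop (needed for the local TabbyML run later)
```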

Deployment Steps

  1. Open your favourite terminal

  2. Clone this repository

git clone https://github.com/Mornante/TabbyML-AzDeploy.git
  3. Navigate to the cloned repository
cd TabbyML-AzDeploy
  4. Open "params.json" in an editor of your choice and configure the parameters to your needs
"parameters": {
    "required": {
        "value": {
        //The default location where all your Azure resources will be created
        "location": "South Africa North",
        //This will be the root name of all your Azure resources (you can leave this as is if you want)
        "serviceAlias": "tabbyml",
        //Any additional tags you want assigned to your resources for management/billing reasons
        "tags": {
            "CreatedBy": "Mornante Basson"
        }
        }
    },
    //Storage Account params
    "storageAccountParams": {
        "value": {
        //Postfix to append to the resource name (this is optional and you can leave this as is)
        "name": "",
        //Target SKU name (if you want to run on the lowest possible option for billing purposes you can leave this as is)
        "skuName": "Standard_LRS",
        //The name of the file share that your container app will be mounted to (you can leave this as is if you want)
        "containerAppFileSharesName": "container-app-volume-mounts"
        }
    },
    //Container App Managed Environment params
    "managedEnvironmentParams": {
        "value": {
        //Postfix to append to the resource name (this is optional and you can leave this as is)
        "name": ""
        }
    },
    //Container App Managed Environment Storage params
    "managedEnvironmentStorageParams": {
        "value": {
        //Postfix to append to the resource name (this is optional and you can leave this as is)
        "name": ""
        }
    },
    //Container App params
    "containerAppParams": {
        "value": {
        //Postfix to append to the resource name (this is optional and you can leave this as is)
        "name": "",
        //The unique container app file share mount name that will be used for this container
        "volumeName": "tabbyml-volume-mount",
        //The target TabbyML Docker image to use when deploying TabbyML to your container app
        "registeryImage": "registry.tabbyml.com/tabbyml/tabby:0.24.0-rc.0",
        //Container App compute settings for the container
        //(if you want to run on the lowest possible option for billing purposes you can leave this as is)
        "cpu": "0.25",
        "memory": ".5Gi"
        }
    },
    //Cognitive Services Account params
    "cognitiveServicesAccountParams": {
        "value": {
        //Postfix to append to the resource name (this is optional and you can leave this as is)
        "name": "",
        //The target region you want your Cognitive Services account to be deployed to.
        //   Keep in mind that Azure continues to roll out new model APIs, so make sure the
        //   models you configure below are available in this region.
        "location": "eastus",
        //Target SKU name (if you want to run on the lowest possible option for billing purposes you can leave this as is)
        "skuName": "S0"
        }
    },
    //Cognitive Services Account Model Deployment (chat completion) params
    "chatCompletionDeploymentParams": {
        "value": {
        //Postfix to append to the resource name (this is optional and you can leave this as is)
        "name": "",
        //Target SKU name (if you want to run on the lowest possible option for billing purposes you can leave this as is)
        "skuName": "Standard",
        //Model-specific settings below (if you want to use GPT-4o, feel free to leave this as is for the lowest costs)
        "capacity": 8,
        "currentCapacity": 8,
        "modelName": "gpt-4o",
        "modelFormat": "OpenAI",
        "modelVersion": "2024-08-06"
        }
    },
    //Cognitive Services Account Model Deployment (embedding) params
    "embeddingDeploymentParams": {
        "value": {
        //Postfix to append to the resource name (this is optional and you can leave this as is)
        "name": "",
        //Target SKU name (if you want to run on the lowest possible option for billing purposes you can leave this as is)
        "skuName": "Standard",
        //Model-specific settings below (if you want to use text-embedding-3-large, feel free to leave this as is for the lowest costs)
        "capacity": 350,
        "currentCapacity": 350,
        "modelName": "text-embedding-3-large",
        "modelFormat": "OpenAI",
        "modelVersion": "1"
        }
    }
}
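The deployment commands in the next steps target an existing resource group. If you don't have one yet, it can be created with the Azure CLI. The group name and location below are placeholders, so substitute your own; the location should normally match the `location` value in params.json:

```shell
# Placeholder values -- substitute your own resource group name and region.
RESOURCE_GROUP="tabbyml-rg"
LOCATION="southafricanorth"

# Create the resource group that the bicep deployment will target.
az group create --name "$RESOURCE_GROUP" --location "$LOCATION"
```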
  5. Log in to Azure
az login
  6. Set your desired Azure subscription (optional, if you have multiple subscriptions)
az account set --subscription "YOUR_SUBSCRIPTION_ID"
  7. Preview your changes before deploying
az deployment group what-if --resource-group YOUR_RESOURCE_GROUP_HERE --template-file TabbyML.bicep --parameters params.json
  8. Deploy to Azure 🚀
az deployment group create --resource-group YOUR_RESOURCE_GROUP_HERE --template-file TabbyML.bicep --parameters params.json

You're almost there

Unfortunately, this stage requires some manual intervention.

Because we deployed an Azure Cognitive Services account and model deployments above, we are going to use TabbyML's HTTP endpoints to communicate with these models.

Additionally

Currently, for some reason, when the Container App starts up, the SQLite file that gets created seems to lock and prevents TabbyML from starting up correctly.

So we are going to fix this now.

  1. Go to your deployed Container App and stop it

  2. Go to the created file share mount and delete all the initialized directories and files. This is important: because the HTTP endpoints are not yet configured via a config.toml file, TabbyML will try to initialize with models running locally on the container app and will fail to do so, since this container app does not have CUDA support.
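Both of these steps can also be done from the CLI instead of the portal. This is a sketch with placeholder resource names; `az containerapp stop` requires the Azure CLI `containerapp` extension, and `delete-batch` removes every file in the share, so double-check the share name first:

```shell
# Placeholder names -- substitute the names of your deployed resources.
RESOURCE_GROUP="tabbyml-rg"
CONTAINER_APP="tabbyml-app"
STORAGE_ACCOUNT="tabbymlstorage"
FILE_SHARE="container-app-volume-mounts"   # matches containerAppFileSharesName in params.json

# Stop the container app (requires the 'containerapp' CLI extension).
az containerapp stop --name "$CONTAINER_APP" --resource-group "$RESOURCE_GROUP"

# Delete everything TabbyML initialized in the file share.
az storage file delete-batch \
  --account-name "$STORAGE_ACCOUNT" \
  --source "$FILE_SHARE"
```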

Next we are going to create the config.toml file and fix the SQLite locking issue.

For this step we are going to deploy TabbyML locally using Docker. We do this to create the config.toml file and to copy the generated SQLite files.

To clarify: I have no idea why this SQLite locking issue is persistent on Azure file shares, hence the reason for creating the SQLite database locally and then using it in our container app.

  1. Open File Explorer at %USERPROFILE% and create a new folder named ".tabby". This is where we will create the config.toml file next, and where the SQLite database will appear once it gets created later.

  2. Create a new file called "config.toml" and configure it with your Cognitive Services account model deployments

[model.chat.http]
kind = "azure/chat"
model_name = "" # for example "gpt-4o"
api_endpoint = "https://YOUR_DEPLOYED_AZURE_COGNITIVE_SERVICE_ACCOUNT.openai.azure.com"
api_key = "YOUR_API_KEY"

[model.embedding.http]
kind = "azure/embedding"
model_name = "" # for example "text-embedding-3-large"
api_endpoint = "https://YOUR_DEPLOYED_AZURE_COGNITIVE_SERVICE_ACCOUNT.openai.azure.com"
api_key = "YOUR_API_KEY"

You can retrieve these details (endpoint and API key) from your deployed Azure Cognitive Services account in the Azure portal, under Keys and Endpoint.
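If you prefer the CLI to the portal, the endpoint and key can also be read with `az cognitiveservices` commands. The account name below is a placeholder:

```shell
# Placeholder names -- substitute your deployed Cognitive Services account.
RESOURCE_GROUP="tabbyml-rg"
CS_ACCOUNT="tabbyml-cs"

# Prints the endpoint to use as api_endpoint in config.toml.
az cognitiveservices account show \
  --name "$CS_ACCOUNT" --resource-group "$RESOURCE_GROUP" \
  --query "properties.endpoint" --output tsv

# Prints the key to use as api_key in config.toml.
az cognitiveservices account keys list \
  --name "$CS_ACCOUNT" --resource-group "$RESOURCE_GROUP" \
  --query "key1" --output tsv
```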

  3. Deploy TabbyML locally so that we can get the SQLite files. After you have created the config.toml file above inside the ".tabby" directory, run the following command:
docker run -it -p 8080:8080 -v %USERPROFILE%/.tabby:/data registry.tabbyml.com/tabbyml/tabby:0.24.0-rc.0 serve

Once TabbyML has started successfully and is reachable on the default local port, you can stop the container. In the same .tabby directory you will see a newly created "ee" folder that contains the SQLite files.

  4. Now that we have our config and SQLite files, we can stop our locally running TabbyML: open Docker Desktop, stop the container, and then delete it, since we don't need it anymore.

  5. Upload the created config.toml file and the "ee" directory to our created Azure file share. If you have trouble uploading the "ee" folder as is, just create a new "ee" directory in the share and upload the individual SQLite files into that new directory.
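The upload can also be scripted with `az storage file upload-batch`, which preserves the local directory layout (so the "ee" folder uploads as-is). The names below are placeholders:

```shell
# Placeholder names -- substitute your storage account and file share.
STORAGE_ACCOUNT="tabbymlstorage"
FILE_SHARE="container-app-volume-mounts"

# Upload config.toml and the whole ee/ directory from the local .tabby folder.
# On Windows cmd the source path is %USERPROFILE%\.tabby instead of $HOME/.tabby.
az storage file upload-batch \
  --account-name "$STORAGE_ACCOUNT" \
  --destination "$FILE_SHARE" \
  --source "$HOME/.tabby"
```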

  6. Go back to your deployed Container App and start it

  7. Open the Container App's Application Url. After the Container App has had some time to properly start up, you should see TabbyML running, using your configured config.toml file to communicate with the deployed Azure Cognitive Services account models.
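Once the app is up, you can also confirm it responds from a terminal. This sketch assumes TabbyML exposes a `/v1/health` endpoint; treat that path as an assumption and fall back to simply opening the Application Url in a browser:

```shell
# Placeholder -- substitute your Container App's Application Url.
APP_URL="https://YOUR_CONTAINER_APP.azurecontainerapps.io"

# A successful response indicates TabbyML is up (the /v1/health path is assumed).
curl -fsS "$APP_URL/v1/health"
```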


Contributing

Feel free to open issues or pull requests if you'd like to improve this deployment method!

License

This repository is licensed under MIT, but TabbyML itself may have different licensing terms. Please check their repository for details.

📢 Reminder: This repository is not affiliated with the TabbyML team. It's just a deployment script to help the community use their work on Azure!


Happy Coding! 🚀
