DOCS-1186 AI Accelerator updates for 2.1.1 #6445

```
__OUTPUT__
List of installed extensions
Name | Version | Schema | Description
------------------+---------+------------+------------------------------------------------------------
aidb | 2.1.1 | aidb | aidb: makes it easy to build AI applications with postgres
pgfs | 1.0.6 | pgfs | pgfs: enables access to filesystem-like storage locations
vector | 0.8.0 | public | vector data type and ivfflat and hnsw access methods
```

Pipelines is delivered as a set of extensions. Depending on how you are deploying…
- [Manually installing pipelines packages](packages)

Once the packages are installed, you can [complete the installation](complete) by activating the extensions within Postgres.

`advocacy_docs/edb-postgres-ai/ai-accelerator/limitations.mdx`

The impact of this depends on what type of embedding is being performed.
### Data Formats

* Pipelines currently only supports Text and Image formats. Other formats, including structured data, video, and audio, are not currently supported.

### Upgrading

The aidb and pgfs extensions do not currently support in-place Postgres extension upgrades. You must therefore drop and recreate the extensions when upgrading to a new version.

```sql
DROP EXTENSION aidb CASCADE;
DROP EXTENSION pgfs CASCADE;
CREATE EXTENSION aidb CASCADE;
CREATE EXTENSION pgfs CASCADE;
```
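After recreating the extensions, you can confirm which versions are installed by querying the standard `pg_extension` catalog (plain Postgres, not specific to Pipelines):

```sql
-- Verify the installed extension versions after recreating them
SELECT extname, extversion
FROM pg_extension
WHERE extname IN ('aidb', 'pgfs');
```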
---
navTitle: "OpenAI Compatible Models"
description: "Using an OpenAI compatible API with Pipelines by setting options and credentials."
---

To make use of an OpenAI-compliant API, you can use the embeddings or completions model providers. Note that a retriever needs to encode data first, so only the embeddings model provider can be used with a retriever.

## Why use an OpenAI compatible API?

The starting point for this process is creating a model. When you create a model:
```sql
select aidb.create_model(
'my_local_ollama',
'embeddings',
'{"model":"llama3.3", "url":"http://llama.local:11434/v1/embeddings", "dimensions":8192}'::JSONB,
'{"api_key":""}'::JSONB);
```

Note: pgvector does not support indexing vectors with more than 2000 dimensions, so the 8192-dimension value shown in this example may need to be reduced in practice.

The model name, the first parameter, is set to “my_local_ollama”; we will use this name later.

We specify the model provider as “embeddings”, which defaults to using OpenAI servers but can be overridden by the configuration (the next parameter) to talk to any compliant server.

### Configuration

That completes the configuration parameter.

### Credentials

The last parameter is the credentials parameter, which is another JSON string. It’s usually used for carrying the `api_key` for the OpenAI service and any other necessary credential information. It is not part of the configuration and by being separate, it can be securely hidden from users with lesser permissions. For our ollama connection, we don’t need an `api_key`, but the model provider currently requires that one is specified. We can specify an empty string for the `api_key` to satisfy this requirement.

## Using the model

```sql
select aidb.decode_text_batch('my_bert_model', ARRAY[
'summarize: The missile knows where it is at all times. It knows this because it knows where it isn''t. By subtracting where it is from where it isn''t, or where it isn''t from where it is (whichever is greater), it obtains a difference, or deviation. The guidance subsystem uses deviations to generate corrective commands to drive the missile from a position where it is to a position where it isn''t, and arriving at a position where it wasn''t, it now is.'
]);
```

## Rerank Text

Call `aidb.rerank_text` to get text reranking logits.

```sql
SELECT aidb.rerank_text('my_reranking_model',
'What is the best open source database?',
ARRAY[
'PostgreSQL',
'The quick brown fox jumps over the lazy dog.',
'Hercule Poirot'
]);
```
---
title: "Completions"
navTitle: "Completions"
description: "Completions is a text completion model that enables use of any OpenAI API compatible text generation model."
---

Model name: `completions`

Model aliases:

* `openai_completions`
* `nim_completions`

## About Completions

Completions enables use of any OpenAI API compatible text generation model. It is suitable for chat/text transforms, text completion, and other text generation tasks.

Depending on the name of the model, the model provider will set defaults accordingly.

When invoked as `completions` or `openai_completions`, the model provider will default to using the OpenAI API.

When invoked as `nim_completions`, the model provider will default to using the NVIDIA NIM API.


## Supported aidb operations

* decode_text
* decode_text_batch

## Supported models

* Any text generation model that is supported by the provider.

## Supported OpenAI models

See a list of supported OpenAI models [here](https://platform.openai.com/docs/models#models-overview).

## Supported NIM models

* [ibm/granite-guardian-3.0-8b](https://build.nvidia.com/ibm/granite-guardian-3_0-8b)
* [ibm/granite-3.0-8b-instruct](https://build.nvidia.com/ibm/granite-3_0-8b-instruct)
* [ibm/granite-3.0-3b-a800m-instruct](https://build.nvidia.com/ibm/granite-3_0-3b-a800m-instruct)
* [meta/llama-3.3-70b-instruct](https://build.nvidia.com/meta/llama-3_3-70b-instruct)
* [meta/llama-3.2-3b-instruct](https://build.nvidia.com/meta/llama-3.2-3b-instruct)
* [meta/llama-3.2-1b-instruct](https://build.nvidia.com/meta/llama-3.2-1b-instruct)
* [meta/llama-3.1-405b-instruct](https://build.nvidia.com/meta/llama-3_1-405b-instruct)
* [meta/llama-3.1-70b-instruct](https://build.nvidia.com/meta/llama-3_1-70b-instruct)
* [meta/llama-3.1-8b-instruct](https://build.nvidia.com/meta/llama-3_1-8b-instruct)
* [meta/llama3-70b-instruct](https://build.nvidia.com/meta/llama3-70b)
* [meta/llama3-8b-instruct](https://build.nvidia.com/meta/llama3-8b)
* [nvidia/llama-3.1-nemotron-70b-instruct](https://build.nvidia.com/nvidia/llama-3_1-nemotron-70b-instruct)
* [nvidia/llama-3.1-nemotron-51b-instruct](https://build.nvidia.com/nvidia/llama-3_1-nemotron-51b-instruct)
* [nvidia/nemotron-mini-4b-instruct](https://build.nvidia.com/nvidia/nemotron-mini-4b-instruct)
* [nvidia/nemotron-4-340b-instruct](https://build.nvidia.com/nvidia/nemotron-4-340b-instruct)
* [google/shieldgemma-9b](https://build.nvidia.com/google/shieldgemma-9b)
* [google/gemma-7b](https://build.nvidia.com/google/gemma-7b)
* [google/codegemma-7b](https://build.nvidia.com/google/codegemma-7b)

## Creating a model

There is no default model for Completions. You can create any supported model using the `aidb.create_model` function.

## Creating an OpenAI model

You can create any supported OpenAI model using the `aidb.create_model` function.

In this example, we are creating a GPT-4o model with the name `my_openai_model`:

```sql
SELECT aidb.create_model(
'my_openai_model',
'openai_completions',
'{"model": "gpt-4o"}'::JSONB,
'{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB
);
```
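Once created, the model can be invoked through the supported operations listed above. A minimal sketch, assuming the single-text form `aidb.decode_text` mirrors the `aidb.decode_text_batch` call shown elsewhere in these docs (the prompt is illustrative):

```sql
-- Generate a completion from the model created above
SELECT aidb.decode_text('my_openai_model',
    'Summarize the benefits of connection pooling in one sentence.');
```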

## Creating a NIM model

```sql
SELECT aidb.create_model(
'my_nim_completions',
'nim_completions',
'{"model": "meta/llama-3.2-1b-instruct"}'::JSONB,
credentials=>'{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB);
```
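As with the OpenAI variant, the NIM model can then be used with the decode operations listed above; a hedged sketch using `aidb.decode_text_batch` (the same call demonstrated for the BERT model, with illustrative prompts):

```sql
-- Run several completion prompts in one batch call
SELECT aidb.decode_text_batch('my_nim_completions', ARRAY[
    'Write a one-line description of PostgreSQL.',
    'Translate "good morning" into French.'
]);
```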

## Model configuration settings

The following configuration settings are available for OpenAI models:

* `model` - The model to use.
* `url` - The URL of the model to use. This is optional and can be used to specify a custom model URL.
* If the model provider is `completions` or `openai_completions`, `url` defaults to `https://api.openai.com/v1/chat/completions`.
* If the model provider is `nim_completions`, `url` defaults to `https://integrate.api.nvidia.com/v1/chat/completions`.
* `max_concurrent_requests` - The maximum number of concurrent requests to make to the OpenAI model. Defaults to `25`.

## Model credentials

The following credentials are required for these models:

* `api_key` - The API key to use for authentication.
---
title: "Embeddings"
navTitle: "Embeddings"
description: "Embeddings is a text embedding model that enables use of any OpenAI API compatible text embedding model."
---

Model name: `embeddings`

Model aliases:

* `openai_embeddings`
* `nim_embeddings`

## About Embeddings

Embeddings is a text embedding model that enables use of any OpenAI API compatible text embedding model. It is suitable for text classification, clustering, and other text embedding tasks.

Depending on the name of the model, the model provider will set defaults accordingly.

When invoked as `embeddings` or `openai_embeddings`, the model provider will default to using the OpenAI API.

When invoked as `nim_embeddings`, the model provider will default to using the NVIDIA NIM API.

## Supported aidb operations


## Supported models

* Any text embedding model that is supported by the provider.

### Supported OpenAI models

* Any text embedding model that is supported by OpenAI. This includes `text-embedding-3-small`, `text-embedding-3-large`, and `text-embedding-ada-002`. See a list of supported OpenAI models [here](https://platform.openai.com/docs/guides/embeddings#embedding-models).
* Defaults to `text-embedding-3-small`.

### Supported NIM models

* [nvidia/nv-embedqa-e5-v5](https://build.nvidia.com/nvidia/nv-embedqa-e5-v5) (default)

## Creating the default model with OpenAI

```sql
SELECT aidb.create_model('my_openai_embeddings',
    ...
);
```

Because we are passing the configuration options and the credentials, unlike the…
The following configuration settings are available for OpenAI models:

* `model` - The OpenAI model to use.
* `url` - The URL of the model to use. This is optional and can be used to specify a custom model URL.
* If the model provider is `embeddings` or `openai_embeddings`, `url` defaults to `https://api.openai.com/v1/embeddings`.
* If the model provider is `nim_embeddings`, `url` defaults to `https://integrate.api.nvidia.com/v1/embeddings`.
* `max_concurrent_requests` - The maximum number of concurrent requests to make to the model. Defaults to `25`.


## Model credentials

The following credentials are required for OpenAI models:
* `api_key` - The API key to use for authentication.

This section provides details of the supported models in EDB Postgres AI - AI Accelerator - Pipelines and their capabilities.

* [T5](t5)
* [Embeddings](embeddings), including `openai_embeddings` and `nim_embeddings`
* [Completions](completions), including `openai_completions` and `nim_completions`
* [BERT](bert)
* [CLIP](clip)
* [NIM_CLIP](nim_clip)
* [NIM_RERANKING](nim_reranking)

---
title: "CLIP"
navTitle: "CLIP"
description: "CLIP (Contrastive Language-Image Pre-training) is a model that learns visual concepts from natural language supervision."
---

Model name: `nim_clip`

## About CLIP

CLIP (Contrastive Language-Image Pre-training) is a model that learns visual concepts from natural language supervision. It is a zero-shot learning model that can be used for a wide range of vision and language tasks.

This specific model runs on NVIDIA NIM. More information about CLIP on NIM can be found [here](https://build.nvidia.com/nvidia/nvclip).


## Supported aidb operations

* encode_text
* encode_text_batch
* encode_image
* encode_image_batch

## Supported models

### NVIDIA NGC

* nvidia/nvclip (default)


## Creating the default model

```sql
SELECT aidb.create_model(
'my_nim_clip_model',
'nim_clip',
credentials=>'{"api_key": "<API_KEY_HERE>"}'::JSONB
);
```

There is only one model, the default `nvidia/nvclip`, so we do not need to specify the model in the configuration.
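The model can then be used with the encode operations listed above; a minimal sketch using `aidb.encode_text` (the caption text is illustrative):

```sql
-- Embed a text description with the CLIP model created above
SELECT aidb.encode_text('my_nim_clip_model',
    'a photo of a cat sitting on a laptop');
```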

## Model configuration settings

The following configuration settings are available for CLIP models:

* `model` - The NIM model to use. The default is `nvidia/nvclip` and is the only model available.
* `url` - The URL of the model to use. This is optional and can be used to specify a custom model URL. Defaults to `https://integrate.api.nvidia.com/v1/embeddings`.
* `dimensions` - Model output vector size. Defaults to `1024`.

## Model credentials

The following credentials are required if executing inside NVIDIA NGC:

* `api_key` - The NVIDIA Cloud API key to use for authentication.
---
title: "Reranking (NIM)"
navTitle: "Reranking"
description: "Reranking is a method in text search that sorts results by relevance to make them more accurate."
---

Model name: `nim_reranking`

## About Reranking

Reranking is a method in text search that sorts results by relevance to make them more accurate. It gives scores to documents using cross-attention mechanisms, improving the initial search results.

## Supported aidb operations

* rerank_text

## Supported models

### NVIDIA NGC

* nvidia/llama-3.2-nv-rerankqa-1b-v2 (default)



## Creating the default model

```sql
SELECT aidb.create_model(
'my_nim_reranker',
'nim_reranking',
credentials=>'{"api_key": "<API_KEY_HERE>"}'::JSONB
);
```

There is only one model, the default `nvidia/llama-3.2-nv-rerankqa-1b-v2`, so we do not need to specify the model in the configuration.
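With the model created, it can be called through `aidb.rerank_text`, the operation listed above (the query and candidate strings are illustrative):

```sql
-- Score candidate documents against a query with the reranker
SELECT aidb.rerank_text('my_nim_reranker',
    'What is the best open source database?',
    ARRAY['PostgreSQL', 'MariaDB', 'A bowl of petunias']);
```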

## Model configuration settings

The following configuration settings are available for reranking models:

* `model` - The NIM model to use. The default is `nvidia/llama-3.2-nv-rerankqa-1b-v2` and is the only model available.
* `url` - The URL of the model to use. This is optional and can be used to specify a custom model URL. Defaults to `https://ai.api.nvidia.com/v1/retrieval`.

## Model credentials

The following credentials are required if executing inside NVIDIA NGC:

* `api_key` - The NVIDIA Cloud API key to use for authentication.