A demo of a basic bedrock response streaming setup with lambda, Api Gateway websocket, react, and bedrock through aws


Bedrock Streaming Response

Table of Contents

  • What it does
  • How we can use it at the CIC
  • Use Cases
  • Demo Deployment Instructions

What it does

The response streaming setup for Bedrock uses the same pipeline as the standard Bedrock invocation process, with one major difference: rather than returning the full text in a single response, it returns each token as it is generated. This allows for quicker perceived response times and more interesting frontend UIs.

This demo uses a React-hosted frontend built on the Chatbot Template (https://github.com/ASUCICREPO/ChatbotTemplate), which invokes the backend through an API Gateway WebSocket rather than a standard REST API Gateway. The request is routed to a "connection opener" Lambda function, whose purpose is to handle the initial opening of the WebSocket connection and to invoke the main function. (TODO: I believe this can be done with Step Functions to simplify to one Lambda.) The main Lambda function then calls Bedrock through the Converse Stream API and, for each token returned, sends it back through the API Gateway to the frontend.
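The main Lambda's loop described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: it assumes a Python Lambda using boto3's Converse Stream API, a hypothetical `{"prompt": ...}` message payload, and an illustrative model ID.

```python
import json

def text_from_event(event):
    """Pull the text fragment out of a Converse Stream event, if any."""
    delta = event.get("contentBlockDelta", {}).get("delta", {})
    return delta.get("text")

def handler(event, context):
    # boto3 is imported lazily so the helper above can be exercised offline.
    import boto3

    ctx = event["requestContext"]
    prompt = json.loads(event["body"])["prompt"]  # hypothetical payload shape

    # The @connections management endpoint of the same WebSocket API.
    gateway = boto3.client(
        "apigatewaymanagementapi",
        endpoint_url=f"https://{ctx['domainName']}/{ctx['stage']}",
    )
    bedrock = boto3.client("bedrock-runtime")

    response = bedrock.converse_stream(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # illustrative model
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    for stream_event in response["stream"]:
        token = text_from_event(stream_event)
        if token:
            # Push each token back over the WebSocket as soon as it arrives.
            gateway.post_to_connection(
                ConnectionId=ctx["connectionId"], Data=token.encode()
            )
    return {"statusCode": 200}
```

The key design point is that nothing is buffered: each `contentBlockDelta` event is forwarded the moment it arrives, which is what makes the time to first token so short.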

(Architecture diagram: Bedrock Response Streaming)

How we can use it at the CIC

The main way that we can use it at the CIC is to improve the Time to First Token. For most LLM pipelines the process looks like:

(Diagram: request pipeline comparison)

Request->Processing->Generation->Response in Full

By streaming the response we remove the Generation step from the wait entirely, since the first response is received as soon as the first token is generated.

Request->Processing->Response of First Token
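To put rough numbers on the comparison above, here is a simple latency model. The figures are purely illustrative assumptions (300 ms of processing, 30 ms per generated token, a 400-token answer), not measurements from this demo.

```python
def time_to_first_token(processing_ms, per_token_ms, response_tokens, streaming):
    """Time until the user sees anything, under a simple latency model."""
    if streaming:
        # The first token is forwarded as soon as it is generated.
        return processing_ms + per_token_ms
    # Without streaming, the full response must be generated first.
    return processing_ms + per_token_ms * response_tokens

# Illustrative figures only: 300 ms processing, 30 ms/token, 400-token answer.
blocking = time_to_first_token(300, 30, 400, streaming=False)  # 12300 ms
streamed = time_to_first_token(300, 30, 400, streaming=True)   # 330 ms
```

Note that the gap grows linearly with response length, which is why the Use Cases below emphasize projects with long responses.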

Use Cases

This is especially useful for projects that require more complex thought (longer responses). The longer the average response length, the more time implementing streaming can save. That saved time can be spent on additional pre-processing (RAG, more complex thinking time, etc.) or on a slower but more intelligent model, such as moving from Haiku to Sonnet or Opus. The other major use for streaming is to create more interesting frontend UIs: since you receive each token individually, you can create a cursor effect that shows the output as it's generated, boosting the appeal of the design.

In summary, it is best used for projects with complex thinking processes and long responses, to decrease the time to first token, and for better-looking UIs.

Demo Deployment Instructions

Prerequisites

Before beginning development, ensure you have the following tools installed on your system:

  1. Git

     git --version

  2. Python and pip

     python --version
     pip --version

  3. Node.js and npm

     node --version
     npm --version

  4. AWS CLI

     aws --version

  5. AWS CDK

     cdk --version

Repository Setup

Initial Setup

1. First, clone the required repositories. Open your terminal and run the following commands:

# Clone the ChatbotTemplate repository
git clone https://github.com/ASUCICREPO/ChatbotTemplate

# Clone the BedrockResponseStreaming repository
git clone https://github.com/ASUCICREPO/BedrockResponseStreaming

These repositories contain the necessary components:

  • ChatbotTemplate: Contains the React frontend template
  • BedrockResponseStreaming: Contains the backend infrastructure and streaming implementation

Backend Setup

AWS Configuration

  • Ensure you have an active AWS account
  • Configure AWS credentials either through AWS CLI or by exporting tokens
  • Verify your AWS configuration is working correctly (for example, run aws sts get-caller-identity)

Backend Environment Setup

# Navigate to the BedrockResponseStreaming directory
cd BedrockResponseStreaming

# Create and activate Python virtual environment
python -m venv .venv
source .venv/bin/activate  # For Windows use: .venv\Scripts\activate

# Install required dependencies
pip install -r requirements.txt

AWS CDK Deployment

# Bootstrap AWS CDK (only needed once per AWS account/region)
cdk bootstrap

# Deploy the infrastructure
cdk deploy

Important: After the deployment completes, save the WebSocket URL that is output in the terminal. This URL will be needed for the frontend configuration.

Note: All these steps should be performed within the BedrockResponseStreaming repository before proceeding to the frontend setup below.

Frontend Setup

Configure Frontend Settings

# Navigate to the constants.js file
cd ChatbotTemplate/frontend/src/utilities

In constants.js:

  • Set WEBSOCKET_API to the WebSocket URL obtained from the backend deployment. Note: you also have to append /dev/ to the end of the URL for it to work
  • Configure the optional features:
    // Disable optional features
    export const ALLOW_FILE_UPLOAD = false;        // File upload feature
    export const ALLOW_VOICE_RECOGNITION = false;  // Voice recognition feature
    export const ALLOW_MULTLINGUAL_TOGGLE = false; // Multilingual support
    export const ALLOW_LANDING_PAGE = false;       // Landing page
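The note about appending /dev/ trips people up, so here is the rule spelled out as a tiny helper. This is a hypothetical illustration in Python (the frontend itself is JavaScript), and `stage="dev"` simply matches this demo's default API Gateway stage name.

```python
def websocket_endpoint(base_url, stage="dev"):
    """Append the API Gateway stage segment to the WebSocket URL.

    `stage="dev"` matches this demo's default stage; adjust if yours differs.
    """
    if not base_url.startswith("wss://"):
        raise ValueError("expected a wss:// URL from the cdk deploy output")
    return base_url.rstrip("/") + f"/{stage}/"

# e.g. WEBSOCKET_API should end up looking like:
# websocket_endpoint("wss://abc123.execute-api.us-east-1.amazonaws.com")
# -> "wss://abc123.execute-api.us-east-1.amazonaws.com/dev/"
```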

Build the Frontend

# Return to the frontend directory
cd ../../

# Install dependencies and build
npm install
npm run build

Prepare for Deployment

  • Navigate to the build folder
  • Create a ZIP file containing the contents of the build folder

Important: Zip the contents of the build folder, not the folder itself
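If you prefer to script this step, the contents-not-the-folder distinction maps directly onto `shutil.make_archive`'s `root_dir` parameter. A small sketch (the throwaway temp directory below stands in for your real build folder):

```python
import os
import shutil
import tempfile
import zipfile

def zip_build_contents(build_dir, archive_base):
    """Zip the *contents* of build_dir, not the folder itself.

    root_dir=build_dir makes archive paths relative to the build folder,
    so index.html sits at the top level of the ZIP, as Amplify expects.
    """
    return shutil.make_archive(archive_base, "zip", root_dir=build_dir)

# Demonstration against a throwaway directory standing in for build/:
build = tempfile.mkdtemp()
open(os.path.join(build, "index.html"), "w").close()
archive = zip_build_contents(build, os.path.join(tempfile.mkdtemp(), "site"))
names = zipfile.ZipFile(archive).namelist()  # ["index.html"], no build/ prefix
```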

Deploy to AWS Amplify

  • Open the AWS Console and navigate to AWS Amplify
  • Click "Create app"
  • Select "Deploy without Git"
  • Upload the ZIP file created in the previous step
  • Follow the Amplify deployment wizard to complete the setup

After deployment completes, Amplify will provide a URL where your application is hosted.
