Home
This overview outlines the process for performing an end-to-end, zero-downtime migration. The solution in this repository covers several specific scenarios:
- Metadata Migration - Migrating cluster metadata, such as index settings, aliases, and templates.
- Backfill Migration - Migrating existing or historical data from a source to a target cluster.
- Live Traffic Migration - Transferring ongoing or live traffic between clusters.
- Comparative Tooling - Comparing an existing cluster with a prospective new one.
This guide focuses on the first three scenarios. It walks you through a backfill from a source cluster while live production traffic is captured and replayed against a target cluster.
No single migration strategy fits every case. This guide describes one detailed methodology, built on assumptions that are stated throughout, and it relies on sound engineering practices to make the migration succeed.
In this solution, your source cluster runs Elasticsearch or OpenSearch on EC2 instances or a similar compute environment. A proxy is set up to interact with this source cluster, positioned either in front of the cluster or directly on its coordinating nodes.
The Traffic Capture Proxy is designed for HTTP RESTful traffic and plays a dual role: it forwards traffic to the source cluster, and it also splits off a copy of that traffic to a stream-processing service for later playback.
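The dual role described above can be sketched in a few lines. This is only an illustration, not the proxy's actual implementation: `CaptureProxy`, `CapturedExchange`, and `fake_source` are hypothetical names, a plain Python list stands in for the stream-processing service, and the choice to record after forwarding is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class CapturedExchange:
    """One request/response pair as seen by the proxy (hypothetical shape)."""
    method: str
    path: str
    body: str
    status: int
    response_body: str


@dataclass
class CaptureProxy:
    # Hypothetical callable that sends a request to the source cluster
    # and returns (status, response_body).
    forward: Callable[[str, str, str], Tuple[int, str]]
    # In-memory stand-in for the stream-processing service.
    stream: List[CapturedExchange] = field(default_factory=list)

    def handle(self, method: str, path: str, body: str = "") -> Tuple[int, str]:
        # Role 1: forward the request to the source cluster.
        status, response_body = self.forward(method, path, body)
        # Role 2: append a copy of the exchange to the stream for later playback.
        self.stream.append(
            CapturedExchange(method, path, body, status, response_body)
        )
        # The client receives the source cluster's answer unchanged.
        return status, response_body


def fake_source(method: str, path: str, body: str) -> Tuple[int, str]:
    """Stand-in for the source cluster; always acknowledges."""
    return 200, '{"acknowledged": true}'


proxy = CaptureProxy(forward=fake_source)
status, _ = proxy.handle("PUT", "/my-index/_doc/1", '{"title": "hello"}')
```

The client only ever sees the source cluster's response, so the capture step stays invisible to existing applications.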
Acting as a traffic simulation tool, the Traffic Replayer replays recorded request traffic to a target cluster, mirroring source traffic patterns. It links original requests and their responses to those directed at the target cluster, facilitating comparative analysis.
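As a rough sketch of how replayed requests might be linked back to the original exchanges for comparison, consider the following. All names (`SourceExchange`, `ReplayResult`, `replay`, `fake_target`) are hypothetical, and comparing only HTTP status codes is a simplifying assumption; a real comparison would likely look at more than the status.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class SourceExchange:
    """A captured request plus the status the source cluster returned."""
    method: str
    path: str
    body: str
    source_status: int


@dataclass(frozen=True)
class ReplayResult:
    """Links the original exchange to the target cluster's response."""
    exchange: SourceExchange
    target_status: int

    @property
    def status_matches(self) -> bool:
        return self.exchange.source_status == self.target_status


def replay(
    captured: Iterable[SourceExchange],
    send_to_target: Callable[[str, str, str], int],
) -> List[ReplayResult]:
    """Replay captured requests in their original order against the target."""
    results = []
    for exchange in captured:
        target_status = send_to_target(exchange.method, exchange.path, exchange.body)
        results.append(ReplayResult(exchange, target_status))
    return results


captured = [
    SourceExchange("PUT", "/books/_doc/1", '{"title": "a"}', 201),
    SourceExchange("GET", "/books/_search", "", 200),
]


def fake_target(method: str, path: str, body: str) -> int:
    # Stand-in target that rejects searches, to demonstrate a mismatch.
    return 201 if method == "PUT" else 400


results = replay(captured, fake_target)
mismatches = [r.exchange.path for r in results if not r.status_matches]
```

Keeping the source exchange and the target response in one record is what makes a side-by-side analysis possible afterwards.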
Reindex-from-snapshot runs on Amazon Elastic Container Service (ECS) workers that coordinate the migration of documents from an existing snapshot, reindexing them in parallel into a target cluster.
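The idea of several workers reindexing a snapshot in parallel can be illustrated with a small sketch. It makes several assumptions that do not come from this page: the work unit is one shard of one index, threads stand in for ECS workers, and plain dictionaries stand in for the snapshot and the target cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock
from typing import Dict, List, Tuple

WorkUnit = Tuple[str, int]  # (index name, shard number); an assumed granularity

# Hypothetical snapshot contents, keyed by work unit.
snapshot: Dict[WorkUnit, List[dict]] = {
    ("books", 0): [{"_id": "1", "title": "a"}, {"_id": "2", "title": "b"}],
    ("books", 1): [{"_id": "3", "title": "c"}],
    ("logs", 0): [{"_id": "9", "msg": "x"}],
}

# Stand-in for the target cluster: index name -> {document id -> document}.
target: Dict[str, Dict[str, dict]] = {}
target_lock = Lock()


def reindex_work_unit(unit: WorkUnit) -> int:
    """Read one work unit from the snapshot and write its documents to the target."""
    index, _shard = unit
    docs = snapshot[unit]
    with target_lock:  # guard the shared in-memory target
        bucket = target.setdefault(index, {})
        for doc in docs:
            bucket[doc["_id"]] = doc
    return len(docs)


# Each thread plays the part of one worker processing independent work units.
with ThreadPoolExecutor(max_workers=3) as pool:
    migrated = sum(pool.map(reindex_work_unit, snapshot))
```

Because each work unit is independent, adding workers shortens the backfill without changing the result.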
The migration console provides a migration-specific CLI and a variety of tools to streamline the migration process.
This architecture is built on AWS cloud infrastructure, but most tools are designed to be cloud-independent. A local containerized version of this solution is also available.
The design deployed in AWS is as follows:

- Traffic is directed to the existing cluster.
- Traffic is redirected to an Application Load Balancer (ALB) in front of Traffic Capture Proxies, which forward live traffic to the source cluster and replicate it to Amazon Managed Streaming for Apache Kafka (MSK) for playback.
- With continuous traffic capture in place, a backfill is initiated with a reindex-from-snapshot.
- Once the backfill is complete, the user replays the captured traffic with the Traffic Replayer.
- The user evaluates the results of sending the same traffic to both the original and the new cluster.
- After confirming the target cluster’s functionality meets expectations, the user dismantles all related stacks, retaining only the new cluster’s setup. Additionally, the user may retire and discard the old cluster’s legacy infrastructure.
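The ordering of the steps above matters: for example, the backfill starts only once traffic capture is in place. A minimal sketch of that ordering follows, using hypothetical phase names and a hypothetical `advance` helper that are not part of the tooling.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Hypothetical names for the steps listed above, in order."""
    DIRECT_TO_SOURCE = 1
    CAPTURE = 2
    BACKFILL = 3
    REPLAY = 4
    VALIDATE = 5
    TEARDOWN = 6


def advance(current: Phase, requested: Phase) -> Phase:
    """Allow a move only to the phase immediately after the current one."""
    if requested != current + 1:
        raise ValueError(f"cannot move from {current.name} to {requested.name}")
    return requested


# Walk through every phase in order.
phase = Phase.DIRECT_TO_SOURCE
for next_phase in list(Phase)[1:]:
    phase = advance(phase, next_phase)
```

Skipping a step, such as calling `advance(Phase.CAPTURE, Phase.REPLAY)`, raises a `ValueError` in this sketch.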
Encountering a compatibility issue or missing feature?
- Search existing issues to see if it’s already reported. If it is, feel free to upvote and comment.
- Can’t find it? Create a new issue to let us know.