Modern applications run inside containers, which encapsulate software and its dependencies into a portable, consistent execution environment. Unlike virtual machines, containers share the host OS kernel, making them lightweight and efficient.
Managing hundreds or thousands of containers manually is impractical. Kubernetes automates container deployment, scaling, and orchestration across clusters of machines. The fundamental unit in Kubernetes is a Pod, which groups containers that work together.
However, Kubernetes does not support live Pod migration, that is, relocating a running Pod from one node to another without downtime. This is a major limitation for stateful workloads such as AI inference, databases, and real-time analytics.
Currently, when a Pod must be moved due to node failures, autoscaling, or resource rebalancing, Kubernetes follows a terminate-and-recreate model:
- The existing Pod is stopped, losing all in-memory data and active network connections.
- A new Pod is created on another node, requiring applications to restart from scratch.
For stateful applications, this results in downtime, performance degradation, and potential data loss. Live migration would:
✔ Preserve application state, network connections, and execution progress.
✔ Enable smoother autoscaling and resource optimization.
✔ Improve fault tolerance without service interruption.
- When a node fails, Kubernetes reschedules Pods on a new node.
- However, in-memory state and connections are lost.
- Kubelet requests checkpointing through the CRI-O Daemon.
- runc and CRIU capture the container state (memory, CPU, open connections).
- A checkpoint archive is created, storing process snapshots for later analysis.
- Designed for debugging and forensic analysis, but lacks live recovery support.
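To make the first step of this flow concrete, the sketch below builds the URL a client would send a `POST` request to in order to ask the kubelet for a container checkpoint. The path shape `/checkpoint/{namespace}/{pod}/{container}` and the default kubelet port 10250 follow the upstream kubelet checkpoint API documentation; the node, Pod, and container names are hypothetical, and whether the endpoint is enabled depends on the cluster version, feature gates, and container runtime.

```python
from urllib.parse import quote

KUBELET_PORT = 10250  # default kubelet HTTPS port


def checkpoint_url(node: str, namespace: str, pod: str, container: str,
                   port: int = KUBELET_PORT) -> str:
    """Build the kubelet checkpoint endpoint URL (the request itself must be a POST)."""
    # Percent-encode each path segment so unusual names cannot break the path.
    path = "/".join(quote(part, safe="") for part in (namespace, pod, container))
    return f"https://{node}:{port}/checkpoint/{path}"


# Hypothetical node, Pod, and container names, for illustration only.
print(checkpoint_url("worker-1", "default", "inference", "model-server"))
# prints: https://worker-1:10250/checkpoint/default/inference/model-server
```

The helper only constructs the URL; an actual request also needs credentials that the kubelet accepts, which are left out of this sketch.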
- Existing tools like MyceDrive (DMTCP) attempt live migration but require additional infrastructure.
- Performance suffers in low-bandwidth environments.
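A rough back-of-the-envelope estimate shows why bandwidth matters: the time to move a checkpoint is at least its size divided by the link speed. The sketch below assumes a hypothetical 2 GiB checkpoint archive and ignores compression, protocol overhead, and any time spent dumping or restoring state, so the real figures would be higher.

```python
def transfer_seconds(checkpoint_bytes: int, bandwidth_bits_per_s: float) -> float:
    """Lower bound on transfer time: size in bits divided by link bandwidth."""
    if bandwidth_bits_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return checkpoint_bytes * 8 / bandwidth_bits_per_s


GIB = 1024 ** 3
size = 2 * GIB  # hypothetical 2 GiB checkpoint archive

for label, bps in [("100 Mbit/s", 100e6), ("10 Gbit/s", 10e9)]:
    print(f"{label}: {transfer_seconds(size, bps):.1f} s")
# prints:
# 100 Mbit/s: 171.8 s
# 10 Gbit/s: 1.7 s
```

Under these assumptions, the same checkpoint that moves in under two seconds on a fast datacenter link stalls the workload for almost three minutes on a 100 Mbit/s link.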
No Kubernetes-native solution for live pod migration exists today.
Our research aims to close this gap.
We analyzed Kubernetes Enhancement Proposals (KEPs) to understand how new features integrate with the ecosystem.
- CRI-O (an implementation of the Container Runtime Interface, CRI): Manages container execution.
- CRIU (Checkpoint/Restore in Userspace): Captures and restores process states.
- CNI (Container Network Interface): Configures Pod networking, which must be preserved across migrations.
- Kubelet (Node Agent): Handles Pod lifecycle on nodes.
We built a proof-of-concept live migration system that:
- Uses gRPC and CRI APIs to capture and transfer running Pods.
- Preserves process state, memory, and metadata across nodes.
- Supports multi-container Pods without intrusive modifications in Kubernetes.
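The checkpoint, transfer, and restore sequence for a multi-container Pod can be pictured with a toy in-memory model. This is not the PoC's implementation and makes no gRPC or CRI calls; the `Container`, `Pod`, `Node`, and `migrate_pod` names are hypothetical, and a plain dictionary stands in for process memory.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Container:
    name: str
    memory: dict = field(default_factory=dict)  # stands in for process state


@dataclass
class Pod:
    name: str
    containers: list


@dataclass
class Node:
    name: str
    pods: dict = field(default_factory=dict)


def migrate_pod(src: Node, dst: Node, pod_name: str) -> Pod:
    """Move a Pod from src to dst while keeping every container's state."""
    pod = src.pods[pod_name]
    # 1. Checkpoint: snapshot every container in the Pod.
    snapshot = [copy.deepcopy(c) for c in pod.containers]
    # 2. Transfer and restore: rebuild the Pod on the destination node.
    restored = Pod(name=pod.name, containers=snapshot)
    dst.pods[pod_name] = restored
    # 3. Remove the source copy only after the restore has succeeded.
    del src.pods[pod_name]
    return restored


node_a, node_b = Node("node-a"), Node("node-b")
node_a.pods["web"] = Pod("web", [
    Container("app", {"requests_served": 42}),
    Container("sidecar", {"offset": 7}),
])
migrate_pod(node_a, node_b, "web")
print(node_b.pods["web"].containers[0].memory)  # prints: {'requests_served': 42}
```

Deleting the source Pod last mirrors the ordering a real system would likely want: if the restore fails, the original Pod is still running.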
The design document for one of the PoCs is here: https://docs.google.com/document/d/1n4tEj2LaNzL7lkq6jqTy4O-3v2dTfrn0lIdWDaeifnM/edit?tab=t.0#heading=h.nxlv0abv8hql
We are currently working on:
- Optimizing migration performance in low-bandwidth environments.
- Ensuring compatibility with different Kubernetes versions and workloads.
- Integrating with native Kubernetes APIs for broader adoption.
Join the discussion and contribute to the project! 🚀