In cloud-based deployments, Kubernetes clusters are often spread across multiple availability zones for redundancy and scalability. However, by default, Kubernetes services distribute traffic randomly between pods, which can lead to inefficiencies. Traffic might travel between zones unnecessarily, increasing latency and potentially incurring extra costs. In this post, let’s look at Kubernetes traffic shaping with Topology Aware Routing.
What is Topology-Aware Routing?
Topology-aware routing (TAR) is a Kubernetes feature that optimizes network traffic flow by considering the physical or logical layout of the cluster’s infrastructure. With TAR enabled, Kubernetes can make informed routing decisions to prioritize keeping traffic within the zone it originated from. This reduces latency, improves performance, and potentially minimizes network egress costs.
How Does TAR Work?
Here’s a breakdown of the process:
- Pod Communication: Pods within your application communicate with each other through services. These services act as load balancers, directing traffic to the appropriate pods based on selectors and labels.
- Endpoint Slices: When a service is created, Kubernetes creates EndpointSlices, which hold the service’s endpoint information in manageable chunks. Each endpoint entry contains details about a pod that belongs to the service, including the zone of the node it runs on.
- Topology Hints: The EndpointSlice controller turns the zone information into “topology hints.” These hints are attached to each endpoint in the EndpointSlices, indicating which zone(s) that endpoint should serve traffic for.
- Kube-proxy: Kube-proxy, a network proxy running on each node, utilizes these topology hints when selecting endpoints for service traffic. It prioritizes selecting pods within the same zone as the requesting pod whenever possible.
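The steps above can be sketched in a few lines of Python. This is only an illustrative model of the filtering behaviour described in the Kubernetes documentation, not kube-proxy’s actual code; the `Endpoint` class and `select_endpoints` function are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    ip: str
    zone: str  # zone of the node the pod runs on
    for_zones: list = field(default_factory=list)  # models hints.forZones


def select_endpoints(endpoints, node_zone):
    """Return the endpoints a proxy running in node_zone would consider."""
    # If any endpoint has no hint, ignore hints and use every endpoint.
    if not endpoints or any(not ep.for_zones for ep in endpoints):
        return list(endpoints)
    # Keep only endpoints hinted for this node's zone.
    local = [ep for ep in endpoints if node_zone in ep.for_zones]
    # If no endpoint is hinted for this zone, fall back to all endpoints.
    return local or list(endpoints)


endpoints = [
    Endpoint("10.0.1.5", "us-east-1a", ["us-east-1a"]),
    Endpoint("10.0.2.7", "us-east-1b", ["us-east-1b"]),
]
print([ep.ip for ep in select_endpoints(endpoints, "us-east-1a")])
```

The two fallbacks matter: when hints are missing or a zone has no hinted endpoint, traffic still flows, it just stops being zone-local.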
Benefits of Topology-Aware Routing
- Reduced Latency: By keeping traffic local, TAR minimizes the distance packets need to travel, resulting in faster communication between pods.
- Improved Performance: Lower latency translates to a more responsive and performant application.
- Cost Optimization: In certain cloud environments, data transfer between zones incurs additional charges. TAR helps minimize these costs by keeping traffic within the same zone.
Example Scenario
Imagine a two-zone Kubernetes cluster (Zone A and Zone B) hosting a microservices application. Service A and Service B both have pods in each zone, and Service A frequently communicates with Service B. Without TAR, a request from a Service A pod in Zone A may land on a Service B pod in Zone B, introducing unnecessary latency and potentially incurring extra costs.
With TAR, Kubernetes keeps this communication within the zone where possible, enhancing performance and potentially reducing costs. Let’s walk through an example:
Detailed Example
Assume you have multiple nodes running in different zones in the cloud (e.g., us-east-1a, us-east-1b). You have two applications: fox-App1 and fox-App2. Each application has multiple containers, and the containers are running in different zones.
In this setup, fox-App1 frequently sends requests to fox-App2. How do you make sure that a fox-App1 container sends its requests to a fox-App2 container situated in the same zone?
To be clear, fox-App1-Container1 sends the request to fox-App2’s containers via the Service of fox-App2.
The Kubernetes Service component distributes traffic at random by design. This means that if fox-App1-Container1 makes a request to fox-App2, it can end up in a different availability zone. Due to cross-zone communication, this could result in network slowness and extra expenses.
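To get a feel for how much traffic crosses zones, here is a rough back-of-the-envelope sketch. It assumes endpoints are picked uniformly at random, as described above, and the function name `cross_zone_fraction` is made up for this example.

```python
def cross_zone_fraction(client_zone, endpoint_zones):
    """Share of requests that leave client_zone under uniform random selection."""
    if not endpoint_zones:
        raise ValueError("the service has no endpoints")
    remote = sum(1 for zone in endpoint_zones if zone != client_zone)
    return remote / len(endpoint_zones)


# fox-App2 with one container in each of two zones, client in us-east-1a.
print(cross_zone_fraction("us-east-1a", ["us-east-1a", "us-east-1b"]))  # 0.5
```

With one fox-App2 container per zone, roughly half of the requests from any fox-App1 container would cross a zone boundary.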
How can we prevent it?
Since version 1.21, Kubernetes has offered a solution for this issue: Topology Aware Routing (known as Topology Aware Hints before version 1.27).
Let’s deploy an example application:
fox-app1-deployment.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fox-app1
spec:
  replicas: 2
  selector:
    matchLabels:
      app: fox-app1
  template:
    metadata:
      labels:
        app: fox-app1
    spec:
      containers:
        - image: nginx
          name: fox-app1
Apply:
# kubectl apply -f fox-app1-deployment.yaml
deployment.apps/fox-app1 created
fox-app2-deployment.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fox-app2
spec:
  replicas: 2
  selector:
    matchLabels:
      app: fox-app2
  template:
    metadata:
      labels:
        app: fox-app2
    spec:
      containers:
        - image: nginx
          name: fox-app2
          ports:
            - containerPort: 80
Apply:
# kubectl apply -f fox-app2-deployment.yaml
deployment.apps/fox-app2 created
Create a service for fox-App2:
fox-app2-svc.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: fox-app2-svc
spec:
  selector:
    app: fox-app2
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
Here’s my setup:
fox-App1 and fox-App2 both have several containers operating in distinct availability zones.
Concepts
- Endpoints: Endpoints are Kubernetes objects that represent a set of pods that implement a service. They list the pod IP addresses and ports that can be used to access the service.
- Endpoint Slices: Endpoint slices are a newer concept in Kubernetes that splits a service’s endpoints into smaller objects (up to 100 endpoints per slice by default). Each endpoint entry records where the pod runs, including its node and zone, enabling features like topology-aware routing (TAR).
The endpoint and endpoint slices setup is as follows:
You can try sending a request from fox-App1 to fox-App2 using the curl command and check the response body for zone details, if your application exposes them. This setup may be familiar to you, as it is the default one. Now let’s understand how to enable TAR and what difference it makes.
Endpoint Slices and TAR
With TAR enabled, the EndpointSlice controller adds zone hints to the endpoints in the service’s endpoint slices. Kube-proxy, the network proxy, uses these hints to prioritize selecting pods from the same zone as the requesting pod whenever possible. This reduces network traffic between zones and improves performance.
With TAR enabled:
- The EndpointSlices for Service B would contain topology hints indicating which zone each Service B endpoint should serve.
- Kube-proxy, aware of these hints, would prioritize directing traffic from Service A pods in Zone A to Service B pods in the same zone (Zone A).
Without TAR, the requests are spread at random among the various availability zones. Let’s make our fox-app2-svc Service topology aware:
fox-app2-svc.yaml
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.kubernetes.io/topology-mode: Auto
  name: fox-app2-svc
spec:
  selector:
    app: fox-app2
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
Apply:
# kubectl apply -f fox-app2-svc.yaml
service/fox-app2-svc configured
Let’s look at the service:
Note: In my case, the cluster is a single-zone setup, hence I am getting the EndpointSlices error. If you have a multi-zone setup, you will see the following message: Topology Aware Hints has been enabled, addressType: IPv4
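Another way to check whether hints were applied is to look at the EndpointSlices themselves. The sketch below assumes you have saved the output of `kubectl get endpointslices -l kubernetes.io/service-name=fox-app2-svc -o json` and parsed it into a dictionary. The `summarize_hints` helper is a made-up name, and the sample data (IP addresses included) is invented purely to show the shape of the `zone` and `hints.forZones` fields.

```python
import json


def summarize_hints(endpointslice_list):
    """Return (address, zone, hinted_zones) for every endpoint address."""
    rows = []
    for item in endpointslice_list.get("items", []):
        for ep in item.get("endpoints") or []:
            hints = ep.get("hints") or {}
            hinted = [z["name"] for z in hints.get("forZones", [])]
            for addr in ep.get("addresses", []):
                rows.append((addr, ep.get("zone"), hinted))
    return rows


# Invented sample shaped like an EndpointSlice list (discovery.k8s.io/v1).
sample = json.loads("""
{
  "items": [
    {
      "endpoints": [
        {"addresses": ["10.0.1.5"], "zone": "us-east-1a",
         "hints": {"forZones": [{"name": "us-east-1a"}]}},
        {"addresses": ["10.0.2.7"], "zone": "us-east-1b",
         "hints": {"forZones": [{"name": "us-east-1b"}]}}
      ]
    }
  ]
}
""")

for addr, zone, hinted in summarize_hints(sample):
    print(addr, zone, hinted or "no hints")
```

If every endpoint shows an empty hint list, as you would expect in a single-zone cluster like mine, kube-proxy keeps routing to all endpoints.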
Topology Aware Routing ensures that requests from fox-App1 to fox-App2 remain within the same availability zone.
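What happens if fox-App2 is using a single container?
With fewer endpoints than zones, the EndpointSlice controller does not assign any hints, so all queries from fox-App1 to fox-App2 will be routed to the single fox-App2 container, whichever zone they come from.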
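The single-container case can be modelled with a small sketch. It simplifies heavily: it assumes each endpoint is hinted only for its own zone, and it implements only the documented rule that no hints are assigned when there are fewer endpoints than zones. The function name `zones_reaching` is hypothetical.

```python
def zones_reaching(endpoint_zones, cluster_zones):
    """Map each client zone to the endpoint zones its traffic may reach."""
    # Fewer endpoints than zones: no hints, so every zone uses all endpoints.
    if len(endpoint_zones) < len(cluster_zones):
        return {zone: list(endpoint_zones) for zone in cluster_zones}
    routes = {}
    for zone in cluster_zones:
        # Simplification: an endpoint is hinted only for its own zone.
        local = [z for z in endpoint_zones if z == zone]
        # A zone with no hinted endpoint falls back to all endpoints.
        routes[zone] = local or list(endpoint_zones)
    return routes


# One fox-App2 container in us-east-1a, cluster spanning two zones.
print(zones_reaching(["us-east-1a"], ["us-east-1a", "us-east-1b"]))
```

Both zones end up pointing at the lone container in us-east-1a, which matches the behaviour described above.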
Drawbacks
Topology-Aware Routing (TAR) offers performance and cost benefits, but it also comes with some drawbacks:
- Reduced Reliability: By keeping traffic within a zone, TAR can limit your ability to recover from failures in that zone. If all your pods serving a service are in the same zone that goes down, the service becomes unavailable.
- Limited Scalability: Managing complex routing rules for each service with many pods across zones can become cumbersome and computationally expensive, especially in large clusters.
- Uneven Traffic Distribution: TAR assumes that incoming traffic is roughly proportional to the capacity of each zone. This might not be true, leading to a single pod receiving all traffic in a zone and potential bottlenecks (https://kubernetes.io/docs/concepts/services-networking/topology-aware-routing/).
- Limited Functionality: TAR doesn’t work well for services where most traffic comes from a specific zone, and it is not used at all when internalTrafficPolicy is set to “Local” on a service (https://kubernetes.io/docs/concepts/services-networking/topology-aware-routing/).
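The uneven-distribution drawback is easier to see with numbers. According to the Kubernetes documentation, the controller allocates endpoints to zones in proportion to the allocatable CPU cores in each zone. The sketch below is a simplified model of that idea only (the real controller also rounds and applies safeguards), and both function names and all figures are invented for illustration.

```python
def expected_endpoints(total_endpoints, cpu_by_zone):
    """Proportional share of endpoints per zone, based on CPU cores."""
    total_cpu = sum(cpu_by_zone.values())
    return {
        zone: total_endpoints * cpu / total_cpu
        for zone, cpu in cpu_by_zone.items()
    }


def load_per_endpoint(traffic_share, endpoints_by_zone):
    """Share of total traffic handled by each endpoint in a zone."""
    return {
        zone: traffic_share[zone] / endpoints_by_zone[zone]
        for zone in traffic_share
    }


# Zone "a" has twice the CPU of zone "b", so it gets 2 of 3 endpoints.
allocation = expected_endpoints(3, {"a": 2, "b": 1})
print(allocation)

# But suppose 80% of the traffic actually originates in zone "b".
print(load_per_endpoint({"a": 0.2, "b": 0.8}, allocation))
```

In this made-up case the single endpoint in zone "b" carries 80% of the traffic while each endpoint in zone "a" carries 10%, which is exactly the kind of bottleneck to watch for.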
Overall, TAR is a valuable tool for optimizing performance and cost, but it’s crucial to consider its limitations and understand your specific traffic patterns before deploying it.
Conclusion
Topology-aware routing is a valuable tool for optimizing network traffic flow in multi-zone Kubernetes deployments. By leveraging zone awareness, TAR helps ensure efficient communication between pods, leading to better application performance and potentially lower network costs.