Introduction
Any modern website on the internet today receives thousands of hits, if not millions. Without a scalability strategy, the website will either crash or degrade significantly in performance, a situation we want to avoid. Adding more powerful hardware, or scaling vertically, will only delay the problem. On the other hand, adding more servers, or scaling horizontally, may not deliver its full benefits without a well-thought-through approach.
The best way to create a highly scalable system in any domain is to use proven software architecture patterns. Software architecture patterns enable us to create cost-effective systems that can handle billions of requests and petabytes of data. Here we describe the most basic and popular scalability pattern, known as load balancing. The concept of load balancing is essential for any developer building a high-volume traffic site in the cloud. The article first introduces the load balancer, then discusses the types of load balancers, moves on to load balancing in the cloud, covers open-source options, and finally offers a few pointers for choosing a load balancer.
What is a Load Balancer?
Load balancing refers to efficiently distributing incoming network traffic across a group of backend servers, also known as a server farm or server pool.
Modern high‑traffic websites must serve hundreds of thousands, if not millions, of concurrent requests from users or clients and return the correct text, images, video, or application data, all in a fast and reliable manner. To cost‑effectively scale to meet these high volumes, modern computing best practice generally requires adding more servers.
A load balancer acts as the “traffic cop” sitting in front of your servers and routing client requests across all servers capable of fulfilling those requests in a manner that maximizes speed and capacity utilization and ensures that no one server is overworked, which could degrade performance. If a single server goes down, the load balancer redirects traffic to the remaining online servers. When a new server is added to the server group, the load balancer automatically starts to send requests to it.
In this manner, a load balancer performs the following functions:
- Distributes client requests or network load efficiently across multiple servers.
- Ensures high availability and reliability by sending requests only to servers that are online.
- Provides the flexibility to add or subtract servers as demand dictates.
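The functions above can be sketched as a minimal dispatcher. This is an illustrative model only (the server names and health flags are hypothetical); a real load balancer discovers server health through periodic network health checks rather than explicit flags:

```python
import itertools

class LoadBalancer:
    """Minimal sketch: route requests only to servers marked healthy."""

    def __init__(self, servers):
        self.healthy = {s: True for s in servers}
        self._cycle = itertools.cycle(servers)

    def mark_down(self, server):
        # In practice this would be triggered by a failed health check.
        self.healthy[server] = False

    def mark_up(self, server):
        # A recovered (or newly added) server starts receiving traffic again.
        self.healthy[server] = True

    def route(self):
        # Skip offline servers; fail loudly if none are online.
        if not any(self.healthy.values()):
            raise RuntimeError("no healthy servers")
        while True:
            server = next(self._cycle)
            if self.healthy[server]:
                return server

lb = LoadBalancer(["app1", "app2", "app3"])
lb.mark_down("app2")                       # simulate a failed server
picks = [lb.route() for _ in range(4)]     # traffic flows only to app1/app3
```

Note how the failed server is transparently skipped: clients keep getting answers from the remaining servers, which is exactly the availability property described above.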
Types of Load Balancers
With the basics of load balancers covered, the next step is to become familiar with load balancing algorithms. There are broadly two types of load balancing algorithms.
Static Load Balancers
Static load balancers distribute incoming traffic according to fixed rules, without considering the current state of the servers.
- Round Robin is the most fundamental and default algorithm for load balancing. It distributes traffic sequentially to a list of servers in a group. The algorithm assumes that the application is stateless and that each request from a client can be handled in isolation. Whenever a new request comes in, it goes to the next server in the sequence. Because the algorithm is so basic, it is not suited for most cases.
- Weighted Round Robin is a variant of round robin in which administrators assign weights to servers. A server with higher capacity receives proportionally more traffic than the others. The algorithm addresses the scenario where a group contains servers of varying capacities.
- Sticky Session, also known as Session Affinity, is best suited when all the requests from a client need to be served by a specific server. The algorithm works by identifying the client making each request, either through a cookie or by IP address. It is more efficient in terms of data, memory, and cache usage, but performance can degrade heavily if a server starts getting stuck with excessively long sessions. Moreover, if a server goes down, its session data is lost.
- IP Hash is another way to route requests from a given client to the same server. The algorithm uses the client's IP address as a hashing key and dispatches each request based on that key. Another variant of this algorithm uses the request URL to determine the hash key.
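The static algorithms above can be sketched in a few lines. The server addresses and weights here are made-up examples, and real balancers track far more state, but the selection logic is the essence of each algorithm:

```python
import hashlib
import itertools

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]   # hypothetical backends

# Round Robin: hand requests to servers in a fixed, repeating order.
rr = itertools.cycle(servers)
def round_robin():
    return next(rr)

# Weighted Round Robin: repeat each server in proportion to its weight,
# so a higher-capacity server appears more often in the rotation.
weights = {"10.0.0.1": 3, "10.0.0.2": 1, "10.0.0.3": 1}  # assumed capacities
wrr = itertools.cycle([s for s, w in weights.items() for _ in range(w)])
def weighted_round_robin():
    return next(wrr)

# IP Hash: hash the client address so the same client always lands on
# the same server (a simple form of session affinity).
def ip_hash(client_ip):
    digest = hashlib.md5(client_ip.encode()).digest()
    return servers[int.from_bytes(digest, "big") % len(servers)]
```

Because `ip_hash` is deterministic, repeated requests from one client reach the same backend without the balancer storing any per-client state; the trade-off is that adding or removing a server reshuffles most clients.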
Dynamic Load Balancers
Dynamic load balancers, as the name suggests, consider the current state of each server and dispatch incoming requests accordingly.
- Least Connection dispatches the traffic to the server with the fewest number of connections. The assumption is that all the servers are equal and the server having a minimum number of connections would have the maximum resources available.
- Weighted Least Connection is another variant of least connection. It provides an ability for an administrator to assign weightage to servers with higher capacity so that requests can be distributed based on the capacity.
- Least Response Time considers the response time along with the number of connections. The requests are dispatched to the server with the fewest connections and minimum average response time. The principle is to ensure the best service to the client.
- Adaptive or Resource-based dispatches the load and makes decisions based on the resources i.e., CPU and memory available on the server. A dedicated program or agent runs on each server that measures the available resources on a server. The load balancer queries the agent to decide and allocate the incoming request.
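The dynamic algorithms can be sketched the same way. The connection counts and weights below are illustrative numbers; in a real balancer they would be live state updated as connections open and close:

```python
# Active connections per server (tracked live by a real balancer).
connections = {"app1": 12, "app2": 4, "app3": 9}
weights = {"app1": 1, "app2": 1, "app3": 3}  # assumed relative capacities

def least_connections():
    # Pick the server currently holding the fewest open connections.
    return min(connections, key=connections.get)

def weighted_least_connections():
    # Normalize the connection count by capacity, so a bigger server can
    # carry proportionally more connections before losing the comparison.
    return min(connections, key=lambda s: connections[s] / weights[s])

picked = least_connections()   # app2 has only 4 open connections
connections[picked] += 1       # the dispatched request occupies a slot
```

Note how the two variants can disagree: after the dispatch above, plain least-connections would still favor `app2` (5 connections), while the weighted variant favors `app3`, whose 9 connections are spread over three times the capacity.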
Load Balancing in Cloud
A successful cloud strategy is to use load balancers together with auto scaling. Typically, cloud applications are monitored for network traffic, memory consumption, and CPU utilization, and these metrics and trends help define scaling policies that add or remove application instances dynamically. A load balancer in the cloud accounts for this dynamic resizing and dispatches traffic across the servers currently available. The sections below describe a few of the popular solutions in the cloud:
AWS – Elastic Load Balancing (ELB)
Amazon ELB is a highly available and scalable load balancing solution, ideal for applications running in AWS. There are four types of Amazon ELB to pick from:
- Application Load Balancer used for load balancing of HTTP and HTTPS traffic.
- Network Load Balancer is used for load balancing TCP, UDP, and TLS traffic.
- Gateway Load Balancer is used to deploy, scale, and manage third-party virtual appliances.
- Classic Load Balancer is used for load balancing across multiple EC2 instances.
GCP – Cloud Load Balancing
Google Cloud Load Balancing is a highly performant and scalable offering from Google that can support more than one million queries per second. Its load balancers fall into two major categories, internal and external, and each category is further classified by the type of incoming traffic. Below are a few load balancer types:
- Internal HTTP(S) Load Balancing
- Internal TCP/UDP Load Balancing
- External HTTP(S) Load Balancing
- External TCP/UDP Network Load Balancing
A complete guide comparing all the available load balancers can be found on the Google Cloud load balancing page.
Microsoft Azure Load Balancer
Microsoft Azure's load balancing solution provides three different types of load balancers:
- Standard Load Balancer – Public and internal Layer 4 load balancer
- Gateway Load Balancer – High-performance, high-availability load balancer for third-party Network Virtual Appliances.
- Basic Load Balancer – Ideal for small-scale applications
Open-Source Load Balancing Solution
Although the default choice is usually the cloud vendor's own load balancer, there are a few open-source load balancer options available. Below are a couple of them.
NGINX
NGINX provides modern load balancing solutions through NGINX and NGINX Plus. Many popular websites, including Dropbox, Netflix, and Zynga, use load balancers from NGINX. The NGINX load balancing solutions offer high performance and can help improve the efficiency and reliability of a high-traffic website.
HAProxy
HAProxy (High Availability Proxy) is open-source proxy and load balancing server software. It provides high availability at the network (TCP) and application (HTTP/S) layers, improving speed and performance by distributing the workload across multiple servers.
Cloudflare
Cloudflare is another popular load balancing solution. It offers different tiers of load balancers to meet specific customer needs, with pricing plans based on the services, health checks, and security provided:
- Zero Trust platform plans
- Websites & application services plans
- Developer platform plans
- Enterprise plan
- Network services
Choosing Load Balancer
It is evident from the sections above that a load balancer can have a big impact on an application, so picking the right solution is essential. Below are a few considerations to guide the decision.
- Identifying the short-term and long-term goals of the business can help drive the decision. The business requirements should establish the expected traffic, growth regions, and regions of service. Business considerations should also include the required level of availability, the necessity of encryption, and any other security concerns that need to be addressed.
- There are ample options available in the market, and identifying the features the application needs can help pick the right solution. For example, the load balancer should be able to handle the application's incoming traffic, whether HTTP/HTTPS, SSL, or TCP. As another example, a load balancer used for internal traffic has different security concerns than an external load balancer.
- Cloud vendors provide various support tiers and pricing plans. A detailed comparison of the total cost of ownership, features and support tier can help identify the right choice for the project.
Example
When selecting load-balancing options in Azure, consider the following:
- Traffic type: Is it a web (HTTP/HTTPS) application? Is it public facing or a private application?
- Global vs. regional: Do you need to load balance VMs or containers within a virtual network, or load balance scale unit/deployments across regions, or both?
- Availability: What’s the service-level agreement?
- Features and limits: What are the overall limitations of each service? For more information, see Service limits.
The following flowchart helps you to choose a load-balancing solution for your application. The flowchart guides you through a set of key decision criteria to reach a recommendation.
Treat this flowchart as a starting point. Every application has unique requirements, so use the recommendation as a starting point. Then perform a more detailed evaluation.
If your application consists of multiple workloads, evaluate each workload separately. A complete solution might incorporate two or more load-balancing solutions.
Most experts agree that it is a best practice to use a load balancer to manage traffic, which is critical for cloud applications. With a load balancer, applications can serve requests better and also save costs.