An SBC service in the cloud comprises many SBC SWe instances. This introduces the need to distribute load across these instances so that resources (compute, memory, and so on) are utilized evenly, maximizing the overall capacity and performance of the service. There are multiple approaches to load balancing.
The supported approaches are:
- To use a SIP-aware Front-End Load Balancer (FE LB). In this approach, the FE LB distributes (at the SIP level) new requests to the set of SBC instances.
- To use RFC 3263 DNS-based load balancing. In this approach, when the peers resolve the FQDN for the SBC service, the IP addresses of multiple SBC instances are returned. Each peer then selects an IP address from this set.
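The DNS-based approach can be sketched as follows. This is a minimal illustration, not the SBC's implementation: the `resolver` callable is a hypothetical stand-in for a real RFC 3263 lookup (NAPTR/SRV/A resolution), and the addresses are example values.

```python
import random

def resolve_service(fqdn, resolver):
    """Resolve the service FQDN to the set of SBC addresses.

    `resolver` is a hypothetical callable standing in for a real
    RFC 3263 DNS resolution; it returns a list of IP addresses.
    """
    return list(resolver(fqdn))

def pick_peer(addresses, rng=random):
    """A peer selects one address from the resolved set."""
    if not addresses:
        raise ValueError("no SBC addresses resolved for service FQDN")
    return rng.choice(addresses)

# Example: a stub resolver returning three SBC instance addresses.
stub = lambda fqdn: ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
addrs = resolve_service("sbc-service.example.com", stub)
chosen = pick_peer(addrs, random.Random(7))
```

Because each peer independently picks from the same resolved set, new requests spread across the instances without any coordination among the peers.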
Irrespective of the approach used for initial load balancing, hot spots or other imbalances can occur within the SBC cluster for a variety of reasons. For example, a newly added SBC instance is relatively underutilized compared to long-running instances. Load imbalance also occurs when different request types have mismatched costs or lifetimes, so that an even distribution of requests still produces uneven overall loading. To correct such imbalances, the SBC SWe instances implement a secondary level of load balancing.
The following image depicts secondary load balancing in an SBC cluster.
The key aspects of the intra-cluster load balancing feature in the SBC SWe cloud deployments are:
- Dynamically joining a load balancing cluster on start.
- Distribution of local availability across all members of the cluster.
- Retargeting a portion of inbound INVITE and/or REGISTER requests to the underutilized nodes.
Joining a Cluster
When a new SBC SWe instance starts, it joins its associated cluster to participate in clustering operations such as load balancing. The SBC achieves this through the use of one or more seed nodes. These seed nodes are identified in the configuration by an FQDN, which the SBC resolves to one or more peer SBC instances. The new SBC then selects a seed node randomly from the list and exchanges the protocol information necessary to join the cluster. If the selected SBC is not available, this process is repeated using different SBCs resolved from the seed FQDN until the SBC successfully joins the cluster.
The seed nodes are necessary only during the cluster joining operation. Once the SBC instance has joined the cluster, the failure of one or more seed nodes does not affect any clustering functionality, including the load balancing feature.
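The join procedure above can be sketched as a simple retry loop. The helper names (`resolve`, `try_join`) are hypothetical placeholders for the SBC's actual DNS resolution and join-protocol exchange, which are not exposed publicly.

```python
import random

def join_cluster(seed_fqdn, resolve, try_join, rng=random):
    """Sketch of the cluster-join loop (hypothetical helpers, not the
    actual SBC API): resolve the seed FQDN to peer instances, then try
    randomly chosen seeds until one join handshake succeeds."""
    candidates = list(resolve(seed_fqdn))
    rng.shuffle(candidates)
    for seed in candidates:
        if try_join(seed):          # exchange join-protocol information
            return seed             # joined the cluster via this seed
    raise RuntimeError("no reachable seed node for " + seed_fqdn)

# Example: one seed is down, so the join falls through to another.
up = {"10.0.0.1": True, "10.0.0.2": False, "10.0.0.3": True}
joined = join_cluster("seeds.cluster.example",
                      lambda fqdn: ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
                      lambda ip: up[ip],
                      random.Random(1))
```

Once the join succeeds, the seed plays no further special role, which is why seed-node failures after joining do not affect clustering functionality.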
The load balancing function determines an SBC's availability from its current use and its total capacity. An SBC's availability is multi-dimensional: it includes availability for new calls, new registrations, and other request types. Provided the remaining call and registration capacity exceeds the minimum thresholds and the SBC is not congested, the load balancing function computes the SBC's aggregate local availability from its current call and registration load. This aggregate availability is normalized (to allow comparison among nodes of different absolute capacity) and periodically distributed to the other SBC nodes within the same cluster.
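One way to picture the normalized, multi-dimensional availability is the sketch below. The formula, threshold value, and use of the most constrained dimension are illustrative assumptions, not the product's documented computation.

```python
def aggregate_availability(call_used, call_capacity,
                           reg_used, reg_capacity,
                           congested=False, min_headroom=0.05):
    """Illustrative aggregate availability (not the SBC's actual
    formula): headroom in each dimension is normalized to 0..1 so that
    nodes of different absolute capacity can be compared; the result is
    0 when the node is congested or below the minimum headroom."""
    call_free = (call_capacity - call_used) / call_capacity
    reg_free = (reg_capacity - reg_used) / reg_capacity
    avail = min(call_free, reg_free)   # most constrained dimension wins
    if congested or avail < min_headroom:
        return 0.0
    return avail

# A node at 600/1000 calls and 2000/10000 registrations is limited by
# its call dimension, so its normalized availability is 0.4.
a = aggregate_availability(600, 1000, 2000, 10000)
```

Normalizing to a 0..1 scale is what lets a small instance and a large instance be ranked on the same axis when availability is distributed across the cluster.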
For distribution, the load balancing function uses a leader-based mechanism. When an SBC cluster is formed, a leader is elected. All the SBC SWe instances in the cluster periodically report their availability to the leader and, in turn, obtain the availability information for the other SBC instances. In operation, only the availability of the nodes with the highest levels of availability is distributed. The leader election method is resilient to faults: if the current leader fails or becomes unavailable, a new leader is elected and the nodes provide their availability information to the replacement leader.
When an SBC instance receives a request, it determines whether to accept the request based on its current load information. An SBC participating in load balancing also decides whether to handle the request locally or to retarget it to an alternate SBC within the cluster. This decision is based on the relative availability of the local SBC versus the other SBCs in the cluster. If the current load on the SBC is low (that is, there is significant local slack capacity), or if retargeting is disabled for the request type or for the associated zone, the request is not retargeted. Otherwise, the SBC selects the target SBC from the set comprising itself and the SBCs with higher availability. If a remote SBC is selected, the request is retargeted to that SBC.
From this set of SBCs, the selection process is weighted and randomized. The weighting is based roughly on the availability of each SBC; however, the local SBC is given a higher weight. This higher weight reduces unnecessary retargeting when the difference in availability between the local node and the remote nodes is small. The selection process is randomized to ensure retargeted requests are evenly distributed. This avoids roving overloads, which can occur when SBC nodes simultaneously retarget requests to the most underutilized SBC.
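The weighted, randomized selection can be sketched as follows. The bias factor and weight formula are assumptions for illustration; the product's exact weighting is not public.

```python
import random

def select_target(local_id, availability, local_bias=2.0, rng=random):
    """Illustrative weighted, randomized target selection: candidates
    are the local node plus any node with strictly higher availability;
    the local node's weight is inflated by `local_bias` (assumed value)
    to avoid churn when availability differences are small."""
    local_avail = availability[local_id]
    candidates, weights = [], []
    for node, avail in availability.items():
        if node == local_id:
            candidates.append(node)
            weights.append(avail * local_bias)
        elif avail > local_avail:
            candidates.append(node)
            weights.append(avail)
    # Randomized, availability-weighted pick avoids every node dumping
    # its excess load onto the single most underutilized SBC at once.
    return rng.choices(candidates, weights=weights, k=1)[0]

# A lightly loaded local node is its own only candidate.
stay_local = select_target("A", {"A": 0.9, "B": 0.5}, rng=random.Random(3))
# A heavily loaded node may retarget to B or C, or still keep the call.
target = select_target("A", {"A": 0.1, "B": 0.9, "C": 0.5},
                       rng=random.Random(3))
```

Because the draw is random on every request, two overloaded nodes making this decision simultaneously are unlikely to pick the same target, which is the property that prevents roving overloads.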
The requests are retargeted using a SIP 302 Moved Temporarily response, which directs the peer to retry the request to the address given in the contact. The SBC sets the SIP signaling address of the selected SBC as the contact address and includes a Sonus-proprietary parameter in that contact address. The peer retries the request to the given contact address (which includes the added parameter) towards the selected SBC. The SBC instance receiving the resubmitted request detects the Sonus-proprietary parameter, avoids further retargeting, and accepts the request.
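The 302-based retargeting exchange can be sketched as below. The parameter name `x-retargeted` is a placeholder; the actual Sonus-proprietary parameter name is not public, and the response is modeled as a plain dictionary rather than real SIP messaging.

```python
def build_302(target_sip_addr, marker="x-retargeted"):
    """Sketch of a 302 redirect (illustrative field names): the Contact
    points at the selected SBC's signaling address and carries a marker
    parameter so the next hop knows the request was already retargeted."""
    contact = "<sip:%s;%s>" % (target_sip_addr, marker)
    return {"status": 302, "reason": "Moved Temporarily",
            "Contact": contact}

def may_retarget(request_uri, marker="x-retargeted"):
    """The receiving SBC skips further retargeting when the marker
    parameter is present in the retried request's URI."""
    return marker not in request_uri

# The overloaded SBC redirects the peer to a less loaded instance.
resp = build_302("198.51.100.20:5060")
# The peer retries using the returned Contact; the marker suppresses
# a second redirect at the selected SBC.
retry_uri = resp["Contact"].strip("<>")
```

Carrying the marker in the contact address is what bounds the redirection to a single hop: without it, a chain of busy SBCs could bounce the same request around the cluster.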