Packets that cause a direct SBC fault can lead to a catastrophic failure of an SBC service, known as a packet-stimulated fault avalanche. Such packets can appear for various reasons, for example when a new Session Initiation Protocol (SIP) endpoint is added, a peering endpoint or gateway (GW) is upgraded or replaced, a configuration changes on a peer, or a new call scenario is introduced. The SBC does not currently check for double faults, in which the SBC fails over and then fails over again. Double faults cause call loss.
The goal of Avalanche Fault Detection and Control is not to prevent individual crashes, but to detect and control continuous "avalanche" faults that can lead to complete service outages. This feature uses the information from the existing faults to attempt to prevent future faults.
The fault avalanche feature tracks potentially problematic values of key types in SIP packets. To track these values, the feature extracts and saves values from the following fields of the SIP packet that causes a crash:
- Call ID
- Called Party Address
- Calling Party Address (From or P-Asserted-Identity header)
- Source IP of the Packet
Each key type is associated with a threshold value. The threshold indicates the maximum number of crashes allowed for a particular value of the key type before the SBC blocks that key value. The SBC defines the default thresholds so that more specific blocks trigger before less specific blocks. The following table shows the default key type thresholds.
Table: Default Key Thresholds

Key Type | Threshold |
---|---|
Call ID | 0 |
Calling + Called | 1 |
Calling Party | 3 |
Called Party | 3 |
Source IP | 5 |
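The threshold behavior described above can be sketched in Python. This is a minimal illustration, not the SBC's implementation: the `FaultTracker` class, its method names, and the dictionary key names are hypothetical, and the sketch assumes a value is blocked once its crash count exceeds the threshold for its key type.

```python
# Default thresholds from the table above; the key names are illustrative.
DEFAULT_THRESHOLDS = {
    "call_id": 0,
    "calling_and_called": 1,
    "calling_party": 3,
    "called_party": 3,
    "source_ip": 5,
}


class FaultTracker:
    """Counts crashes per (key type, value) and reports blocked values."""

    def __init__(self, thresholds=None):
        self.thresholds = dict(thresholds or DEFAULT_THRESHOLDS)
        self.counts = {}

    def record_crash(self, keys):
        """Record one crash for every key value extracted from the packet."""
        for key_type, value in keys.items():
            if value is not None:
                entry = (key_type, value)
                self.counts[entry] = self.counts.get(entry, 0) + 1

    def is_blocked(self, key_type, value):
        # Assumption: a value is blocked once its crash count exceeds the
        # maximum number of crashes allowed for its key type.
        return self.counts.get((key_type, value), 0) > self.thresholds[key_type]

    def blocking_keys(self, keys):
        """Return the key types whose values are blocked for this packet."""
        return [k for k, v in keys.items()
                if v is not None and self.is_blocked(k, v)]


tracker = FaultTracker()
packet = {
    "call_id": "abc123@host.example",
    "calling_and_called": ("1000", "2000"),
    "calling_party": "1000",
    "called_party": "2000",
    "source_ip": "192.0.2.10",
}
tracker.record_crash(packet)
print(tracker.blocking_keys(packet))  # ['call_id']
```

With the default values, a single crash blocks only the Call ID, a second crash with the same calling and called pair blocks that pair, and the broader keys (calling party, called party, source IP) are blocked only after further crashes. This is one way to read the statement that more specific blocks trigger first.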
The SBC determines the calling party for a SIP packet in the following order:
- If present, the P-Asserted-Identity header user part or telephone number.
- If present, the P-Preferred-Identity header user part or telephone number.
- If not anonymous, the From header user part or telephone number.
The SBC obtains the called party for a SIP packet from either the Request-URI of a SIP request or the To header URI of a SIP response.
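The selection order above can be illustrated with a short Python sketch. The function names, the `headers` dictionary shape, and the regular expression are assumptions made for illustration; the parser is deliberately simplified and is not a complete SIP URI parser. The sketch also assumes that a From header is "anonymous" when its user part is the literal string `anonymous`.

```python
import re

# Simplified matcher: user part of a sip:/sips: URI, or the number of a tel: URI.
_URI_USER = re.compile(r"sips?:([^@;>\s]+)@|tel:([^;>\s]+)", re.IGNORECASE)


def user_part(header_value):
    """Return the user part or telephone number, or None if there is none."""
    if not header_value:
        return None
    match = _URI_USER.search(header_value)
    if not match:
        return None
    return match.group(1) or match.group(2)


def calling_party(headers):
    """Pick the calling party in the documented order of preference."""
    for name in ("P-Asserted-Identity", "P-Preferred-Identity"):
        user = user_part(headers.get(name))
        if user:
            return user
    user = user_part(headers.get("From"))
    # Assumption: an anonymous From header carries the user part "anonymous".
    if user and user.lower() != "anonymous":
        return user
    return None  # no usable calling party information


def called_party(request_uri=None, to_header=None):
    """Use the Request-URI for requests and the To header for responses."""
    if request_uri is not None:
        return user_part(request_uri)
    return user_part(to_header)
```

When `calling_party` returns `None`, only keys that do not depend on the calling party, such as the source IP, remain available for tracking.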
Info |
---|
|
The SBC cannot monitor the fault count for user(s) if no usable calling party information is available; however, the SBC can control faults based on the source IP. |
Info |
---|
Ribbon recommends that you set the thresholds according to how strict (or lenient) you prefer to be with faults in a cluster. If you do not want the SBC to block a specific key element or source, instruct the SBC using CLI commands. |
The Layer 3 source IP address of the packet determines the peer IP address.
Command Syntax
Use the following command to set and configure the faultAvalancheControl parameter.
Code Block |
---|
% set system faultAvalancheControl callIdThreshold <0-999> calledPartyThreshold <0-999> callingNCalledPartyThreshold <0-999> callingPartyThreshold <0-999> sourceIpThreshold <0-999> faultRecAgeingTimeOut <15-60> |
Use the following command to enable or disable the faultAvalancheControl parameter.
Code Block |
---|
% set system faultAvalancheControl facState <disabled | enabled> |
Use the following command to block or allow future SIP messages.
Code Block |
---|
% set system faultAvalancheControl facBlockSuspects <disabled | enabled> |
Command Parameters
Table: faultAvalancheControl Parameter

Parameter | Length/Range | Default | Description | M/O |
---|---|---|---|---|
faultAvalancheControl | N/A | N/A | This parameter controls the fault avalanche issue. | O |
callIdThreshold | 0-999 | 0 | <0-999> - The number of crashes the specific call-ID causes, after which the SBC drops the SIP messages that carry the same call-ID. | O |
calledPartyThreshold | 0-999 | 3 | <0-999> - The number of crashes the specific called party causes, after which the SBC drops the SIP messages that carry the same called party address. | O |
callingPartyThreshold | 0-999 | 3 | <0-999> - The number of crashes the specific calling party causes, after which the SBC drops the SIP messages that carry the same calling party address. | O |
callingNCalledPartyThreshold | 0-999 | 1 | <0-999> - The number of crashes the specific calling and called party cause, after which the SBC drops the SIP messages that carry the same calling and called party address. | O |
faultRecAgeingTimeOut | 15-60 | 30 | <15-60> - Configure this parameter with the timeout (in minutes) of the fault record aging. | O |
sourceIpThreshold | 0-999 | 5 | <0-999> - The number of crashes the SIP messages from a specific source IP address cause, after which the SBC drops the SIP messages from the same source IP address. | O |
facState | N/A | enabled | Use this flag to enable or disable the Fault Avalanche Control feature. When you update this flag from enabled to disabled, the system deletes the existing fault records and blocking entries. This update does not impact the fault records that this system might have previously broadcast to other SBCs in the cluster. Values: enabled (default); disabled - The SBC does not perform tracking or blocking. | O |
facBlockSuspects | N/A | disabled | Determines if future SIP messages are blocked. Values: enabled; disabled (default). | |
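As a convenience, the ranges in the table above can be checked before a command is sent to the SBC. The following Python sketch is a hypothetical helper, not part of the SBC or any Ribbon tooling: `build_fac_command` and `RANGES` are illustrative names, and the sketch assumes the CLI accepts any subset of these parameters in a single `set` command.

```python
# Ranges copied from the parameter table; the helper itself is hypothetical.
RANGES = {
    "callIdThreshold": (0, 999),
    "calledPartyThreshold": (0, 999),
    "callingNCalledPartyThreshold": (0, 999),
    "callingPartyThreshold": (0, 999),
    "sourceIpThreshold": (0, 999),
    "faultRecAgeingTimeOut": (15, 60),
}


def build_fac_command(**params):
    """Validate the values and return the CLI command text (without prompt)."""
    parts = ["set system faultAvalancheControl"]
    for name, value in params.items():
        if name not in RANGES:
            raise ValueError(f"unknown parameter: {name}")
        low, high = RANGES[name]
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not low <= value <= high:
            raise ValueError(f"{name} must be an integer in {low}-{high}")
        parts.append(f"{name} {value}")
    return " ".join(parts)


print(build_fac_command(callIdThreshold=2, faultRecAgeingTimeOut=35))
# set system faultAvalancheControl callIdThreshold 2 faultRecAgeingTimeOut 35
```

Rejecting out-of-range values locally, for example a `faultRecAgeingTimeOut` of 10, gives a clearer error than a failed CLI commit.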
Info |
---|
|
If you disable the facState flag, the system:
- Must not perform any extra parsing of SIP Protocol Data Units (PDUs) to implement this functionality.
- Must not create, distribute, or request fault records.
- Must not check any received packets for blocking.
- Must discard any received fault records.
|
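The interaction between the facState flag and fault record aging can be modeled with a small Python sketch. This is an illustration under stated assumptions, not the SBC's internal design: `FaultRecordStore` and its methods are hypothetical, records are assumed to expire once they are older than the faultRecAgeingTimeOut value, and cluster distribution of records is not modeled.

```python
import time


class FaultRecordStore:
    """Holds timestamped fault records and honors an enabled/disabled state."""

    def __init__(self, ageing_timeout_minutes=30, enabled=True,
                 clock=time.monotonic):
        self.ageing_seconds = ageing_timeout_minutes * 60
        self.enabled = enabled
        self._clock = clock      # injectable so the sketch can be tested
        self._records = []       # (timestamp, key_type, value)

    def set_state(self, enabled):
        # Moving from enabled to disabled deletes existing fault records.
        if self.enabled and not enabled:
            self._records.clear()
        self.enabled = enabled

    def add(self, key_type, value):
        # When disabled, no fault records are created.
        if self.enabled:
            self._records.append((self._clock(), key_type, value))

    def count(self, key_type, value):
        # When disabled, no checks are performed against stored records.
        if not self.enabled:
            return 0
        # Assumption: records older than the aging timeout are discarded.
        cutoff = self._clock() - self.ageing_seconds
        self._records = [r for r in self._records if r[0] >= cutoff]
        return sum(1 for _, k, v in self._records
                   if (k, v) == (key_type, value))
```

Passing a fake `clock` function makes it possible to exercise the aging logic without waiting for real minutes to pass.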
Command Examples
The following command is an example of how to set and configure faultAvalancheControl.
Code Block |
---|
|
% set system faultAvalancheControl callIdThreshold 2 calledPartyThreshold 5 callingNCalledPartyThreshold 2 callingPartyThreshold 5 sourceIpThreshold 5 faultRecAgeingTimeOut 35 |
The following command is an example of how to enable faultAvalancheControl.
Code Block |
---|
|
% set system faultAvalancheControl facState enabled |
The following command is an example of how to view faultAvalancheControl.
Code Block |
---|
> show status system faultAvalancheControl
|