Ribbon uses OpenTelemetry to implement the Ribbon Observability Stack and supports selected components from the OpenTelemetry community. The aim is to let all CNF-based Ribbon products stream their observability data. With this stack, all logs and metrics (and, in the future, traces) are normalized and pre-processed, which helps correlate the data points captured from each microservice before they are delivered to the configured backends.

Logging

The SBC CNe logs from all microservices/containers are streamed to the Observability backend in a uniform format. Examples of Observability backends for centralized logging include EFK and Kafka. In CNF deployments, a new format is introduced for some of the logs, as explained below.

Old Format

Format: Size, filter flag, month, day, year, hour, minute, second, tenths of seconds, shelf, slot, instance, sequence number, level, subsystem, trace type, trace name, event text.

For example: "206 05022023 131215.414913:1.01.00.00004.MAJOR   .SM: *HwModuleServer::procNodeCeProductCapability: serverName: vsbc1, capName: SPS100, capValue: not present, checkType: Minimum Required ActualCeName vsbc1"

New Format

Format: year, month, day, hour, minute, second, tenths of seconds, time zone, level, Size, shelf, slot, instance, sequence number, trace type, subsystem, trace name, event text.

For example: "2023-05-02 11:52:12,367076 UTC MAJOR    132 1.01.00.00001 .SM: *ConfigManager::configUpdater: ALARM CLEARED - Able to process config updates"
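
The two layouts above can be told apart mechanically. The following is a minimal sketch, not a Ribbon-provided tool: the regular expressions are derived only from the two example lines shown above, the function name parse_log_line and the dictionary keys are hypothetical, and the part after the sequence number (trace type, subsystem, trace name, event text) is kept as one unparsed string because its delimiters are not fully specified here.

```python
import re
from datetime import datetime

# shelf.slot.instance.sequence, as seen in both example lines (for example 1.01.00.00001).
_LOCATION = r"(?P<shelf>\d+)\.(?P<slot>\d+)\.(?P<instance>\d+)\.(?P<sequence>\d+)"

# Assumed pattern for the new format: timestamp, time zone, level, size, location, rest.
NEW_FORMAT = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{1,6})\s+"
    r"(?P<tz>\S+)\s+(?P<level>\S+)\s+(?P<size>\d+)\s+"
    + _LOCATION + r"\s+(?P<rest>.*)$"
)

# Assumed pattern for the old format: size, MMDDYYYY HHMMSS.fraction, location, level, rest.
OLD_FORMAT = re.compile(
    r"^(?P<size>\d+)\s+(?P<stamp>\d{8} \d{6}\.\d{1,6}):"
    + _LOCATION + r"\.(?P<level>\S+)\s+(?P<rest>.*)$"
)

_CANDIDATES = (
    ("new", NEW_FORMAT, "%Y-%m-%d %H:%M:%S,%f"),
    ("old", OLD_FORMAT, "%m%d%Y %H%M%S.%f"),
)


def parse_log_line(line):
    """Return a dict of fields for a recognized log line, or None otherwise."""
    for name, pattern, stamp_format in _CANDIDATES:
        match = pattern.match(line)
        if match:
            fields = match.groupdict()
            fields["format"] = name
            fields["timestamp"] = datetime.strptime(fields.pop("stamp"), stamp_format)
            fields["size"] = int(fields["size"])
            return fields
    return None
```

A collector that receives a mix of files (for example, trace files in the old format and audit files in the new one) could use the returned "format" key to normalize both into one schema.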

For backward compatibility, a CLI configuration is added for the debug, system, and security logs. You can switch the format of these log files at runtime using the following CLI commands.

% set oam eventLog typeAdmin debug cnfLogFormat <disable | enable>
% set oam eventLog typeAdmin system cnfLogFormat <disable | enable>
% set oam eventLog typeAdmin security cnfLogFormat <disable | enable>

Some points to note:

  • By default, the cnfLogFormat flag is enabled in the CNF environment and disabled in non-CNF environments.
  • You can configure the debug, system, and security logs to use the new CNF log format based on this flag.
  • The trace and pkt files always follow the old format.
  • The audit and mem files always follow the new format.
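
The rules in the list above can be summarized as a small decision function. This is an illustrative sketch only: the function names and the "old"/"new" return values are hypothetical, and the sets simply restate the log types named above.

```python
# Log types whose format follows the cnfLogFormat flag.
FLAG_CONTROLLED = {"debug", "system", "security"}
# Log types that always use one format, regardless of the flag.
ALWAYS_OLD = {"trace", "pkt"}
ALWAYS_NEW = {"audit", "mem"}


def default_cnf_log_format(is_cnf_environment):
    """The flag defaults to enabled in CNF and disabled in non-CNF environments."""
    return bool(is_cnf_environment)


def effective_log_format(log_type, cnf_log_format_enabled):
    """Return "old" or "new" for a log type, given the cnfLogFormat flag value."""
    if log_type in ALWAYS_OLD:
        return "old"
    if log_type in ALWAYS_NEW:
        return "new"
    if log_type in FLAG_CONTROLLED:
        return "new" if cnf_log_format_enabled else "old"
    raise ValueError(f"log type not covered by this sketch: {log_type}")
```

For example, effective_log_format("debug", default_cnf_log_format(True)) evaluates to "new", while the trace log stays in the old format whatever the flag is set to.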

System and Application Performance Metrics  

As part of the SBC CNe solution, system metrics (such as CPU, memory, and disk read/write) and application performance metrics (for example, the number of calls per trunk group, active calls, and attempted calls) are collected and monitored using Prometheus. The metrics are sent in the Prometheus format. In the backend (for example, a Grafana dashboard), multiple queries can display these metrics graphically, for example as time series.
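
To make the Prometheus format concrete, the sketch below parses a few lines of Prometheus text exposition and totals a gauge per label, which is what a PromQL query of the form sum by (trunkgroup) (...) would do in the backend. The metric name sbc_active_calls and the labels pod and trunkgroup are hypothetical; the actual metric names exposed by SBC CNe are not listed in this section.

```python
import re
from collections import defaultdict

# Hypothetical sample in Prometheus text exposition format.
SAMPLE = """\
# HELP sbc_active_calls Hypothetical gauge of active calls.
# TYPE sbc_active_calls gauge
sbc_active_calls{pod="sc-0",trunkgroup="TG_A"} 12
sbc_active_calls{pod="sc-1",trunkgroup="TG_A"} 8
sbc_active_calls{pod="sc-1",trunkgroup="TG_B"} 5
"""

_SAMPLE_LINE = re.compile(
    r"^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)"
)
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return (metric name, labels dict, value) tuples, skipping comment lines."""
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_LINE.match(line)
        if match:
            labels = dict(_LABEL.findall(match.group("labels") or ""))
            samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def sum_by(samples, metric, label):
    """Total the values of one metric, grouped by the value of one label."""
    totals = defaultdict(float)
    for name, labels, value in samples:
        if name == metric:
            totals[labels.get(label, "")] += value
    return dict(totals)
```

This simplified parser does not handle escaped quotes or braces inside label values; it is meant only to show the shape of the data that a dashboard query works on.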


CNe Prometheus Metrics - Call Traffic (Example)


CNe Prometheus Metrics - CPU and Memory Usage (Example)

Interval Statistics

The I-SBC/SLB/CS pods generate interval statistics files at the configured time interval. The files are shared with the OAM pod via a PVC.

  • The OAM pod aggregates the statistics files to provide consolidated statistics PM files to RAMP.
  • The OAM pod populates the DB with both the per-pod and the aggregated performance statistics, so that you can view pod-wise and aggregated interval statistics via the CLI.
  • The individual pod-level interval statistics are also streamed to the Observability backend.
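
The aggregation step performed by the OAM pod can be pictured as summing the counters reported by each pod for the same interval. The sketch below is an assumption about the general idea, not the OAM implementation: the record layout (pod, number, name, counters) and the counter names are hypothetical.

```python
from collections import defaultdict


def aggregate_intervals(per_pod_rows):
    """Sum integer counters from all pods for each (interval number, entry name)."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in per_pod_rows:
        key = (row["number"], row["name"])
        for counter, value in row["counters"].items():
            totals[key][counter] += value
    # Convert the nested defaultdicts into plain dicts for the caller.
    return {key: dict(counters) for key, counters in totals.items()}
```

The per-pod rows remain available unchanged, which matches the description above: the pod-wise view and the aggregated view are both kept.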

In a VNF environment, managed pods connected to EMS and streamed interval statistics to it individually, whereas in a CNF environment only the OAM pod is connected to RAMP, and it streams the aggregated/consolidated interval statistics from all pods to RAMP. All existing performance statistics are supported, with some of them replaced by new statistics.

New CNe equivalent commands are introduced when statistics data from multiple pods cannot be aggregated, for either of the following reasons:

  • One or more non-integer fields, such as IP address, time, or average data, are present.
  • Aggregation is possible, but the aggregated data would not add value because the statistics command is meant to reflect pod-specific data (such as memory and CPU utilization information).

For the cases mentioned above, a new key, cnfPodName, is added to the CNe equivalent commands.
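
The decision described above can be sketched as follows. This is an illustration under stated assumptions, not the product logic: the function names, the pod_specific flag, and the field names in the usage are hypothetical; only the cnfPodName key comes from the text.

```python
def is_aggregatable(fields, pod_specific=False):
    """True when every field is an integer and the data is not pod specific."""
    if pod_specific:
        return False
    return all(type(value) is int for value in fields.values())


def consolidate(rows_by_pod, pod_specific=False):
    """Return one summed row when possible, otherwise one row per pod.

    Rows that cannot be aggregated keep their values and gain a cnfPodName key.
    """
    rows = list(rows_by_pod.items())
    if rows and all(is_aggregatable(fields, pod_specific) for _, fields in rows):
        total = {}
        for _, fields in rows:
            for name, value in fields.items():
                total[name] = total.get(name, 0) + value
        return [total]
    return [{"cnfPodName": pod, **fields} for pod, fields in rows]
```

For instance, integer call counters from two pods collapse into a single row, while a row that carries an average (a non-integer value) or CPU utilization marked as pod specific is returned once per pod with its cnfPodName.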

The newly introduced statistics in CNF deployments are listed in the table below:

Existing Stats Name                       CNF Stats Name
-------------------------------------------------------------------------------------
IpAclOverallStats                         CnfIpAclOverallStats
IpAclRuleStats                            CnfIpAclRuleStats
IpGeneralGroupStats                       CnfIpGeneralGroupStats
IpPolicingAclOffendersListIntStats        CnfIpPolicingAclOffendersListIntStats
IpPolicingAggregateOffendersIntStats      CnfIpPolicingAggregateOffendersIntStats
IpPolicingArpOffendersListIntStats        CnfIpPolicingArpOffendersListIntStats
IpPolicingBadEtherIpHdrOffendersIntStats  CnfIpPolicingBadEtherIpHdrOffendersIntStats
IpPolicingDiscardRuleOffendersIntStats    CnfIpPolicingDiscardRuleOffendersIntStats
IpPolicingIpSecDecryptOffendersIntStats   CnfIpPolicingIpSecDecryptOffendersIntStats
IpPolicingMediaOffendersIntStats          CnfIpPolicingMediaOffendersIntStats
IpPolicingRogueMediaIntStats              CnfIpPolicingRogueMediaIntStats
IpPolicingSrtpDecryptOffendersIntStats    CnfIpPolicingSrtpDecryptOffendersIntStats
IpPolicingSystemIntStats                  CnfIpPolicingSystemIntStats
IpPolicinguFlowOffendersListIntStats      CnfIpPolicinguFlowOffendersListIntStats
LinkDetectionGroupStats                   CnfLinkDetectionGroupStats
SysCpuUtilIntStatsSts                     CnfSysCpuUtilIntStatsSts
SysMemoryUtilIntStatsSts                  CnfSysMemoryUtilIntStatsSts
SystemCongestionIntervalStats             CnfSystemCongestionIntervalStats
TcpGeneralGroupStats                      CnfTcpGeneralGroupStats
DiamNodeRfIntervalStatistics              CnfDiamNodeRfIntervalStatistics
EthernetPortMgmtStats                     CnfEthernetPortMgmtStats
EthernetPortPacketStats                   CnfEthernetPortPacketStats
IcmpGeneralGroupStats                     CnfIcmpGeneralGroupStats
sipOcsCallIntervalStatistics              cnfSipOcsCallIntervalStatistics

The following CLI view shows one of the CNe equivalent commands, cnfSipOcsCallIntervalStatistics, which displays pod-level data:

admin@vsbc1> show status service SC podName ALL addressContext default zone PR_ZONE_INGRESS cnfSipOcsCallIntervalStatistics 24          

Possible completions:
  displaylevel                  - Depth to show
  prupgrade-sc-8695fcdc64-jd8xp - This object indicates the PodName.
  prupgrade-sc-8695fcdc64-mh42j - This object indicates the PodName.
  prupgrade-sc-8695fcdc64-mshm7 - This object indicates the PodName.
Possible match completions:
  attemptedCalls   - Current Attempted ocs Call statistics.
  establishedCalls - Current Established ocs Call statistics.
  failedCalls      - Current Failed ocs Call statistics.
  intervalValid    - The member indicating the validity of the interval.
  pendingCalls     - Current Pending ocs Call statistics.
  rejectedCalls    - Current SBX Rejected ocs Call statistics.
  relayedCalls     - Current Realyed ocs Invite to Engress side statistics.
  successfulCalls  - Current Successful ocs Call statistics.
  time             - The system up time when the interval statisitic is collected.

admin@vsbc1> show status service SC podName ALL addressContext default zone PR_ZONE_INGRESS cnfSipOcsCallIntervalStatistics 24 prupgrade-sc-8695fcdc64-jd8xp
cnfSipOcsCallIntervalStatistics 24 prupgrade-sc-8695fcdc64-jd8xp PR_INGRESS_TG {
    intervalValid    true;
    time             1683091200;
    attemptedCalls   0;
    relayedCalls     0;
    establishedCalls 0;
    successfulCalls  0;
    failedCalls      0;
    pendingCalls     0;
    rejectedCalls    0;
}
cnfSipOcsCallIntervalStatistics 24 prupgrade-sc-8695fcdc64-jd8xp PR_INGRESS_TG1 {
    intervalValid    true;
    time             1683091200;
    attemptedCalls   0;
    relayedCalls     0;
    establishedCalls 0;
    successfulCalls  0;
    failedCalls      0;
    pendingCalls     0;
    rejectedCalls    0;
}

The following CLI example shows the aggregated callCountIntervalStatistics:

admin@vsbc1> show table service SC podName ALL global callCountIntervalStatistics

                                                            ENHANCED  AMRNB  AMRWB  EVRC   NICE   MRF       SIP                      EV
               INTERVAL              CALL   ENCRYPT  SRTP   VIDEO     LEG    LEG    LEG    REC    SESSIONS  REC    TRANSCODE  PDCS   LE
NUMBER  NAME   VALID     TIME        COUNT  COUNT    COUNT  COUNT     COUNT  COUNT  COUNT  COUNT  COUNT     COUNT  COUNT      COUNT  CO
---------------------------------------------------------------------------------------------------------------------------------------
55      entry  true      1683100500  0      0        0      0         0      0      0      0      0         0      0          0      0
56      entry  true      1683100800  0      0        0      0         0      0      0      0      0         0      0          0      0
57      entry  true      1683101100  0      0        0      0         0      0      0      0      0         0      0          0      0
58      entry  true      1683101400  0      0        0      0         0      0      0      0      0         0      0          0      0

The data view from RAMP for some of the interval statistics is depicted below.

Example 1: CnfSysCpuUtilIntStatsSts streamed by OAM, where the pod-level view is retained because data aggregation is not possible.

Example 2: CallCountIntervalStats streamed by OAM, where data from all pods is aggregated.

Current Statistics

In the CNe environment, at the service level, the statistics from all the SC pods are aggregated at OAM and presented to the CLI.

At the service level, the cluster-wide CurrentStatistics appear as follows:

admin@vsbc1> show table service SC podName ALL global callCountCurrentStatistics

                              ENHANCED  AMRNB  AMRWB  EVRC   NICE   MRF       SIP                      EVS    SILK            SLB
       CALL   ENCRYPT  SRTP   VIDEO     LEG    LEG    LEG    REC    SESSIONS  REC    TRANSCODE  PDCS   LEG    LEG    LICENSE  SESSIONS
NAME   COUNT  COUNT    COUNT  COUNT     COUNT  COUNT  COUNT  COUNT  COUNT     COUNT  COUNT      COUNT  COUNT  COUNT  MODE     COUNT
---------------------------------------------------------------------------------------------------------------------------------------
entry  36129  0        0      0         0      0      0      0      0         0      0          0      0      0      domain   36129
[ok][2023-05-03 16:35:07]
admin@vsbc1>

Pod-level view of current statistics:

admin@vsbc1> show table service SC podName npbasedtones-sc-86647c7d94-djg2s global callCountCurrentStatistics

                              ENHANCED  AMRNB  AMRWB  EVRC   NICE   MRF       SIP                      EVS    SILK            SLB
       CALL   ENCRYPT  SRTP   VIDEO     LEG    LEG    LEG    REC    SESSIONS  REC    TRANSCODE  PDCS   LEG    LEG    LICENSE  SESSIONS
NAME   COUNT  COUNT    COUNT  COUNT     COUNT  COUNT  COUNT  COUNT  COUNT     COUNT  COUNT      COUNT  COUNT  COUNT  MODE     COUNT
---------------------------------------------------------------------------------------------------------------------------------------
entry  12173  0        0      0         0      0      0      0      0         0      0          0      0      0      domain   12173
[ok][2023-05-03 16:36:18]
admin@vsbc1>

Status Commands

At the service level, a new key is added to all status commands to prevent data aggregation, because status commands are meant to reflect pod-specific data. All root-level commands remain the same.

admin@vsbc1> show table service SC podName ALL addressContext default zone MR_ZONE_INGRESS trunkGroupQoeStatus
                                                                   INBOUND                                   OUTBOUND
                                                                   RFACTOR    INBOUND                        RFACTOR    OUTBOUND
                                                          INBOUND  NUM        RFACTOR              OUTBOUND  NUM        RFACTOR
                                                          RFACTOR  CRITICAL   NUM MAJOR            RFACTOR   CRITICAL   NUM MAJOR
                                                 INBOUND  FROM     THRESHOLD  THRESHOLD  OUTBOUND  FROM      THRESHOLD  THRESHOLD  CURR
POD NAME                          NAME           RFACTOR  SBXBOOT  BREACHED   BREACHED   RFACTOR   SBXBOOT   BREACHED   BREACHED   ASR
---------------------------------------------------------------------------------------------------------------------------------------
npbasedtones-sc-86647c7d94-djg2s  MR_TG_INGRESS  94       94       0          0          94        94        0          0          90
npbasedtones-sc-86647c7d94-plxp6  MR_TG_INGRESS  94       94       0          0          94        94        0          0          90
npbasedtones-sc-86647c7d94-xf625  MR_TG_INGRESS  94       94       0          0          94        94        0          0          90
[ok][2023-05-03 17:02:06]
admin@vsbc1>

Action Commands

At the service level, podname is added to the action command response to differentiate the responses from multiple pods.

Action commands at the local level (such as OAM, system, and addressContext) remain the same.

admin@vsbc1> request service SC podName ALL oam eventLog typeAdmin debug rolloverLogNow
response {
    podname npbasedtones-sc-86647c7d94-plxp6
    result success
    reason
}
response {
    podname npbasedtones-sc-86647c7d94-xf625
    result success
    reason
}
response {
    podname npbasedtones-sc-86647c7d94-djg2s
    result success
    reason
}
[ok][2023-05-03 17:37:52]
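
When an action command is sent to all pods, a script that captures the CLI output may need to confirm that every pod reported success. The sketch below parses response blocks shaped like the ones shown above; it is an assumption about how such text could be post-processed, and the function names are hypothetical.

```python
def parse_responses(text):
    """Parse "response { ... }" blocks into a list of dicts, one per pod."""
    responses = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "response {":
            current = {}
        elif line == "}" and current is not None:
            responses.append(current)
            current = None
        elif current is not None and line:
            # Each line is "key value"; a key without a value (such as an
            # empty reason) is stored with an empty string.
            key, _, value = line.partition(" ")
            current[key] = value.strip()
    return responses


def failed_pods(responses):
    """Return the pod names whose result is anything other than success."""
    return [r.get("podname", "") for r in responses if r.get("result") != "success"]
```

An empty list from failed_pods means every pod that answered reported success; it does not detect pods that returned no response at all.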

All CLI commands for the unsupported features listed in the Unsupported functionalities in SBC CNe solution section are hidden in CNF deployments.