Ribbon uses OpenTelemetry to implement the Ribbon Observability Stack, which supports selected components from the OpenTelemetry community. The aim is for all CNF-based Ribbon products to stream their observability data through this stack. With its introduction, all logs and metrics (and, in the future, traces) are normalized and pre-processed, which helps correlate the data points captured from each microservice before they are delivered to the configured backends.
Logging
The SBC CNe logs from all microservices/containers are streamed to the Observability backend in a uniform format. Examples of Observability backends for centralized logging include EFK and Kafka. In CNF deployments, a new format is introduced for some of the logs, as explained below.
Old Format
Format: Size, filter flag, month, day, year, hour, minute, second, tenths of seconds, shelf, slot, instance, sequence number, level, subsystem, trace type, trace name, event text. For example: "206 05022023 131215.414913:1.01.00.00004.MAJOR .SM: *HwModuleServer::procNodeCeProductCapability: serverName: vsbc1, capName: SPS100, capValue: not present, checkType: Minimum Required ActualCeName vsbc1"
New Format
Format: year, month, day, hour, minute, second, tenths of seconds, time zone, level, Size, shelf, slot, instance, sequence number, trace type, subsystem, trace name, event text. For example: "2023-05-02 11:52:12,367076 UTC MAJOR 132 1.01.00.00001 .SM: *ConfigManager::configUpdater: ALARM CLEARED - Able to process config updates"
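As an illustration of the new format, the sketch below parses a new-format line into named fields. It is derived only from the field list and the single example above, so treat it as an assumption rather than a reference parser: the dotted group (1.01.00.00001) is read as shelf, slot, instance, and sequence number, and everything after it is kept as one message string because the boundaries between trace type, subsystem, and trace name are not spelled out here. The names NEW_FORMAT and parse_new_format are hypothetical.

```python
import re
from datetime import datetime

# Pattern inferred from the documented field order and the single example line.
# Real logs may contain variants that this sketch does not cover.
NEW_FORMAT = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{1,6})\s+"
    r"(?P<timezone>\S+)\s+"
    r"(?P<level>\S+)\s+"
    r"(?P<size>\d+)\s+"
    r"(?P<shelf>\d+)\.(?P<slot>\d+)\.(?P<instance>\d+)\.(?P<sequence>\d+)\s+"
    r"(?P<message>.*)$"
)


def parse_new_format(line):
    """Return a dict of fields for a new-format line, or None if it does not match."""
    match = NEW_FORMAT.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # The fractional part in the example has six digits, which %f reads as microseconds.
    fields["timestamp"] = datetime.strptime(fields["timestamp"], "%Y-%m-%d %H:%M:%S,%f")
    fields["size"] = int(fields["size"])
    return fields


sample = (
    "2023-05-02 11:52:12,367076 UTC MAJOR 132 1.01.00.00001 .SM: "
    "*ConfigManager::configUpdater: ALARM CLEARED - Able to process config updates"
)
record = parse_new_format(sample)
print(record["level"], record["sequence"], record["message"])
```

A line in the old format does not start with a date, so parse_new_format returns None for it, which gives a simple way to tell the two formats apart in a mixed log stream.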
For backward compatibility of the logging format, a CLI configuration is added for the debug, system, and security logs. You can switch the format of these log files at runtime using the following CLI commands.
% set oam eventLog typeAdmin debug cnfLogFormat <disable | enable>
% set oam eventLog typeAdmin system cnfLogFormat <disable | enable>
% set oam eventLog typeAdmin security cnfLogFormat <disable | enable>
Some points to note:
- By default, the cnfLogFormat flag is enabled in the CNF environment and disabled in non-CNF environments.
- The debug, system, and security logs use the new CNF log format when this flag is enabled, and the old format when it is disabled.
- The trace and pkt files always follow the old format.
- The audit and mem files always follow the new format.
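The rules above can be summarized as a small decision function. This is only a sketch of the documented behavior, not product code: the function name effective_log_format and its parameters are hypothetical, and a cnf_log_format value of None stands for "flag not explicitly configured".

```python
# Log types whose format is fixed, and those controlled by cnfLogFormat,
# as listed in the notes above.
ALWAYS_OLD = {"trace", "pkt"}
ALWAYS_NEW = {"audit", "mem"}
SWITCHABLE = {"debug", "system", "security"}


def effective_log_format(log_type, cnf_environment, cnf_log_format=None):
    """Return "old" or "new" for the given log type.

    cnf_environment: True in a CNF deployment, False otherwise.
    cnf_log_format:  "enable", "disable", or None to use the environment default.
    """
    if log_type in ALWAYS_OLD:
        return "old"
    if log_type in ALWAYS_NEW:
        return "new"
    if log_type not in SWITCHABLE:
        raise ValueError(f"unknown log type: {log_type}")
    if cnf_log_format is None:
        # Default: enabled in CNF environments, disabled elsewhere.
        enabled = cnf_environment
    else:
        enabled = cnf_log_format == "enable"
    return "new" if enabled else "old"


print(effective_log_format("debug", cnf_environment=True))
print(effective_log_format("debug", cnf_environment=True, cnf_log_format="disable"))
print(effective_log_format("trace", cnf_environment=True, cnf_log_format="enable"))
```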
System and Application Performance Metrics
As part of the SBC CNe solution, the system metrics (such as CPU, memory, and disk read/write) and application performance metrics (for example, the number of calls per trunk group, active calls, and attempted calls) are collected and monitored using Prometheus. The metrics are sent in the Prometheus format. In the backend (for example, a Grafana dashboard), you can define multiple queries to display these metrics graphically, for example as time series.
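To give a feel for what "Prometheus format" means in practice, the sketch below reads a few lines of Prometheus text exposition output and sums one metric per trunk group across pods. The metric name sbc_active_calls and the labels pod and trunk_group are hypothetical, chosen only for illustration; the actual metric and label names exposed by the SBC CNe are not listed in this section. The parser handles only simple sample lines and is not a full implementation of the exposition format.

```python
import re
from collections import defaultdict

# One sample line of the text exposition format: name{label="value",...} value
SAMPLE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def sum_by_label(exposition_text, metric, label):
    """Sum the samples of one metric, grouped by the value of one label."""
    totals = defaultdict(float)
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank, HELP, and TYPE lines
            continue
        match = SAMPLE.match(line)
        if match is None or match.group("name") != metric:
            continue
        labels = dict(LABEL.findall(match.group("labels") or ""))
        totals[labels.get(label, "")] += float(match.group("value"))
    return dict(totals)


# Hypothetical scrape output; metric and label names are illustrative only.
scrape = """\
# TYPE sbc_active_calls gauge
sbc_active_calls{pod="sc-0",trunk_group="TG_A"} 12
sbc_active_calls{pod="sc-1",trunk_group="TG_A"} 5
sbc_active_calls{pod="sc-1",trunk_group="TG_B"} 3
"""
active_calls = sum_by_label(scrape, "sbc_active_calls", "trunk_group")
print(active_calls)
```

In a real deployment this grouping would normally be done by a query in the backend rather than in client code; the sketch only shows the shape of the data.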
CNe Prometheus Metrics - Call Traffic (Example)
CNe Prometheus Metrics - CPU and Memory Usage (Example)
Interval Statistics
The interval statistics files are generated by the I-SBC/SLB/CS pods at the configured time interval. The files are shared with the OAM pod via a PVC.
- The OAM pod aggregates the statistics files to provide consolidated statistics pm files to RAMP.
- The OAM pod populates the DB with the per-pod performance statistics and the aggregated statistics. This makes both the pod-wise and the aggregated interval statistics visible via the CLI.
- The individual pod-level interval statistics are also streamed to the Observability backend.
In a VNF environment, managed pods connect to EMS and stream interval statistics to it individually, whereas in a CNF environment only OAM is connected to RAMP, and OAM streams the aggregated/consolidated interval statistics from all pods to RAMP. All existing performance statistics are supported, with some of them replaced by new statistics.
New CNe equivalent commands are introduced when statistics data from multiple pods cannot be aggregated, for either of the reasons listed below. A new key, cnfPodName, has been added to the CNe equivalent commands.
- One or more non-integer fields, such as IP address, time, or average data, are present.
- Aggregation is possible, but the aggregated data would not add value because the statistics command is meant to reflect pod-specific data (such as memory and CPU utilization information).
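The two conditions above can be sketched as follows. This is a simplified illustration, not the OAM implementation: the function names is_aggregatable and consolidate are hypothetical, the counter values are invented, and records are assumed to hold only the fields being combined (a real interval record also carries fields such as the interval validity and time).

```python
def is_aggregatable(per_pod_stats, pod_specific=False):
    """True when the statistic is not pod specific and every field is an integer."""
    if pod_specific:
        return False
    return all(
        isinstance(value, int) and not isinstance(value, bool)
        for record in per_pod_stats.values()
        for value in record.values()
    )


def consolidate(per_pod_stats, pod_specific=False):
    """Sum integer counters across pods, or keep one row per pod keyed by cnfPodName."""
    if is_aggregatable(per_pod_stats, pod_specific):
        totals = {}
        for record in per_pod_stats.values():
            for field, value in record.items():
                totals[field] = totals.get(field, 0) + value
        return totals
    # Not aggregatable: retain the pod-level view, adding the cnfPodName key.
    return [{"cnfPodName": pod, **record} for pod, record in sorted(per_pod_stats.items())]


# Illustrative values only.
call_counts = {
    "prupgrade-sc-8695fcdc64-jd8xp": {"attemptedCalls": 10, "establishedCalls": 8},
    "prupgrade-sc-8695fcdc64-mh42j": {"attemptedCalls": 4, "establishedCalls": 4},
}
cpu_util = {
    "prupgrade-sc-8695fcdc64-jd8xp": {"average": 12.5},
    "prupgrade-sc-8695fcdc64-mh42j": {"average": 40.0},
}
print(consolidate(call_counts))
print(consolidate(cpu_util))
```

Integer counters collapse into a single summed record, while the record containing an average stays as one row per pod.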
For the cases mentioned above, the newly introduced statistics in a CNF deployment are listed in the table below:
Existing Stats Name | CNF Stats Name |
---|---|
The CLI view of one of the CNe equivalent commands, cnfSipOcsCallIntervalStatistics, where pod-level data is displayed:
admin@vsbc1> show status service SC podName ALL addressContext default zone PR_ZONE_INGRESS cnfSipOcsCallIntervalStatistics 24
Possible completions:
  displaylevel                  - Depth to show
  prupgrade-sc-8695fcdc64-jd8xp - This object indicates the PodName.
  prupgrade-sc-8695fcdc64-mh42j - This object indicates the PodName.
  prupgrade-sc-8695fcdc64-mshm7 - This object indicates the PodName.
Possible match completions:
  attemptedCalls   - Current Attempted ocs Call statistics.
  establishedCalls - Current Established ocs Call statistics.
  failedCalls      - Current Failed ocs Call statistics.
  intervalValid    - The member indicating the validity of the interval.
  pendingCalls     - Current Pending ocs Call statistics.
  rejectedCalls    - Current SBX Rejected ocs Call statistics.
  relayedCalls     - Current Realyed ocs Invite to Engress side statistics.
  successfulCalls  - Current Successful ocs Call statistics.
  time             - The system up time when the interval statisitic is collected.
admin@vsbc1> show status service SC podName ALL addressContext default zone PR_ZONE_INGRESS cnfSipOcsCallIntervalStatistics 24 prupgrade-sc-8695fcdc64-jd8xp
cnfSipOcsCallIntervalStatistics 24 prupgrade-sc-8695fcdc64-jd8xp PR_INGRESS_TG {
    intervalValid    true;
    time             1683091200;
    attemptedCalls   0;
    relayedCalls     0;
    establishedCalls 0;
    successfulCalls  0;
    failedCalls      0;
    pendingCalls     0;
    rejectedCalls    0;
}
cnfSipOcsCallIntervalStatistics 24 prupgrade-sc-8695fcdc64-jd8xp PR_INGRESS_TG1 {
    intervalValid    true;
    time             1683091200;
    attemptedCalls   0;
    relayedCalls     0;
    establishedCalls 0;
    successfulCalls  0;
    failedCalls      0;
    pendingCalls     0;
    rejectedCalls    0;
}
CLI example that shows aggregated information for callCountIntervalStatistics:
admin@vsbc1> show table service SC podName ALL global callCountIntervalStatistics
                                                            ENHANCED  AMRNB  AMRWB  EVRC   NICE   MRF       SIP                      EV
               INTERVAL              CALL   ENCRYPT  SRTP   VIDEO     LEG    LEG    LEG    REC    SESSIONS  REC    TRANSCODE  PDCS   LE
NUMBER  NAME   VALID     TIME        COUNT  COUNT    COUNT  COUNT     COUNT  COUNT  COUNT  COUNT  COUNT     COUNT  COUNT      COUNT  CO
---------------------------------------------------------------------------------------------------------------------------------------
55      entry  true      1683100500  0      0        0      0         0      0      0      0      0         0      0          0      0
56      entry  true      1683100800  0      0        0      0         0      0      0      0      0         0      0          0      0
57      entry  true      1683101100  0      0        0      0         0      0      0      0      0         0      0          0      0
58      entry  true      1683101400  0      0        0      0         0      0      0      0      0         0      0          0      0
The data view from RAMP for some of the interval statistics is depicted below.
Example 1: CnfSysCpuUtilIntStatsSts streamed by OAM, where pod level view is retained as data aggregation is not possible.
Example 2: CallCountIntervalStats streamed by OAM where data from all pods are aggregated.
Current Statistics
In a CNe environment, the statistics from all the SC pods are aggregated at OAM under the service level and presented in the CLI.
Under the service level, the cluster-wide current statistics look like this:
admin@vsbc1> show table service SC podName ALL global callCountCurrentStatistics
                              ENHANCED  AMRNB  AMRWB  EVRC   NICE   MRF       SIP                      EVS    SILK            SLB
       CALL   ENCRYPT  SRTP   VIDEO     LEG    LEG    LEG    REC    SESSIONS  REC    TRANSCODE  PDCS   LEG    LEG    LICENSE  SESSIONS
NAME   COUNT  COUNT    COUNT  COUNT     COUNT  COUNT  COUNT  COUNT  COUNT     COUNT  COUNT      COUNT  COUNT  COUNT  MODE     COUNT
---------------------------------------------------------------------------------------------------------------------------------------
entry  36129  0        0      0         0      0      0      0      0         0      0          0      0      0      domain   36129
[ok][2023-05-03 16:35:07]
admin@vsbc1>
Pod-level view of current statistics:
admin@vsbc1> show table service SC podName npbasedtones-sc-86647c7d94-djg2s global callCountCurrentStatistics
                              ENHANCED  AMRNB  AMRWB  EVRC   NICE   MRF       SIP                      EVS    SILK            SLB
       CALL   ENCRYPT  SRTP   VIDEO     LEG    LEG    LEG    REC    SESSIONS  REC    TRANSCODE  PDCS   LEG    LEG    LICENSE  SESSIONS
NAME   COUNT  COUNT    COUNT  COUNT     COUNT  COUNT  COUNT  COUNT  COUNT     COUNT  COUNT      COUNT  COUNT  COUNT  MODE     COUNT
---------------------------------------------------------------------------------------------------------------------------------------
entry  12173  0        0      0         0      0      0      0      0         0      0          0      0      0      domain   12173
[ok][2023-05-03 16:36:18]
admin@vsbc1>
Status Commands
Under the service level, a new key has been added to all status commands to prevent data aggregation, because status commands are meant to reflect pod-specific data. All root-level commands remain the same.
admin@vsbc1> show table service SC podName ALL addressContext default zone MR_ZONE_INGRESS trunkGroupQoeStatus
                                                                   INBOUND                                   OUTBOUND
                                                                   RFACTOR    INBOUND                        RFACTOR    OUTBOUND
                                                          INBOUND  NUM        RFACTOR              OUTBOUND  NUM        RFACTOR
                                                          RFACTOR  CRITICAL   NUM MAJOR            RFACTOR   CRITICAL   NUM MAJOR
                                                 INBOUND  FROM     THRESHOLD  THRESHOLD  OUTBOUND  FROM      THRESHOLD  THRESHOLD  CURR
POD NAME                          NAME           RFACTOR  SBXBOOT  BREACHED   BREACHED   RFACTOR   SBXBOOT   BREACHED   BREACHED   ASR
---------------------------------------------------------------------------------------------------------------------------------------
npbasedtones-sc-86647c7d94-djg2s  MR_TG_INGRESS  94       94       0          0          94        94        0          0          90
npbasedtones-sc-86647c7d94-plxp6  MR_TG_INGRESS  94       94       0          0          94        94        0          0          90
npbasedtones-sc-86647c7d94-xf625  MR_TG_INGRESS  94       94       0          0          94        94        0          0          90
[ok][2023-05-03 17:02:06]
admin@vsbc1>
Action Commands
Under the service level, podname has been added to the action command response to differentiate responses from multiple pods.
Action commands at the local level (as in OAM, system, addressContext, and so on) remain the same.
admin@vsbc1> request service SC podName ALL oam eventLog typeAdmin debug rolloverLogNow
response {
    podname npbasedtones-sc-86647c7d94-plxp6
    result  success
    reason
}
response {
    podname npbasedtones-sc-86647c7d94-xf625
    result  success
    reason
}
response {
    podname npbasedtones-sc-86647c7d94-djg2s
    result  success
    reason
}
[ok][2023-05-03 17:37:52]
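When such output is captured by a script, the per-pod responses can be separated with a small parser. The sketch below is based only on the sample output above; the function name parse_action_responses and the regular expression are hypothetical, and other commands or releases may format their responses differently.

```python
import re

# Matches one "response { podname ... result ... reason ... }" block.
# Whitespace-insensitive, so it works on wrapped or single-line output.
RESPONSE_BLOCK = re.compile(
    r"response\s*\{\s*"
    r"podname\s+(?P<podname>\S+)\s+"
    r"result\s+(?P<result>\S+)\s+"
    r"reason\s*(?P<reason>[^}]*?)\s*\}"
)


def parse_action_responses(output):
    """Map each pod name to a (result, reason) tuple."""
    return {
        match.group("podname"): (match.group("result"), match.group("reason"))
        for match in RESPONSE_BLOCK.finditer(output)
    }


captured = """
response { podname npbasedtones-sc-86647c7d94-plxp6 result success reason }
response { podname npbasedtones-sc-86647c7d94-xf625 result success reason }
response { podname npbasedtones-sc-86647c7d94-djg2s result success reason }
[ok][2023-05-03 17:37:52]
"""
responses = parse_action_responses(captured)
failed_pods = [pod for pod, (result, _) in responses.items() if result != "success"]
print(len(responses), failed_pods)
```

Keying the results by pod name makes it straightforward to detect a pod that did not answer or that reported a result other than success.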
All CLI commands for the unsupported features mentioned in the "Unsupported functionalities in SBC CNe solution" section are hidden in a CNF deployment.