Overview
All CNF applications produce logs and metrics that are essential to understanding their state and health. Because pods are typically ephemeral, this data must be sent to a centralized observability backend. The Ribbon CNFs support EFK and Kafka as observability backends for centralized logging, and Prometheus as the metrics backend. Interaction with the backends is typically through an integrated telemetry agent.
Logging
The CNF framework logs of all the microservices/containers are streamed directly to an Elastic backend when Elastic is configured, and streamed using RAMP when Kafka is configured. When a pod restarts, crashes, or is evicted, any logs stored locally in an ephemeral file system are lost. The Observability framework streams the logs to the backend to ensure they remain available and are easily searchable.
The new CNF logging format adheres to the Elastic Common Schema, which allows you to interoperate with storage backends that handle structured data. Below is an example of the new format, with the old format shown for comparison. Additional details are provided upon request.
Old Format
Format: Size, filter flag, month, day, year, hour, minute, second, tenths of seconds, shelf, slot, instance, sequence number, level, subsystem, trace type, trace name, event text.
Example: "206 05022023 131215.414913:1.01.00.00004.MAJOR .SM: *HwModuleServer::procNodeCeProductCapability: serverName: vsbc1, capName: SPS100, capValue: not present, checkType: Minimum Required ActualCeName vsbc1"
New Format
Format: year, month, day, hour, minute, second, tenths of seconds, time zone, level, Size, shelf, slot, instance, sequence number, trace type, subsystem, trace name, event text.
Example: "2023-05-02 11:52:12,367076 UTC MAJOR 132 1.01.00.00001 .SM: *ConfigManager::configUpdater: ALARM CLEARED - Able to process config updates"
Configuring Logs in Helm
Supported logging mechanisms include:
The following configuration example in the Helm chart configures the Observability backend server for logging.
SBC CNe Logging Backwards Compatibility
For backward compatibility of the SBC CNe debug, system, and security log formats, use the following Event Log CLI commands. You may use these commands to switch the format of these log files at runtime.
set oam eventLog typeAdmin debug cnfLogFormat <disable | enable>
set oam eventLog typeAdmin system cnfLogFormat <disable | enable>
set oam eventLog typeAdmin security cnfLogFormat <disable | enable>
The cnfLogFormat flag is enabled in the CNF environment and disabled in non-CNF environments.
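For readers who post-process log files, the new format described in the Logging section can be split into fields with a regular expression. The following is a minimal sketch, not a Ribbon tool: the pattern is inferred from the single new-format example in this section, and the function name parse_new_format and the field names are assumptions made for illustration.

```python
import re
from datetime import datetime

# Pattern inferred from the one new-format example in this document.
# It is an assumption, not an official grammar of the log format.
NEW_FORMAT = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}),(?P<fraction>\d{1,6}) "
    r"(?P<tz>\S+) (?P<level>\S+) (?P<size>\d+) "
    r"(?P<shelf>\d+)\.(?P<slot>\d+)\.(?P<instance>\d+)\.(?P<sequence>\d+) "
    r"(?P<trace_type>[^.\s]*)\.(?P<subsystem>\w+): "
    r"(?P<text>.*)$"
)


def parse_new_format(line):
    """Return a dict of fields for one new-format line, or None if it does not match."""
    m = NEW_FORMAT.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    # Combine date, time, and fractional seconds into one datetime value.
    rec["timestamp"] = datetime.strptime(
        f"{rec.pop('date')} {rec.pop('time')}.{rec.pop('fraction')}",
        "%Y-%m-%d %H:%M:%S.%f",
    )
    rec["size"] = int(rec["size"])
    return rec


# The new-format example from this section, without the surrounding quotes.
EXAMPLE = (
    "2023-05-02 11:52:12,367076 UTC MAJOR 132 1.01.00.00001 .SM: "
    "*ConfigManager::configUpdater: ALARM CLEARED - Able to process config updates"
)

record = parse_new_format(EXAMPLE)
print(record["level"], record["subsystem"], record["timestamp"].isoformat())
```

The trace name and event text are left together in the text field, because the example alone does not show how they are delimited in every case.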
Ribbon CNF Metrics
As part of the Ribbon CNF solution, the system metrics (e.g., CPU, memory, disk read/write) and application performance metrics (e.g., number of calls per trunk group, active calls, attempted calls) are collected and monitored using Prometheus. The metrics are sent in the Prometheus format. Once the metrics are stored in a time-series database such as Prometheus, you can query them from the Observability backend (e.g., a Grafana dashboard) and display them in a graphical format. Ribbon provides Grafana dashboard templates with the solution to display the key metrics.
Metrics-based Alerts
You may optionally install the provided metrics-based alerting rules, written in the Prometheus query language (PromQL), directly through the Helm chart if you are using the prometheus-operator.
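A metrics-based alert compares sample values against a threshold. The sketch below illustrates the idea by parsing a few lines of Prometheus text-format output and listing the samples that exceed a limit. It is only an illustration: in a real deployment the rule is a PromQL expression evaluated by Prometheus itself, and the metric name sbc_active_calls, the label pod, and the threshold are hypothetical values that do not come from the Ribbon metric set.

```python
import re

# One sample line: metric name, optional {labels}, then a numeric value.
# Simplified: label values containing '}' or escaped quotes are not handled.
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_exposition(text):
    """Parse Prometheus text-format lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = SAMPLE_LINE.match(line)
        if m is None:
            continue
        labels = dict(LABEL.findall(m.group("labels") or ""))
        samples.append((m.group("name"), labels, float(m.group("value"))))
    return samples


def breaching(samples, metric, threshold):
    """Return the label sets of samples of `metric` whose value exceeds `threshold`."""
    return [labels for name, labels, value in samples
            if name == metric and value > threshold]


# Hypothetical scrape output; metric and label names are invented for this sketch.
SCRAPE = """\
# HELP sbc_active_calls Active calls per pod (hypothetical metric).
# TYPE sbc_active_calls gauge
sbc_active_calls{pod="sc-0"} 120
sbc_active_calls{pod="sc-1"} 980
sbc_active_calls{pod="sc-2"} 300
"""

print(breaching(parse_exposition(SCRAPE), "sbc_active_calls", 900))
```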
The interval statistics files are generated by the isbc/slb/cs pods at the configured time interval. The files are shared with the OAM Pod via PVC.
In a VNF environment, each managed pod connected to an EMS and streamed its interval statistics to the EMS individually. In a CNF environment, only the OAM Pod is connected to the RAMP, and it streams the aggregated/consolidated interval statistics from all pods to the RAMP. All existing performance statistics are supported, with some of them replaced by new statistics.
New CNF-equivalent commands are introduced if the SBC CNe cannot aggregate statistics data from multiple pods for the following reasons. A new key, cnfPodName, has been added to the CNF-equivalent commands.
Existing versus New CNF Stats:
Existing StatsName | CNF StatsName |
---|---|
IpAclOverallStats | CnfIpAclOverallStats |
IpAclRuleStats | CnfIpAclRuleStats |
IpGeneralGroupStats | CnfIpGeneralGroupStats |
IpPolicingAclOffendersListIntStats | CnfIpPolicingAclOffendersListIntStats |
IpPolicingAggregateOffendersIntStats | CnfIpPolicingAggregateOffendersIntStats |
IpPolicingArpOffendersListIntStats | CnfIpPolicingArpOffendersListIntStats |
IpPolicingBadEtherIpHdrOffendersIntStats | CnfIpPolicingBadEtherIpHdrOffendersIntStats |
IpPolicingDiscardRuleOffendersIntStats | CnfIpPolicingDiscardRuleOffendersIntStats |
IpPolicingIpSecDecryptOffendersIntStats | CnfIpPolicingIpSecDecryptOffendersIntStats |
IpPolicingMediaOffendersIntStats | CnfIpPolicingMediaOffendersIntStats |
IpPolicingRogueMediaIntStats | CnfIpPolicingRogueMediaIntStats |
IpPolicingSrtpDecryptOffendersIntStats | CnfIpPolicingSrtpDecryptOffendersIntStats |
IpPolicingSystemIntStats | CnfIpPolicingSystemIntStats |
IpPolicinguFlowOffendersListIntStats | CnfIpPolicinguFlowOffendersListIntStats |
LinkDetectionGroupStats | CnfLinkDetectionGroupStats |
SysCpuUtilIntStatsSts | CnfSysCpuUtilIntStatsSts |
SysMemoryUtilIntStatsSts | CnfSysMemoryUtilIntStatsSts |
SystemCongestionIntervalStats | CnfSystemCongestionIntervalStats |
TcpGeneralGroupStats | CnfTcpGeneralGroupStats |
DiamNodeRfIntervalStatistics | CnfDiamNodeRfIntervalStatistics |
EthernetPortMgmtStats | CnfEthernetPortMgmtStats |
EthernetPortPacketStats | CnfEthernetPortPacketStats |
IcmpGeneralGroupStats | CnfIcmpGeneralGroupStats |
sipOcsCallIntervalStatistics | cnfSipOcsCallIntervalStatistics |
Example CNF-equivalent CLI commands to display pod-level data using cnfSipOcsCallIntervalStatistics:
admin@vsbc1> show status service SC podName ALL addressContext default zone PR_ZONE_INGRESS cnfSipOcsCallIntervalStatistics 24
Possible completions:
  displaylevel                  - Depth to show
  prupgrade-sc-8695fcdc64-jd8xp - This object indicates the PodName.
  prupgrade-sc-8695fcdc64-mh42j - This object indicates the PodName.
  prupgrade-sc-8695fcdc64-mshm7 - This object indicates the PodName.
Possible match completions:
  attemptedCalls   - Current Attempted ocs Call statistics.
  establishedCalls - Current Established ocs Call statistics.
  failedCalls      - Current Failed ocs Call statistics.
  intervalValid    - The member indicating the validity of the interval.
  pendingCalls     - Current Pending ocs Call statistics.
  rejectedCalls    - Current SBX Rejected ocs Call statistics.
  relayedCalls     - Current Realyed ocs Invite to Engress side statistics.
  successfulCalls  - Current Successful ocs Call statistics.
  time             - The system up time when the interval statisitic is collected.
admin@vsbc1> show status service SC podName ALL addressContext default zone PR_ZONE_INGRESS cnfSipOcsCallIntervalStatistics 24 prupgrade-sc-8695fcdc64-jd8xp
cnfSipOcsCallIntervalStatistics 24 prupgrade-sc-8695fcdc64-jd8xp PR_INGRESS_TG {
    intervalValid    true;
    time             1683091200;
    attemptedCalls   0;
    relayedCalls     0;
    establishedCalls 0;
    successfulCalls  0;
    failedCalls      0;
    pendingCalls     0;
    rejectedCalls    0;
}
cnfSipOcsCallIntervalStatistics 24 prupgrade-sc-8695fcdc64-jd8xp PR_INGRESS_TG1 {
    intervalValid    true;
    time             1683091200;
    attemptedCalls   0;
    relayedCalls     0;
    establishedCalls 0;
    successfulCalls  0;
    failedCalls      0;
    pendingCalls     0;
    rejectedCalls    0;
}
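When this pod-level output is captured from a CLI session, it can be turned into structured records for further processing. The following is a minimal sketch that assumes the output keeps the brace-delimited layout shown above, with one "name value;" pair per field. The function name parse_status_blocks and the choice to key records by pod name and trunk group are illustrative assumptions.

```python
import re

# A block is a header line followed by "{ field value; ... }".
BLOCK = re.compile(r"(?P<header>[^{};]+)\{(?P<body>[^{}]*)\}")
FIELD = re.compile(r"(\w+)\s+([^;]*);")


def parse_status_blocks(text):
    """Map (pod name, trunk group) to a dict of field values (kept as strings)."""
    records = {}
    for m in BLOCK.finditer(text):
        # The header may be preceded by the command line; keep only its last line.
        header = m.group("header").strip().splitlines()[-1].split()
        # Assumed header layout: statistic name, interval number, pod name, trunk group.
        _stat, _interval, pod, trunk_group = header
        fields = {key: value.strip() for key, value in FIELD.findall(m.group("body"))}
        records[(pod, trunk_group)] = fields
    return records


# Abridged copy of the pod-level output shown above.
SAMPLE = """\
admin@vsbc1> show status service SC podName ALL addressContext default zone PR_ZONE_INGRESS cnfSipOcsCallIntervalStatistics 24 prupgrade-sc-8695fcdc64-jd8xp
cnfSipOcsCallIntervalStatistics 24 prupgrade-sc-8695fcdc64-jd8xp PR_INGRESS_TG {
    intervalValid    true;
    time             1683091200;
    attemptedCalls   0;
}
cnfSipOcsCallIntervalStatistics 24 prupgrade-sc-8695fcdc64-jd8xp PR_INGRESS_TG1 {
    intervalValid    true;
    time             1683091200;
    attemptedCalls   0;
}
"""

for key, fields in parse_status_blocks(SAMPLE).items():
    print(key, fields)
```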
CLI example of callCountIntervalStatistics aggregated info:
admin@vsbc1> show table service SC podName ALL global callCountIntervalStatistics
                                                            ENHANCED  AMRNB  AMRWB  EVRC   NICE   MRF       SIP                      EV
               INTERVAL              CALL   ENCRYPT  SRTP   VIDEO     LEG    LEG    LEG    REC    SESSIONS  REC    TRANSCODE  PDCS   LE
NUMBER  NAME   VALID     TIME        COUNT  COUNT    COUNT  COUNT     COUNT  COUNT  COUNT  COUNT  COUNT     COUNT  COUNT      COUNT  CO
---------------------------------------------------------------------------------------------------------------------------------------
55      entry  true      1683100500  0      0        0      0         0      0      0      0      0         0      0          0      0
56      entry  true      1683100800  0      0        0      0         0      0      0      0      0         0      0          0      0
57      entry  true      1683101100  0      0        0      0         0      0      0      0      0         0      0          0      0
58      entry  true      1683101400  0      0        0      0         0      0      0      0      0         0      0          0      0
In a CNF environment, the service level statistics from all SC Pods are aggregated at the OAM and presented to the CLI.
A service-level, cluster-wide Current Statistics example is shown below:
admin@vsbc1> show table service SC podName ALL global callCountCurrentStatistics
                              ENHANCED  AMRNB  AMRWB  EVRC   NICE   MRF       SIP                      EVS    SILK            SLB
       CALL   ENCRYPT  SRTP   VIDEO     LEG    LEG    LEG    REC    SESSIONS  REC    TRANSCODE  PDCS   LEG    LEG    LICENSE  SESSIONS
NAME   COUNT  COUNT    COUNT  COUNT     COUNT  COUNT  COUNT  COUNT  COUNT     COUNT  COUNT      COUNT  COUNT  COUNT  MODE     COUNT
---------------------------------------------------------------------------------------------------------------------------------------
entry  36129  0        0      0         0      0      0      0      0         0      0          0      0      0      domain   36129
[ok][2023-05-03 16:35:07]
admin@vsbc1>
Pod-level view of current statistics:
admin@vsbc1> show table service SC podName npbasedtones-sc-86647c7d94-djg2s global callCountCurrentStatistics
                              ENHANCED  AMRNB  AMRWB  EVRC   NICE   MRF       SIP                      EVS    SILK            SLB
       CALL   ENCRYPT  SRTP   VIDEO     LEG    LEG    LEG    REC    SESSIONS  REC    TRANSCODE  PDCS   LEG    LEG    LICENSE  SESSIONS
NAME   COUNT  COUNT    COUNT  COUNT     COUNT  COUNT  COUNT  COUNT  COUNT     COUNT  COUNT      COUNT  COUNT  COUNT  MODE     COUNT
---------------------------------------------------------------------------------------------------------------------------------------
entry  12173  0        0      0         0      0      0      0      0         0      0          0      0      0      domain   12173
[ok][2023-05-03 16:36:18]
admin@vsbc1>
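The relationship between the cluster-wide view and the pod-level view shown above can be pictured as a per-counter sum across pods. The sketch below is a conceptual illustration only and does not describe how the OAM Pod is implemented. The counter name callCount is an assumed label for the CALL COUNT column, the value for pod djg2s is taken from the pod-level example, and the values for the other two pods are invented so that the total matches the cluster-wide example.

```python
from collections import Counter


def aggregate(per_pod):
    """Sum each counter across all pods, as a cluster-wide view would present it."""
    total = Counter()
    for counters in per_pod.values():
        total.update(counters)  # Counter.update adds values key by key
    return dict(total)


per_pod = {
    # From the pod-level example above.
    "npbasedtones-sc-86647c7d94-djg2s": {"callCount": 12173, "encryptCount": 0},
    # Hypothetical values, chosen only so the sum equals the cluster-wide example.
    "npbasedtones-sc-86647c7d94-plxp6": {"callCount": 11978, "encryptCount": 0},
    "npbasedtones-sc-86647c7d94-xf625": {"callCount": 11978, "encryptCount": 0},
}

print(aggregate(per_pod))
```

Non-numeric columns such as the license mode cannot be summed this way and would need separate handling.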
Under service level, a key is added to all status commands to prevent data aggregation since CNF status details are meant to reflect pod-specific data.
admin@vsbc1> show table service SC podName ALL addressContext default zone MR_ZONE_INGRESS trunkGroupQoeStatus
                                                                   INBOUND                                   OUTBOUND
                                                                   RFACTOR    INBOUND                        RFACTOR    OUTBOUND
                                                          INBOUND  NUM        RFACTOR              OUTBOUND  NUM        RFACTOR
                                                          RFACTOR  CRITICAL   NUM MAJOR            RFACTOR   CRITICAL   NUM MAJOR
                                                 INBOUND  FROM     THRESHOLD  THRESHOLD  OUTBOUND  FROM      THRESHOLD  THRESHOLD  CURR
POD NAME                          NAME           RFACTOR  SBXBOOT  BREACHED   BREACHED   RFACTOR   SBXBOOT   BREACHED   BREACHED   ASR
---------------------------------------------------------------------------------------------------------------------------------------
npbasedtones-sc-86647c7d94-djg2s  MR_TG_INGRESS  94       94       0          0          94        94        0          0          90
npbasedtones-sc-86647c7d94-plxp6  MR_TG_INGRESS  94       94       0          0          94        94        0          0          90
npbasedtones-sc-86647c7d94-xf625  MR_TG_INGRESS  94       94       0          0          94        94        0          0          90
[ok][2023-05-03 17:02:06]
admin@vsbc1>
Under service level, the podName object is added to the action command response to differentiate responses from multiple pods.
Action commands at the local level (for example, oam, system, addressContext) are unchanged.
admin@vsbc1> request service SC podName ALL oam eventLog typeAdmin debug rolloverLogNow
response {
    podname npbasedtones-sc-86647c7d94-plxp6
    result success
    reason
}
response {
    podname npbasedtones-sc-86647c7d94-xf625
    result success
    reason
}
response {
    podname npbasedtones-sc-86647c7d94-djg2s
    result success
    reason
}
[ok][2023-05-03 17:37:52]
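Because each pod returns its own response block, a script that issues an action command may want to confirm that every pod reported success. The following is a minimal sketch under the assumption that the response layout matches the example above (podname, result, reason, in that order). The function name failed_pods and the failure text in the second sample are hypothetical.

```python
import re

# One response block: podname, result, and an optional reason, in that order.
RESPONSE = re.compile(
    r"response\s*\{\s*podname\s+(?P<pod>\S+)\s+"
    r"result\s+(?P<result>\S+)\s+reason(?P<reason>[^}]*)\}"
)


def failed_pods(output):
    """Return {pod name: reason} for every response whose result is not 'success'."""
    return {
        m.group("pod"): m.group("reason").strip()
        for m in RESPONSE.finditer(output)
        if m.group("result") != "success"
    }


# Output from the example above.
OUTPUT = """\
response { podname npbasedtones-sc-86647c7d94-plxp6 result success reason }
response { podname npbasedtones-sc-86647c7d94-xf625 result success reason }
response { podname npbasedtones-sc-86647c7d94-djg2s result success reason }
"""

# Hypothetical failure, invented to exercise the non-success path.
HYPOTHETICAL = OUTPUT + "response { podname example-sc-0 result failure reason timed out }\n"

print(failed_pods(OUTPUT))
print(failed_pods(HYPOTHETICAL))
```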
All CLI commands for unsupported features are hidden in a CNF deployment.
CNF Traps/Alarms
The CNF traps from all pods are consolidated on the OAM Pod and streamed to RAMP's Observability backend. The traps include Kubernetes-specific details with the addition of the varbinds nodeName, podName, and containerName to identify the originator of the trap from the CLI/RAMP. Also, the keyword 'Cnf' is appended to the CNe trap names. A new set of MIBs is introduced for CNe traps.
Use the following CLI command to view the alarms in the OAM Pod. In this example, the pod name (lplcnf1344-sc-696bb7958f-gflgq) is used as the key, so only the alarms raised from this pod are shown.
admin@vsbc1> show status AlarmsCnf currentStatus lplcnf1344-sc-696bb7958f-gflgq
currentStatus lplcnf1344-sc-696bb7958f-gflgq 4379 {
clearType AUTOMATIC;
timestamp 2022-01-31T10:41:01-00:00;
initialTimestamp 2022-01-31T10:01:01-00:00;
localTimestamp 2022-01-31T05:41:01;
localInitialTimestamp 2022-01-31T05:01:01;
count 9;
desc "Debug Event Log filter level is set to INFO. Set to MAJOR if finished troubleshooting";
reporter EVLOG;
severity Major;
acknowledgeState unAcknowledge;
comment "";
}