The SBC CNe containers currently leverage probes, which involve running a specific command inside the container to assess the container's health.

Introduction to CNF Probes

Probes in Cloud-Native Functions (CNFs) are the mechanisms used to check the status of containers. They are essential for monitoring and managing container health and performance, and they help ensure that applications run smoothly and can handle traffic effectively.

The exec-based method, in which a command runs inside the container, is unsuitable for CNFs that employ CPU pinning and run DPDK processes. In such scenarios, the accumulation of exec probe processes may overload the node and lead to performance bottlenecks.

The probes function by establishing connections to the container in a specified manner. A successful connection signifies a healthy container, while a failure indicates potential issues. If a predetermined number of consecutive failures occur, the Kubernetes kubelet restarts the container to restore functionality. Probes verify container status in the following ways:

  • Liveness Probes: These check if the application inside the container is still running. If a liveness probe fails, Kubernetes will restart the container to try to fix the issue.
  • Readiness Probes: These determine if the application is ready to handle requests. If a readiness probe fails, the pod is temporarily removed from the load balancer (the Service's endpoints) until it is ready again.
  • Startup Probes: These are used to check if the application has successfully started. This is particularly useful for applications that take a long time to initialize.

During startup, a probe listener script within the SBC containers establishes three listening sockets, one for each health probe type: startup, liveness, and readiness. These sockets remain active and ready to accept connections as long as the parent listener script runs in the background.

When configuring probes in Kubernetes, several common parameters are used to define their behavior:

  • initialDelaySeconds: The number of seconds after the container has started before the probe is initiated. This allows the application some time to start up before health checks begin.
  • periodSeconds: How often (in seconds) to perform the probe. This defines the frequency of the health checks.
  • timeoutSeconds: The number of seconds after which the probe times out. If the probe does not complete within this time, it is considered a failure.
  • successThreshold: The number of consecutive successes required for the probe to be considered successful after it has failed. This is useful for ensuring transient issues do not cause the container to be marked as unhealthy.
  • failureThreshold: The number of consecutive failures required for the probe to be considered failed after it has succeeded. This helps avoid false positives due to temporary issues.

In the event of application failure or termination, the probe listener processes are terminated during the cleanup phase, closing the corresponding sockets. The Kubernetes kubelet then fails to establish connections to these sockets during health checks; once the connection failures exceed the configured failure threshold, the kubelet restarts the container to restore functionality.

Configure the SBC container probes by providing the required values in the manifest (deployment.yaml). The kubelet then attempts to establish a connection to the configured port.
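
For illustration, the following is a minimal sketch of the pod-spec fragment in a deployment.yaml where these parameters appear; the container name, image, port, and timing values are placeholders rather than actual SBC settings:

containers:
  - name: example-container            # placeholder name, not an actual SBC container
    image: example-image:1.0           # placeholder image
    livenessProbe:
      tcpSocket:
        port: 8080                     # placeholder container port to connect to
      initialDelaySeconds: 60          # wait 60s after container start before the first probe
      periodSeconds: 5                 # probe every 5 seconds
      timeoutSeconds: 2                # fail any attempt that takes longer than 2 seconds
      successThreshold: 1              # one success marks the probe as passing
      failureThreshold: 3              # three consecutive failures restart the container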

Probe Types in SBC CNe

In Kubernetes, several types of probe mechanisms check the health and status of containers. Each mechanism serves a different purpose and can be used for liveness, readiness, and startup probes, depending on your application's specific requirements. There are currently three main probe types: HTTP GET Probe, TCP Socket Probe, and Exec Probe.

HTTP GET Probe

This probe sends an HTTP GET request to a specified endpoint on the container. If the response code is between 200 and 399, then the container is considered healthy.
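
The SBC containers shown later in this section use TCP socket and exec probes; for comparison, a hypothetical HTTP GET probe stanza (the path and port below are illustrative, not actual SBC values) would look like this in the manifest:

livenessProbe:
  httpGet:
    path: /healthz                     # illustrative health endpoint, not an actual SBC path
    port: 8080                         # illustrative container port
  initialDelaySeconds: 10
  periodSeconds: 5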

TCP Socket Probe

TCP probes function by establishing TCP connections to the container on a designated port. A successful connection signifies a healthy container, while a connection failure indicates potential issues. If a predetermined number of consecutive failures occur, the Kubernetes kubelet restarts the container to restore functionality. All three probe sockets are recreated when the container restarts and the listener script runs again. The TCP Socket Probe takes the following additional parameter:

  • tcpSocket: This element specifies the network details for the TCP probe, including the port (the container port to connect to).

For example, the following output shows the sockets opened for TCP probe communication between the container and the kubelet:

Liveness:               tcp-socket :45456 delay=200s timeout=2s period=5s #success=1 #failure=15
Readiness:              tcp-socket :45457 delay=240s timeout=2s period=5s #success=1 #failure=3
Startup:                tcp-socket :45455 delay=240s timeout=2s period=10s #success=1 #failure=30
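
A deployment.yaml stanza that would produce the settings shown above might look like the following sketch; only the probe fields are taken from the output above, and the surrounding container definition is omitted:

startupProbe:
  tcpSocket:
    port: 45455                  # startup listener socket
  initialDelaySeconds: 240       # delay=240s
  periodSeconds: 10              # period=10s
  timeoutSeconds: 2              # timeout=2s
  successThreshold: 1            # #success=1
  failureThreshold: 30           # #failure=30
livenessProbe:
  tcpSocket:
    port: 45456                  # liveness listener socket
  initialDelaySeconds: 200
  periodSeconds: 5
  timeoutSeconds: 2
  successThreshold: 1
  failureThreshold: 15
readinessProbe:
  tcpSocket:
    port: 45457                  # readiness listener socket
  initialDelaySeconds: 240
  periodSeconds: 5
  timeoutSeconds: 2
  successThreshold: 1
  failureThreshold: 3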

Exec Probe

Exec probes run a specified command inside the container. The command is executed directly without a shell, and its exit status determines the health of the container. If the command returns an exit status of 0, then the container is considered healthy. Any non-zero exit status indicates an unhealthy state, prompting Kubernetes to take corrective actions, such as restarting the container.

The Exec Probe takes the following additional parameter:

  • exec: This element specifies the command to execute inside the container.

Some examples of exec probes include:

  • pvclogger-container: The readiness and liveness of the container are monitored by executing a cat operation on a marker file. These marker files are generated within the container application. If the application fails to start, the marker files won't be created, leading to probe failures and ultimately causing the container to restart.
  • rbbn-telemetry-agent: The telemetry agent also uses the exit code of a cat command on a marker file for its exec probe health checks.
  • rbbn-cache: Cache containers use a bash script, stored in a cache config map, for liveness and readiness probe checks. This script includes liveness checks for the redis-cli component, continuously sending pings and checking memory database loading.

For example, the following output shows the exec probes configured for the oamproxy-container:

Readiness:            exec [cat /opt/ribbon/probe/readiness_probe] delay=10s timeout=1s period=5s #success=1 #failure=3
Startup:              exec [cat /opt/ribbon/probe/start_probe] delay=30s timeout=1s period=10s #success=1 #failure=30
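
A manifest sketch matching the oamproxy-container output above might look like this; the probe values are taken from the output, and the command's array form reflects that exec probes run without a shell:

readinessProbe:
  exec:
    command: ["cat", "/opt/ribbon/probe/readiness_probe"]   # exit code 0 only if the marker file exists
  initialDelaySeconds: 10      # delay=10s
  periodSeconds: 5             # period=5s
  timeoutSeconds: 1            # timeout=1s
  successThreshold: 1
  failureThreshold: 3
startupProbe:
  exec:
    command: ["cat", "/opt/ribbon/probe/start_probe"]       # marker file created when the application starts
  initialDelaySeconds: 30      # delay=30s
  periodSeconds: 10            # period=10s
  timeoutSeconds: 1
  successThreshold: 1
  failureThreshold: 30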