The SBC CNe containers currently leverage executive probes, which involve running a specific command inside the container to assess a container's health. However, this method is unsuitable for CNFs that employ CPU pinning and run DPDK processes. In such scenarios, the accumulation of executive probes may overload the node and lead to performance bottlenecks. Hence, the solution is transitioning from executive probes to TCP probes for health checks.

TCP probes function by establishing TCP connections to the container on a designated port. A successful connection signifies a healthy container, while a failure indicates potential issues. If a predetermined number of consecutive failures occur, the Kubernetes kubelet restarts the pod to restore functionality.

Configure the SBC container TCP probe by providing the required values in the manifest (deployment.yaml). The Kubelet will then attempt to establish a connection to the provided port.

The core configuration elements typically include:

  • livenessProbe, readinessProbe, and startupProbe: These sections define the health checks for liveness, readiness, and startup, respectively.
  • tcpSocket: This element specifies the network details for the TCP probe, including the port (the container port to connect to).
  • initialDelaySeconds: Defines the initial wait time before initiating the first health check after container startup.
  • periodSeconds: Sets the interval between subsequent health checks.
  • timeoutSeconds: Determines the maximum allowable time for a successful connection establishment.
  • failureThreshold: Specifies the number of consecutive failed health checks before marking the container unhealthy.

A probe listener script employed within the SBC containers during the startup establishes three listening sockets, one for each health probe type: startup, liveness, and readiness. These sockets remain active and prepared to accept connections as long as the parent listener script runs in the background.

In the event of application failure or termination, the probe listener processes are terminated during the cleanup phase, consequently closing the corresponding sockets. In such scenarios, the Kubernetes kubelet attempts to establish connections to these sockets for health checks. Connection failures exceeding a predefined retry limit activate a pod restart, aiming to restore functionality.