Overview

To identify container restarts in a pod, run "oc get pods" and check the RESTARTS column. If a pod shows a non-zero restart count (for example, aijaincnf-cs-574f555cc-f7f29), run "oc describe pod <pod_name>" to see which container restarted. The RESTARTS value is the total across all containers in the pod, so the Restart Count of a single container in the describe output can be lower.

Example:
[jdoe@cli-server ~]$ oc get pods

NAME                                  READY   STATUS             RESTARTS        AGE
aijaincnf-cac-7588d5dc9c-9d4bk        3/3     Running            0               45h
aijaincnf-cac-7588d5dc9c-qpt5w        3/3     Running            0               45h
aijaincnf-cache-0                     2/2     Running            0               45h
aijaincnf-cache-1                     2/2     Running            0               45h
aijaincnf-cache-2                     2/2     Running            0               45h
aijaincnf-cache-3                     2/2     Running            0               45h
aijaincnf-cache-4                     2/2     Running            0               45h
aijaincnf-cache-5                     2/2     Running            0               45h
aijaincnf-cs-574f555cc-f7f29          5/5     Running            3 (24m ago)     45h
aijaincnf-cs-574f555cc-qq74v          5/5     Running            0               45h
aijaincnf-epu-7cb4fc5576-nvfk8        3/3     Running            0               45h
aijaincnf-epu-7cb4fc5576-rz8w5        3/3     Running            1 (84m ago)     45h
aijaincnf-hpa-8675fd8fc5-7g582        3/3     Running            0               45h
aijaincnf-hpa-8675fd8fc5-dsrtk        3/3     Running            0               45h
aijaincnf-ns-59b4ddc6fc-k2rqp         4/4     Running            0               45h
aijaincnf-ns-59b4ddc6fc-zxbl6         4/4     Running            2 (84m ago)     45h
aijaincnf-oam-64cc99dd54-4nzmd        2/2     Running            0               45h
aijaincnf-oam-64cc99dd54-ghtq2        2/2     Running            2 (24m ago)     45h
aijaincnf-rac-6dffbf46dd-2q6vv        4/4     Running            1 (89m ago)     45h
aijaincnf-rac-6dffbf46dd-5qwvc        4/4     Running            2 (84m ago)     45h
aijaincnf-sc-54f9cf779b-9xqs6         5/5     Running            0               152m
aijaincnf-sc-54f9cf779b-rvw57         5/5     Running            0               172m
aijaincnf-slb-8cff5ddcd-psjbr         4/4     Running            0               45h
aijaincnf-slb-8cff5ddcd-tvw2h         4/4     Running            2 (24m ago)     45h


[jdoe@cli-server ~]$ oc describe pod aijaincnf-cs-574f555cc-f7f29
Name:         aijaincnf-cs-574f555cc-f7f29
Namespace:    sbc-svt
Priority:     0
Node:         worker-7.rco-ocp1.lab.rbbn.com/172.29.208.32
Start Time:   Fri, 07 Jul 2023 07:21:05 -0400
:
:
Containers:
  cs-container:
    Container ID:   cri-o://23629a37945e44156c95aa435557695b0571c4541ba1978a47ac1b6c20ed6c32
    Image:          artifactory-tx.rbbn.com/sbx-docker-prod-plano/sbx/sbxmain/production/isbcslb:12.1.0-14664
    Image ID:       artifactory-tx.rbbn.com/sbx-docker-prod-plano/sbx/sbxmain/production/isbcslb@sha256:4a261b1c2ab273075a5be1114977d157854be47cc760794c512c54ba957ff1a4
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Sun, 09 Jul 2023 04:16:35 -0400
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Sun, 09 Jul 2023 03:18:42 -0400
      Finished:     Sun, 09 Jul 2023 04:16:27 -0400
    Ready:          True
    Restart Count:  2
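
The two checks above can be scripted. The following is a minimal sketch, not part of the product: the function names restarted_pods and container_restarts are hypothetical helpers, and both assume the default column layout of "oc get pods" and the default indentation of "oc describe pod" shown in the example. They read the command output from standard input.

```shell
# Hypothetical helper: print the name and restart count of every pod whose
# RESTARTS column (4th field of the default "oc get pods" layout) is non-zero.
# The header and blank lines evaluate to 0 and are skipped.
restarted_pods() {
  awk '$4 + 0 > 0 { print $1, $4 }'
}

# Hypothetical helper: print "<container> <restart count>" for each entry in
# the "Containers:" section of "oc describe pod" output.
container_restarts() {
  awk '
    /^Containers:/                    { in_containers = 1; next }
    /^[^ ]/                           { in_containers = 0 }   # any other top-level key ends the section
    in_containers && /^  [^ ].*:$/    { name = $1; sub(/:$/, "", name) }
    in_containers && /Restart Count:/ { print name, $3 }
  '
}

# Usage (requires a logged-in oc session):
#   oc get pods | restarted_pods
#   oc describe pod <pod_name> | container_restarts
```

For the example output above, the first helper would list aijaincnf-cs-574f555cc-f7f29 with a count of 3, and the second would report cs-container with a count of 2 for the portion of the describe output shown.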

Restarting SC/CS/SLB Containers

If the isbc-container, slb-container, or cs-container restarts in an SC, SLB, or CS pod, respectively, during a Helm installation, a possible reason for the restart is:

"The DPDK probe fails when the PCI device (VF, Virtual Function) is still owned by another pod, and this results in SWe_NP exiting. This can happen when the VF given to the pod is still associated with another pod that is in the Terminating state."
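
As a first check for this condition, you can look for pods that are still in the Terminating state. The sketch below uses a hypothetical helper, terminating_pods, and assumes the default "oc get pods" column layout, where STATUS is the third field. It only lists candidate pods; it does not confirm which pod owns the VF.

```shell
# Hypothetical helper: print the names of pods whose STATUS column
# (3rd field of the default "oc get pods" layout) is "Terminating".
terminating_pods() {
  awk '$3 == "Terminating" { print $1 }'
}

# Usage (requires a logged-in oc session):
#   oc get pods | terminating_pods
```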