How to fix "container runtime is down, PLEG is not healthy"

I have an AKS Kubernetes cluster with 2 nodes. Each node runs about 6-7 pods, with 2 containers per pod: one is my own Docker image and the other is the sidecar that Istio creates for its service mesh. After about 10 hours, the nodes become "Not Ready", and describing the node shows me 2 errors:

1. container runtime is down, PLEG is not healthy: pleg was last seen active 1h32m35.942907195s ago; threshold is 3m0s
2. rpc error: code = DeadlineExceeded desc = context deadline exceeded; Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

When I restart the node it works fine, but the node goes back to "Not Ready" after a while. I started running into this problem after adding Istio, but I could not find any documentation relating the two. My next step is to try upgrading Kubernetes.

Output of describe node:

Name:               aks-agentpool-22124581-0
Roles:              agent
Labels:             agentpool=agentpool
                    beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=Standard_B2s
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=eastus
                    failure-domain.beta.kubernetes.io/zone=1
                    kubernetes.azure.com/cluster=MC_XXXXXXXXX
                    kubernetes.io/hostname=aks-XXXXXXXXX
                    kubernetes.io/role=agent
                    node-role.kubernetes.io/agent=
                    storageprofile=managed
                    storagetier=Premium_LRS
Annotations:        aks.microsoft.com/remediated=3
                    node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
CreationTimestamp:  Thu, 25 Oct 2018 14:46:53 +0000
Taints:             <none>
Unschedulable:      false
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Thu, 25 Oct 2018 14:49:06 +0000   Thu, 25 Oct 2018 14:49:06 +0000   RouteCreated                 RouteController created a route
  OutOfDisk            False   Wed, 19 Dec 2018 19:28:55 +0000   Wed, 19 Dec 2018 19:27:24 +0000   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure       False   Wed, 19 Dec 2018 19:28:55 +0000   Wed, 19 Dec 2018 19:27:24 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Wed, 19 Dec 2018 19:28:55 +0000   Wed, 19 Dec 2018 19:27:24 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Wed, 19 Dec 2018 19:28:55 +0000   Thu, 25 Oct 2018 14:46:53 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                False   Wed, 19 Dec 2018 19:28:55 +0000   Wed, 19 Dec 2018 19:27:24 +0000   KubeletNotReady              container runtime is down,PLEG is not healthy: pleg was last seen active 1h32m35.942907195s ago; threshold is 3m0s
Addresses:
  Hostname:  aks-XXXXXXXXX
Capacity:
 cpu:                2
 ephemeral-storage:  30428648Ki
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             4040536Ki
 pods:               110
Allocatable:
 cpu:                1940m
 ephemeral-storage:  28043041951
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             3099480Ki
 pods:               110
System Info:
 Machine ID:                 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 System UUID:                XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 Boot ID:                    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 Kernel Version:             4.15.0-1035-azure
 OS Image:                   Ubuntu 16.04.5 LTS
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://Unknown
 Kubelet Version:            v1.11.3
 Kube-Proxy Version:         v1.11.3
PodCIDR:                     10.244.0.0/24
ProviderID:                  azure:///subscriptions/9XXXXXXXXXXX/resourceGroups/MC_XXXXXXXXXXXXXXXXXXXXXXXXXXXX/providers/Microsoft.Compute/virtualMachines/aks-XXXXXXXXXXXX
Non-terminated Pods:         (42 in total)
  Namespace                  Name                                                               CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------                  ----                                                               ------------  ----------  ---------------  -------------
  default                    emailgistics-graph-monitor-6477568564-q98p2                        10m (0%)      0 (0%)      0 (0%)           0 (0%)
  default                    emailgistics-message-handler-7df4566b6f-mh255                      10m (0%)      0 (0%)      0 (0%)           0 (0%)
  default                    emailgistics-reports-aggregator-5fd96b94cb-b5vbn                   10m (0%)      0 (0%)      0 (0%)           0 (0%)
  default                    emailgistics-rules-844b77f46-5lrkw                                 10m (0%)      0 (0%)      0 (0%)           0 (0%)
  default                    emailgistics-scheduler-754884b566-mwgvp                            10m (0%)      0 (0%)      0 (0%)           0 (0%)
  default                    emailgistics-subscription-token-manager-7974558985-f2t49           10m (0%)      0 (0%)      0 (0%)           0 (0%)
  default                    mollified-kiwi-cert-manager-665c5d9c8c-2ld59                       0 (0%)        0 (0%)      0 (0%)           0 (0%)
  istio-system               grafana-59b787b9b-dzdtc                                            10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-citadel-5d8956cc6-x55vk                                      10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-egressgateway-f48fc7fbb-szpwp                                10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-galley-6975b6bd45-g7lsc                                      10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-ingressgateway-c6c4bcdbf-bbgcw                               10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-pilot-d9b5b9b7c-ln75n                                        510m (26%)    0 (0%)      2Gi (67%)        0 (0%)
  istio-system               istio-policy-6b465cd4bf-92l57                                      20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-policy-6b465cd4bf-b2z85                                      20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-policy-6b465cd4bf-j59r4                                      20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-policy-6b465cd4bf-s9pdm                                      20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-sidecar-injector-575597f5cf-npkcz                            10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-telemetry-6944cd768-9794j                                    20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-telemetry-6944cd768-g7gh5                                    20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-telemetry-6944cd768-gd88n                                    20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-telemetry-6944cd768-px8qb                                    20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-telemetry-6944cd768-xzslh                                    20m (1%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               istio-tracing-7596597bd7-hjtq2                                     10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               prometheus-76db5fddd5-d6dxs                                        10m (0%)      0 (0%)      0 (0%)           0 (0%)
  istio-system               servicegraph-758f96bf5b-c9sqk                                      10m (0%)      0 (0%)      0 (0%)           0 (0%)
  kube-system                addon-http-application-routing-default-http-backend-5ccb95zgfm8    10m (0%)      10m (0%)    20Mi (0%)        20Mi (0%)
  kube-system                addon-http-application-routing-external-dns-59d8698886-h8xds       0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                addon-http-application-routing-nginx-ingress-controller-ff49qc7    0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                heapster-5d6f9b846c-m4kfp                                          130m (6%)     130m (6%)   230Mi (7%)       230Mi (7%)
  kube-system                kube-dns-v20-7c7d7d4c66-qqkfm                                      120m (6%)     0 (0%)      140Mi (4%)       220Mi (7%)
  kube-system                kube-dns-v20-7c7d7d4c66-wrxjm                                      120m (6%)     0 (0%)      140Mi (4%)       220Mi (7%)
  kube-system                kube-proxy-2tb68                                                   100m (5%)     0 (0%)      0 (0%)           0 (0%)
  kube-system                kube-svc-redirect-d6gqm                                            10m (0%)      0 (0%)      34Mi (1%)        0 (0%)
  kube-system                kubernetes-dashboard-68f468887f-l9x46                              100m (5%)     100m (5%)   50Mi (1%)        300Mi (9%)
  kube-system                metrics-server-5cbc77f79f-x55cs                                    0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                omsagent-mhrqm                                                     50m (2%)      150m (7%)   150Mi (4%)       300Mi (9%)
  kube-system                omsagent-rs-d688cdf68-pjpmj                                        50m (2%)      150m (7%)   100Mi (3%)       500Mi (16%)
  kube-system                tiller-deploy-7f4974b9c8-flkjm                                     0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                tunnelfront-7f766dd857-kgqps                                       10m (0%)      0 (0%)      64Mi (2%)        0 (0%)
  kube-systems-dev           nginx-ingress-dev-controller-7f78f6c8f9-csct4                      0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-systems-dev           nginx-ingress-dev-default-backend-95fbc75b7-lq9tw                  0 (0%)        0 (0%)      0 (0%)           0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests      Limits
  --------  --------      ------
  cpu       1540m (79%)   540m (27%)
  memory    2976Mi (98%)  1790Mi (59%)
Events:
  Type     Reason             Age                 From                               Message
  ----     ------             ----                ----                               -------
  Warning  ContainerGCFailed  48m (x43 over 19h)  kubelet, aks-agentpool-22124581-0  rpc error: code = DeadlineExceeded desc = context deadline exceeded
  Warning  ImageGCFailed      29m (x57 over 18h)  kubelet, aks-agentpool-22124581-0  failed to get image stats: rpc error: code = Unknown desc = Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
  Warning  ContainerGCFailed  2m (x237 over 18h)  kubelet, aks-agentpool-22124581-0  rpc error: code = Unknown desc = Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
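
To put the Ready condition message in numbers: the kubelet reports how long ago the PLEG (pod lifecycle event generator) was last active and the threshold it is compared against, both as Go-style duration strings. The Python sketch below is only an illustration of that comparison, not kubelet code; the helper names parse_go_duration and pleg_overdue are made up for this example, and the regular expression is an assumption based on the message shown in the output above.

```python
import re
from typing import Tuple

# Seconds per unit for the Go-style duration suffixes handled here.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}
# "ms" is listed before "m" so that "5ms" is not read as 5 minutes.
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|ms|h|m|s)")
# Assumed message shape, taken from the Ready condition in the describe output.
_PLEG = re.compile(r"last\s?seen active ([\d.a-z]+) ago; threshold is ([\d.a-z]+)")


def parse_go_duration(text: str) -> float:
    """Convert a duration string such as '1h32m35.9s' into seconds."""
    tokens = _TOKEN.findall(text)
    # Reject input that contains anything besides number+unit pairs.
    if not tokens or "".join(num + unit for num, unit in tokens) != text:
        raise ValueError(f"not a duration: {text!r}")
    return sum(float(num) * _UNIT_SECONDS[unit] for num, unit in tokens)


def pleg_overdue(message: str) -> Tuple[float, float]:
    """Return (seconds since PLEG was last active, threshold in seconds)."""
    match = _PLEG.search(message)
    if match is None:
        raise ValueError("message does not look like a PLEG health report")
    return parse_go_duration(match.group(1)), parse_go_duration(match.group(2))


message = (
    "container runtime is down,PLEG is not healthy: pleg was last seen active "
    "1h32m35.942907195s ago; threshold is 3m0s"
)
last_active, threshold = pleg_overdue(message)
print(f"PLEG idle for {last_active:.0f}s against a threshold of {threshold:.0f}s")
```

With the values from the output above, the PLEG has been inactive for roughly an hour and a half against a three-minute threshold, which fits the Docker daemon being unreachable for the whole period rather than merely slow.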

General deployment file:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  name: emailgistics-pod
spec:
  minReadySeconds: 10
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      annotations:
        sidecar.istio.io/status: '{"version":"ebf16d3ea0236e4b5cb4d3fc0f01da62e2e6265d005e58f8f6bd43a4fb672fdd","initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["istio-envoy","istio-certs"],"imagePullSecrets":null}'
      creationTimestamp: null
      labels:
        app: emailgistics-pod
    spec:
      containers:
      - image: xxxxxxxxxxxxxxxxxxxxx/emailgistics_pod:xxxxxx
        imagePullPolicy: Always
        name: emailgistics-pod
        ports:
        - containerPort: 80
        resources: {}
      - args:
        - proxy
        - sidecar
        - --configPath
        - /etc/istio/proxy
        - --binaryPath
        - /usr/local/bin/envoy
        - --serviceCluster
        - emailgistics-pod
        - --drainDuration
        - 45s
        - --parentShutdownDuration
        - 1m0s
        - --discoveryAddress
        - istio-pilot.istio-system:15005
        - --discoveryRefreshDelay
        - 1s
        - --zipkinAddress
        - zipkin.istio-system:9411
        - --connectTimeout
        - 10s
        - --proxyAdminPort
        - "15000"
        - --controlPlaneAuthPolicy
        - MUTUAL_TLS
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: INSTANCE_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: ISTIO_META_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: ISTIO_META_INTERCEPTION_MODE
          value: REDIRECT
        - name: ISTIO_METAJSON_LABELS
          value: |
            {"app":"emailgistics-pod"}
        image: docker.io/istio/proxyv2:1.0.4
        imagePullPolicy: IfNotPresent
        name: istio-proxy
        ports:
        - containerPort: 15090
          name: http-envoy-prom
          protocol: TCP
        resources:
          requests:
            cpu: 10m
        securityContext:
          readOnlyRootFilesystem: true
          runAsUser: 1337
        volumeMounts:
        - mountPath: /etc/istio/proxy
          name: istio-envoy
        - mountPath: /etc/certs/
          name: istio-certs
          readOnly: true
      imagePullSecrets:
      - name: ga.secretname
      initContainers:
      - args:
        - -p
        - "15001"
        - -u
        - "1337"
        - -m
        - REDIRECT
        - -i
        - '*'
        - -x
        - ""
        - -b
        - "80"
        - -d
        - ""
        image: docker.io/istio/proxy_init:1.0.4
        imagePullPolicy: IfNotPresent
        name: istio-init
        resources: {}
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
          privileged: true
      volumes:
      - emptyDir:
          medium: Memory
        name: istio-envoy
      - name: istio-certs
        secret:
          optional: true
          secretName: istio.default
status: {}
---
5
Ask

This is currently a known bug, and no real fix has been released yet to normalize the nodes' behavior. See the issues below:

https://github.com/kubernetes/kubernetes/issues/45419

https://github.com/kubernetes/kubernetes/issues/61117

https://github.com/Azure/AKS/issues/102

I hope we will have a solution soon.
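
Until a fix is available, it may at least help to detect the condition early. The Python sketch below is not a fix and is not taken from the linked issues; it only shows how the Ready condition could be read from the JSON that kubectl get nodes -o json returns. The function name not_ready_nodes and the second node name in the sample are hypothetical, and the sample data is a trimmed stand-in for real output.

```python
import json


def not_ready_nodes(node_list):
    """Return {node name: Ready-condition message} for nodes whose Ready status is not "True"."""
    flagged = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        conditions = node.get("status", {}).get("conditions", [])
        # Look up the condition of type "Ready" among the node's conditions.
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        if ready is None:
            flagged[name] = "no Ready condition reported"
        elif ready.get("status") != "True":
            # Status is "False" or "Unknown": keep the kubelet's explanation.
            flagged[name] = ready.get("message", "")
    return flagged


# Trimmed sample in the shape of a node list; the "-1" node is hypothetical.
SAMPLE = json.loads("""
{
  "items": [
    {"metadata": {"name": "aks-agentpool-22124581-0"},
     "status": {"conditions": [
       {"type": "Ready", "status": "False", "reason": "KubeletNotReady",
        "message": "container runtime is down,PLEG is not healthy"}]}},
    {"metadata": {"name": "aks-agentpool-22124581-1"},
     "status": {"conditions": [
       {"type": "Ready", "status": "True", "reason": "KubeletReady",
        "message": "kubelet is posting ready status"}]}}
  ]
}
""")

for name, message in not_ready_nodes(SAMPLE).items():
    print(f"{name}: {message}")
```

In practice the same function could be fed the parsed output of kubectl get nodes -o json on a schedule, so that a node stuck in this state is noticed before the workloads on it are affected.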

1
VKR