
Willian Antunes

Zero Dropped Connections during Ingress Pod Updates with AWS Load Balancer Controller

4 minute read

kubernetes, ingress, aws

Warning: This is a note, so don't expect much 😅!

Current environment:

I have been experiencing dropped connections while updating Ingress Controller pods. Many people have reported the very same problem in issue 2366. So, to understand what was happening without impacting existing services, I created a fresh Ingress Controller using the following recipe:

# helm install internal-nlb ingress-nginx/ingress-nginx --namespace production -f ./internal-nlb-values.yaml --dry-run
# helm uninstall internal-nlb --namespace production
# helm upgrade internal-nlb ingress-nginx/ingress-nginx --namespace production -f ./internal-nlb-values.yaml
# helm get values internal-nlb --namespace production
# https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.5/guide/service/annotations/
# https://github.com/kubernetes/ingress-nginx/tree/main/charts/ingress-nginx
controller:
  admissionWebhooks:
    enabled: false
  config:
    http-snippet: |
      # Rule to redirect HTTP to HTTPS using custom port 2443 (proxy_protocol). Know more at:
      # https://github.com/kubernetes/ingress-nginx/issues/5051
      # https://github.com/kubernetes/ingress-nginx/issues/9776
      server {
        listen 2443 proxy_protocol;
        return 308 https://$host$request_uri;
      }
    use-proxy-protocol: "true"
  containerPort:
    http: 80
    https: 443
    redirect: 2443
  electionID: internal-nlb
  ingressClass: internal-nlb
  ingressClassByName: true
  ingressClassResource:
    controllerValue: k8s.io/internal-nlb
    default: false
    enabled: true
    name: internal-nlb
  resources:
    requests:
      cpu: 15m
      memory: 128Mi
    limits:
      cpu: 100m
      memory: 172Mi
  service:
    # https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.2/guide/service/annotations/
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: Environment=Production,Product=Cross
      service.beta.kubernetes.io/aws-load-balancer-attributes: load_balancing.cross_zone.enabled=true
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
      service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: '*'
      service.beta.kubernetes.io/aws-load-balancer-scheme: internal
      service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:us-east-1:YOUR_ACCOUNT:certificate/YOUR_CERT_ID
      service.beta.kubernetes.io/aws-load-balancer-ssl-ports: https
      service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: preserve_client_ip.enabled=true,proxy_protocol_v2.enabled=true
      service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
      service.beta.kubernetes.io/aws-load-balancer-type: external
    targetPorts:
      http: 2443
      https: 80
  watchIngressWithoutClass: false
  replicaCount: 1

Given that the file is named internal-nlb-values.yaml, I issued the following command:

helm install internal-nlb ingress-nginx/ingress-nginx --namespace production -f ./internal-nlb-values.yaml
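
Once the release is up, the AWS Load Balancer Controller provisions the NLB and exposes its DNS name in the controller Service status. A quick way to grab it (the Service name below follows the chart's naming convention for a release called internal-nlb; adjust it if yours differs):

kubectl get service internal-nlb-ingress-nginx-controller --namespace production -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'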

Then I created a sample application (Ingress, Service, and Deployment) that uses the ingress class above:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/force-ssl-redirect: "false"
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
  name: sample-api-ingress
  namespace: production
spec:
  rules:
    - host: test-sample-api.willianantunes.com
      http:
        paths:
          - backend:
              service:
                name: sample-api-service
                port:
                  name: http-web-svc
            path: /
            pathType: Prefix
  ingressClassName: internal-nlb

---

apiVersion: v1
kind: Service
metadata:
  namespace: production
  name: sample-api-service
spec:
  selector:
    app: sample-api
    tier: web
  type: NodePort
  ports:
  - name: http-web-svc
    protocol: TCP
    port: 8080
    targetPort: web-server

---

# After executing `kubectl proxy` you can issue:
# http://localhost:8001/api/v1/namespaces/production/services/sample-api-service:8080/proxy/health-check

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-api-deployment
  namespace: production
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-api
      tier: web
  template:
    metadata:
      labels:
        app: sample-api
        tier: web
    spec:
      containers:
        - name: sample-api-container
          image: willianantunes/runner-said-no-one-ever
          ports:
            - name: web-server
              containerPort: 8080
          env:
            - name: PUMA_BIND_ADDRESS
              value: "0.0.0.0"
            - name: PUMA_BIND_PORT
              value: "8080"
            - name: RACK_ENV
              value: production
            - name: APP_ENV
              value: production
            - name: PUMA_MIN_THREADS
              value: "4"
            - name: PUMA_MAX_THREADS
              value: "20"
            - name: PUMA_NUMBER_OF_WORKERS
              value: "1"
            - name: PUMA_PERSISTENT_TIMEOUT
              value: "20"
            - name: PUMA_FIRST_DATA_TIMEOUT
              value: "30"
            - name: PROJECT_LOG_LEVEL
              value: "DEBUG"
            - name: RACK_IP_ADDRESS_HEADER
              value: "REMOTE_ADDR"

After applying it with kubectl apply -f sample-api-manifests.yaml, I was able to call the service by running the following:

curl --insecure -H "Host: test-sample-api.willianantunes.com" https://k8s-producti-internal-nlb-id.elb.us-east-1.amazonaws.com/health-check

My environment was ready for a load test that would call the sample API indefinitely. For that, I used JMeter. While it was calling the API and asserting the results, I went through the following scenarios, applying each change with a helm upgrade (sketched after the list):

  • Change the requests for CPU.
  • Decrease the number of replicas to 1.
  • Change the limits for memory.
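
Each scenario boils down to a helm upgrade with a single overridden value, which forces the controller pod to be replaced. A minimal sketch for the first scenario (the CPU value itself is arbitrary):

helm upgrade internal-nlb ingress-nginx/ingress-nginx --namespace production -f ./internal-nlb-values.yaml --set controller.resources.requests.cpu=20m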

Each time the pod had to be replaced, the error count would increase:

[Image: a terminal showing the pods next to JMeter running the load test; JMeter reports 64392 requests with 0.07% errors.]

The NLB target group would also register new targets and drain the old ones, even though the old pods had already been terminated, which explains the errors: while a target is draining, the NLB may still send traffic to it. To avoid these errors, there is a workaround that combines the container preStop hook with the target group deregistration delay. Their values depend on your context, but let's say the following:

controller:
  # ...
  # ...
  service:
    annotations:
      # ...
      service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: preserve_client_ip.enabled=true,proxy_protocol_v2.enabled=true,deregistration_delay.timeout_seconds=300
  lifecycle:
    preStop:
      exec:
        command: [ "sleep", "420" ]

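While repeating the scenarios, you can also watch the targets go through the draining state on the AWS side. A sketch assuming the AWS CLI is configured and you have the target group ARN at hand:

aws elbv2 describe-target-health --target-group-arn arn:aws:elasticloadbalancing:us-east-1:YOUR_ACCOUNT:targetgroup/YOUR_TARGET_GROUP/YOUR_ID
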
Then I started a new test plan in JMeter and ran the same scenarios as above, this time without downtime:

[Image: a terminal showing the pods next to JMeter running the load test; JMeter reports 111692 requests with 0.00% errors.]

I hope this may help you. See you 😄!


Have you found any mistakes 👀? Feel free to submit a PR editing this blog entry 😄.