How to Implement and Test Graceful Shutdown: Node.js + Hono + Kubernetes

Published: 2026-03-13

When building an application on Kubernetes, implementing Graceful Shutdown is absolutely essential. In this article, I’ll walk you through simple Graceful Shutdown implementation examples using a Node.js application with Hono, deployed on a local Kubernetes cluster with Minikube.

Prerequisites

  • macOS with Colima running
  • Docker installed
  • kubectl installed
  • minikube installed
  • Node.js 24.14.0+

What Is Graceful Shutdown?

When Kubernetes needs to stop a pod (during a deployment, a scaling event, or node maintenance), it sends a SIGTERM signal to the process inside the container. What happens next depends entirely on your application.

Without Graceful Shutdown: The process exits immediately. Every in-flight request is dropped. Every open connection is reset. Clients see an error.

With Graceful Shutdown: The process catches SIGTERM, stops accepting new work, finishes what it is already doing, releases its resources, and exits cleanly. From a user’s perspective, nothing happened.

The difference is a few lines of code in your application, but the impact is significant in any environment where pods restart regularly.

Why It Matters: Two Types of Processes

Graceful Shutdown is often discussed in the context of HTTP servers, but the requirement applies equally to any long-running process on Kubernetes. The consequences, however, are different depending on the type of workload.

For a REST API, a dropped request means one user receives an error. It is visible, momentary, and often retryable.

For a batch or workflow process, the consequences are harder to detect and more damaging. A process killed mid-run may leave data in a partially written state: a billing job that charged some customers but not others, a data import that completed 8,000 of 10,000 records, a file transformation that produced corrupt output. These failures do not produce an immediate error spike. They surface later, in a data audit or a customer complaint, long after the pod restart that caused them.

Both workload types require Graceful Shutdown. The implementation details differ (an HTTP server needs to drain open connections, while a batch job needs to finish its current unit of work and checkpoint its progress), but the foundation is the same: handle SIGTERM, finish cleanly, then exit.
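For the batch case, a minimal sketch of "finish the current unit of work and checkpoint" might look like the following. The names processBatch, handleRecord and saveCheckpoint are hypothetical placeholders for your own job logic, and the checkpoint is kept in memory purely for illustration; a real job would persist it somewhere durable.

```javascript
// Hypothetical batch loop: a SIGTERM only sets a flag, and the loop checks
// that flag between records so no record is interrupted halfway through.
let stopRequested = false;
process.on('SIGTERM', () => {
  stopRequested = true;
});

// Processes records from `startIndex` onward. Returns the index of the next
// unprocessed record, which doubles as the checkpoint to resume from.
function processBatch(records, startIndex, handleRecord, saveCheckpoint, shouldStop = () => stopRequested) {
  let next = startIndex;
  while (next < records.length && !shouldStop()) {
    handleRecord(records[next]); // One complete unit of work
    next += 1;
    saveCheckpoint(next); // Record progress after every finished unit
  }
  return next;
}

// Example run: a stop is requested once three records have been handled.
const handled = [];
let checkpoint = 0;
const resumeAt = processBatch(
  ['a', 'b', 'c', 'd', 'e'],
  0,
  (record) => handled.push(record),
  (index) => { checkpoint = index; },
  () => handled.length >= 3, // Stand-in for the SIGTERM flag
);
console.log(handled, checkpoint, resumeAt); // [ 'a', 'b', 'c' ] 3 3
```

On the next run, the job would call processBatch again with the saved checkpoint as startIndex, so records 'd' and 'e' are processed exactly once.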

Why Kubernetes Makes This Non-Negotiable

In a traditional server environment, a process might run for weeks without restarting. In Kubernetes, pod restarts are routine. They happen on every deployment, config change, node scale-down, and rolling update. If you are running on a managed service like AWS EKS Auto Mode, pods are evicted even more frequently: Auto Mode continuously replaces and consolidates nodes for cost efficiency, and can run workloads on Spot Instances, which can be reclaimed with only a 2-minute warning.

Every one of those restarts is a potential source of dropped requests or corrupted state if Graceful Shutdown is not implemented.

The Kubernetes termination sequence is:

  1. The pod is marked Terminating and removed from Service endpoints
  2. At the same time, SIGTERM is sent to PID 1 in the container
  3. Kubernetes waits up to terminationGracePeriodSeconds (default: 30s)
  4. If the process has not exited, SIGKILL is sent, with no further negotiation

One important subtlety: the endpoint removal and SIGTERM happen at the same time, but iptables rules across cluster nodes update asynchronously. In practice, new requests can still be routed to a terminating pod for a few seconds after SIGTERM arrives. A correct Graceful Shutdown implementation must account for this gap, which we will cover in the Kubernetes configuration section.


Project Structure

graceful-shutdown-hono-kubernetes/
├── app/
│   ├── index.js
│   ├── package.json
│   └── Dockerfile
└── k8s/
    ├── deployment.yaml
    └── service.yaml

Part 1: Demonstrating the Problem

Before implementing a fix, we first make the problem visible. The application below has no signal handling: this is the “before” state.

The Hono App

// app/index.js
import { Hono } from 'hono';
import { serve } from '@hono/node-server';

const app = new Hono();

// Simulate slow work (e.g. a DB query or external API call)
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

app.get('/health', (c) => c.json({ status: 'ok' }));

app.get('/slow', async (c) => {
  console.log(`[${new Date().toISOString()}] /slow received - starting 5s work...`);

  await sleep(5000); // Simulate 5 seconds of work

  console.log(`[${new Date().toISOString()}] /slow completed`);
  return c.json({ message: 'Work done!' });
});

// ❌ No SIGTERM handler - the app does nothing to finish in-flight work before the pod is killed
// ❌ Any in-flight request to /slow can be dropped with no response
const server = serve({ fetch: app.fetch, port: 3000 }, () => {
  console.log(`Server started on port 3000`);
});

// app/package.json
{
  "name": "graceful-shutdown-hono-kubernetes",
  "version": "1.0.0",
  "license": "MIT",
  "type": "module",
  "main": "index.js",
  "scripts": {
    "start": "node index.js"
  },
  "dependencies": {
    "@hono/node-server": "^1.19.11",
    "hono": "^4.12.7"
  }
}

The Dockerfile

Use the exec form of CMD, ["node", "index.js"], not the shell form. This ensures Node.js runs as PID 1 and receives OS signals directly.

Shell form (CMD node index.js) launches Node.js via /bin/sh -c, making the shell PID 1. Signals are sent to the shell, not Node.js, and SIGTERM never reaches your application.
Exec form (CMD ["node", "index.js"]) makes Node.js PID 1. Signals arrive directly. Always use the exec form.
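A quick way to confirm which form is in effect is to log the process ID at startup. The sketch below is only a hint, not a guarantee: describePid is a hypothetical helper, and some shells replace themselves with the command they launch, in which case Node.js can still end up as PID 1 even with the shell form.

```javascript
// Hypothetical helper: reports whether this Node.js process is PID 1.
// Inside a container started with the exec form of CMD, PID 1 is expected.
function describePid(pid = process.pid) {
  return pid === 1
    ? 'Running as PID 1: signals from the container runtime arrive here directly'
    : `Running as PID ${pid}: another process may be PID 1 and receive the signals`;
}

// Log once at startup, e.g. next to the "Server started" message.
console.log(describePid());
```

Running this on your Mac prints the second message with whatever PID the OS assigned; the first message is what you would hope to see in the pod logs.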

# app/Dockerfile

# ---- Build Stage ----
FROM node:24.14.0-alpine AS builder
WORKDIR /app
COPY package*.json ./
# --omit=dev replaces the deprecated --production flag in current npm versions
RUN npm install --omit=dev

# ---- Runtime Stage ----
FROM node:24.14.0-alpine

RUN addgroup -S appgroup && adduser -S appuser -G appgroup

WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY . .

RUN chown -R appuser:appgroup /app
USER appuser

EXPOSE 3000

# ✅ Exec form - Node.js is PID 1, receives SIGTERM directly
CMD ["node", "index.js"]

Kubernetes Manifests

# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: graceful-app
spec:
  replicas: 1 # Single pod - ensures curl hits the terminating pod with no healthy fallback
  selector:
    matchLabels:
      app: graceful-app
  template:
    metadata:
      labels:
        app: graceful-app
    spec:
      containers:
        - name: graceful-app
          image: graceful-app:v1
          imagePullPolicy: Never # Use local Minikube image
          ports:
            - containerPort: 3000
          # ❌ No readinessProbe
          # ❌ No lifecycle preStop hook
          # ❌ No terminationGracePeriodSeconds

# k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: graceful-app-svc
spec:
  selector:
    app: graceful-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: NodePort

Environment Check

Before running any commands, confirm your local environment is ready.

Check Colima is running:

colima status
# INFO[0000] colima is running using QEMU
# INFO[0000] arch: aarch64
# INFO[0000] runtime: docker
# INFO[0000] mountType: sshfs
# INFO[0000] address:
# INFO[0000] docker context: colima

If Colima is not running, start it first:

colima start

Note: colima start --kubernetes installs k3s inside the Colima VM, which is a separate Kubernetes distribution from Minikube. In this tutorial we use Minikube as our Kubernetes environment, so you do not need the --kubernetes flag when starting Colima. Minikube manages its own cluster independently.

Check Minikube is running:

minikube status
# minikube
# type: Control Plane
# host: Running
# kubelet: Running
# apiserver: Running
# kubeconfig: Configured

If Minikube is not running, start it:

minikube start

Check kubectl is connected to Minikube:

kubectl config current-context
# minikube

If the context is not minikube, switch to it:

kubectl config use-context minikube

Once all three are confirmed, proceed with the build and deploy steps below.

Build & Deploy

Build the image inside Minikube’s Docker daemon so it is available locally without a registry. All commands below are run from the project root (graceful-shutdown-hono-kubernetes/).

cd graceful-shutdown-hono-kubernetes

# Start Minikube
minikube start

# Point your shell to Minikube's Docker daemon
eval $(minikube docker-env)

# Build the image
docker build -t graceful-app:v1 ./app

# Deploy
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml

# Confirm pods are running
kubectl get pods -w
# NAME                            READY   STATUS    RESTARTS   AGE
# graceful-app-5d4f9b7c6-abc12    1/1     Running   0          12s

Reproducing the Problem

Open three terminal windows.

Terminal 1 - get the service URL:

minikube service graceful-app-svc --url
# http://127.0.0.1:51234

This command creates a tunnel from your Mac to the Service inside the Minikube VM and prints the local URL. Keep this terminal running while you work in Terminals 2 and 3.

Terminal 2 - send a slow request:

curl http://127.0.0.1:51234/slow
# ... (hanging, waiting for a response) ...

Terminal 3 - while Terminal 2 is waiting, force-delete the pod:

kubectl get pods
# graceful-app-5d4f9b7c6-abc12    1/1     Running

kubectl delete pod graceful-app-5d4f9b7c6-abc12 --grace-period=0 --force
# pod "graceful-app-5d4f9b7c6-abc12" deleted

--grace-period=0 --force bypasses terminationGracePeriodSeconds and sends SIGKILL immediately, making the failure reliably reproducible. This is only used here to demonstrate the problem; in production, Kubernetes always goes through the normal termination sequence with SIGTERM first.

Terminal 2 immediately returns:

curl: (52) Empty reply from server
# or: curl: (56) Recv failure: Connection reset by peer

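The request was mid-flight when the pod was killed. The process was terminated without any chance to finish its work, and the client received no response. With no signal handler, the application also has no drain logic to fall back on in a normal, SIGTERM-first termination.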

What Happened: The Termination Sequence

Step | Event | Result
-----|-------|-------
1 | kubectl delete pod triggered | Pod marked Terminating
2 | Process signalled (SIGKILL with --force; SIGTERM first in a normal deletion) | No handler, so nothing drains before the process dies
3 | In-flight /slow requests abandoned | Client receives connection reset
4 | iptables rules update asynchronously | New requests may still reach the dying pod

Clean Up

kubectl delete -f k8s/

Part 2: Node.js Graceful Shutdown Implementation

Now we fix the application. The goal is to catch SIGTERM, stop accepting new connections, wait for in-flight requests to finish, and then exit.

// app/index.js
import { Hono } from 'hono';
import { serve } from '@hono/node-server';

const app = new Hono();

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

app.get('/health', (c) => c.json({ status: 'ok' }));

app.get('/slow', async (c) => {
  console.log(`[${new Date().toISOString()}] /slow received - starting 5s work...`);
  await sleep(5000);
  console.log(`[${new Date().toISOString()}] /slow completed`);
  return c.json({ message: 'Work done!' });
});

const server = serve({ fetch: app.fetch, port: 3000 }, () => {
  console.log(`[${new Date().toISOString()}] Server started on port 3000`);
});

// ✅ Graceful Shutdown handler
const shutdown = (signal) => {
  console.log(`[${new Date().toISOString()}] ${signal} received - starting graceful shutdown`);

  // Step 1: Stop accepting new connections
  // Existing in-flight requests are still handled
  // The callback is async so that the await in Step 2 is valid once you add it
  server.close(async () => {
    console.log(`[${new Date().toISOString()}] All in-flight requests finished - exiting`);

    // Step 2: Release other resources here if needed
    // e.g. await db.disconnect(), mqConsumer.stop()

    process.exit(0); // ✅ Clean exit
  });

  // Step 3: Safety net - force exit if drain takes too long
  // This should be shorter than terminationGracePeriodSeconds in k8s
  setTimeout(() => {
    console.error(`[${new Date().toISOString()}] Forced exit after timeout`);
    process.exit(1);
  }, 25000); // 25s - safely under the default 30s grace period
};

process.on('SIGTERM', () => shutdown('SIGTERM')); // Sent by Kubernetes
process.on('SIGINT', () => shutdown('SIGINT')); // Sent by Ctrl+C locally

Key points:

  • serve() from @hono/node-server returns a standard Node.js http.Server instance. Its .close() method behaves identically to Express: it stops accepting new connections and fires the callback once all in-flight requests finish.
  • The setTimeout safety net ensures the process does not hang indefinitely if a request is stuck. Set it to a value safely below terminationGracePeriodSeconds.
  • SIGINT is handled for local development convenience (Ctrl+C). In Kubernetes, SIGTERM is what matters.
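One refinement worth considering: a second signal (for example, pressing Ctrl+C twice) would call server.close() again on a server that is already closing. A minimal sketch of an idempotent variant follows. createShutdown is a hypothetical helper, not part of Hono or @hono/node-server, and the exit and log options exist only so the logic can be exercised without terminating the process.

```javascript
// Hypothetical helper: wraps any object with a Node-style close(callback)
// method (such as the server returned by serve()) in an idempotent handler.
function createShutdown(server, options = {}) {
  const {
    timeoutMs = 25000,
    exit = (code) => process.exit(code),
    log = console.log,
  } = options;
  let shuttingDown = false;

  return function shutdown(signal) {
    if (shuttingDown) {
      log(`${signal} received again - shutdown already in progress`);
      return false; // Ignore repeated signals instead of calling close() twice
    }
    shuttingDown = true;
    log(`${signal} received - starting graceful shutdown`);

    // Safety net: force exit if draining takes longer than timeoutMs.
    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref(); // The timer alone should not keep the process alive

    server.close((err) => {
      clearTimeout(timer);
      exit(err ? 1 : 0); // Non-zero exit code if close() reported an error
    });
    return true;
  };
}

// Example with a fake server that finishes draining immediately.
const exitCodes = [];
const fakeServer = { close: (done) => done() };
const shutdown = createShutdown(fakeServer, {
  exit: (code) => exitCodes.push(code),
  log: () => {},
});
const first = shutdown('SIGTERM');
const second = shutdown('SIGTERM');
console.log(first, second, exitCodes); // true false [ 0 ]
```

In the real application you would pass the server returned by serve() and register the result with process.on('SIGTERM', () => shutdown('SIGTERM')), leaving exit and log at their defaults.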

Rebuild and Redeploy

eval $(minikube docker-env)
docker build -t graceful-app:v2 ./app

Update deployment.yaml to use v2:

image: graceful-app:v2

Then apply the manifests:

# Recreates both the Deployment and Service deleted during Part 1 clean up
kubectl apply -f k8s/

Verify It Works

We need four terminal windows this time.

Terminal 1 - start the Minikube tunnel and keep it running:

minikube service graceful-app-svc --url
# http://127.0.0.1:51234

The Service was deleted in the Part 1 clean up and has just been recreated, so restart the tunnel here; the port in the printed URL may differ from the one you saw in Part 1. Keep this terminal running for the duration of the test.

Terminal 2 - stream the pod logs so we can observe the shutdown sequence in real time:

kubectl logs -f $(kubectl get pod -l app=graceful-app -o jsonpath='{.items[0].metadata.name}')

-f follows the log stream. kubectl get pod -l app=graceful-app selects the pod by label so you don’t need to copy the exact pod name.

Terminal 3 - send a slow request:

curl http://127.0.0.1:51234/slow

Terminal 4 - delete the pod while it is processing:

kubectl delete pod graceful-app-5d4f9b7c6-abc12

Terminal 3 - this time, it completes:

{"message":"Work done!"}

Terminal 2 - pod logs confirm the clean shutdown sequence:

[2024-01-15T10:00:00.000Z] Server started on port 3000
[2024-01-15T10:00:01.000Z] /slow received - starting 5s work...
[2024-01-15T10:00:01.500Z] SIGTERM received - starting graceful shutdown
[2024-01-15T10:00:06.000Z] /slow completed
[2024-01-15T10:00:06.001Z] All in-flight requests finished - exiting

The application caught SIGTERM mid-request, waited for /slow to finish, and exited cleanly.


Part 3: Kubernetes Configuration

The Node.js implementation alone is not sufficient. Kubernetes must also be configured to give the application enough time to drain, and to stop routing new traffic before shutdown begins.

The Full Deployment Configuration

# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: graceful-app
spec:
  replicas: 2 # Scale up to 2 - required for a meaningful rolling update test
  selector:
    matchLabels:
      app: graceful-app
  template:
    metadata:
      labels:
        app: graceful-app
    spec:
      terminationGracePeriodSeconds: 30 # ✅ Must be longer than your app's drain timeout
      containers:
        - name: graceful-app
          image: graceful-app:v2
          imagePullPolicy: Never
          ports:
            - containerPort: 3000

          # ✅ readinessProbe - controls whether this pod receives Service traffic
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 1

          # ✅ preStop hook - adds a delay before SIGTERM to handle async iptables propagation
          lifecycle:
            preStop:
              exec:
                command: ['sh', '-c', 'sleep 5']

Why Each Setting Matters

terminationGracePeriodSeconds: 30

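This is the total time Kubernetes will wait before force-killing the container with SIGKILL. The countdown starts when termination begins, so it includes the time spent in the preStop hook, not only the time after SIGTERM is sent. It must therefore be longer than the maximum time your application needs to finish draining plus the preStop delay. The relationship is: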

terminationGracePeriodSeconds > preStop sleep + app drain timeout

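In our case: 5s (preStop) + 25s (app timeout) = 30s, which equals the 30s grace period rather than staying below it, so the inequality is not strictly satisfied and there is no margin. To restore one, either raise terminationGracePeriodSeconds (for example to 35) or lower the force-exit timeout in the application (for example to 20s).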
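The relationship above is easy to check mechanically. The sketch below uses two hypothetical helper functions, gracePeriodMargin and fitsGracePeriod, and plugs in the values from this article alongside a variant with a 35s grace period (an illustrative number, not a recommendation).

```javascript
// Returns the spare seconds left before SIGKILL once the preStop delay and
// the application's drain timeout are accounted for. All values in seconds.
function gracePeriodMargin({ terminationGracePeriodSeconds, preStopSeconds, drainTimeoutSeconds }) {
  return terminationGracePeriodSeconds - (preStopSeconds + drainTimeoutSeconds);
}

// The budget only holds if there is strictly positive margin.
function fitsGracePeriod(config) {
  return gracePeriodMargin(config) > 0;
}

const articleValues = { terminationGracePeriodSeconds: 30, preStopSeconds: 5, drainTimeoutSeconds: 25 };
console.log(gracePeriodMargin(articleValues), fitsGracePeriod(articleValues)); // 0 false

const widerGrace = { ...articleValues, terminationGracePeriodSeconds: 35 };
console.log(gracePeriodMargin(widerGrace), fitsGracePeriod(widerGrace)); // 5 true
```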

readinessProbe

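The readiness probe tells Kubernetes whether a pod is ready to receive traffic; a pod that fails the check is taken out of the Service's endpoints. During termination, Kubernetes already removes the pod from the endpoints as soon as it is marked Terminating, independently of the probe. The probe matters most on the way up: during a rolling update, a new pod only starts receiving traffic once /health responds successfully, so requests are not sent to a replacement pod that is still starting.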
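Note that the /health route in this article always returns ok, even while the server is draining. If you want the probe to reflect shutdown state, one option is a small shared flag, sketched below. createReadiness is a hypothetical helper, and the 503 status with a "shutting down" body is an assumed convention rather than anything Kubernetes requires (any status outside 200-399 counts as a failed HTTP probe).

```javascript
// Hypothetical readiness state shared by the signal handler and /health.
function createReadiness() {
  let shuttingDown = false;
  return {
    // Call this at the top of the shutdown handler.
    markShuttingDown() {
      shuttingDown = true;
    },
    // Returns the HTTP status and JSON body the health route should send.
    check() {
      return shuttingDown
        ? { status: 503, body: { status: 'shutting down' } }
        : { status: 200, body: { status: 'ok' } };
    },
  };
}

const readiness = createReadiness();
const before = readiness.check();
readiness.markShuttingDown();
const after = readiness.check();
console.log(before.status, after.status); // 200 503

// Wiring into the Hono app from this article would look roughly like:
//   app.get('/health', (c) => {
//     const { status, body } = readiness.check();
//     return c.json(body, status);
//   });
```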

lifecycle.preStop: sleep 5

This is the key to handling the async iptables propagation gap mentioned in Part 1. The preStop hook runs before SIGTERM is sent. By sleeping for 5 seconds, we give the cluster time to propagate the endpoint removal to all nodes before our application starts shutting down. Without this sleep, the app may begin rejecting new connections while the cluster is still routing traffic to it.

The complete shutdown sequence with these settings looks like this:

Time | Event
-----|------
T+0s | Pod marked Terminating, removed from endpoints
T+0s | preStop hook starts (sleep 5)
T+0–5s | iptables rules propagate across cluster nodes
T+5s | preStop completes, SIGTERM sent to Node.js
T+5s | Node.js stops accepting new connections, drains in-flight requests
T+5–30s | In-flight requests complete
T+30s | SIGKILL sent if process has not exited (safety net)

Deploy

Apply the updated deployment.yaml. This applies the new Kubernetes configuration (terminationGracePeriodSeconds, readinessProbe, preStop) and scales from 1 to 2 replicas at the same time:

kubectl apply -f k8s/deployment.yaml

# Confirm two pods are running before proceeding
kubectl get pods -w
# NAME                            READY   STATUS    RESTARTS   AGE
# graceful-app-5d4f9b7c6-abc12    1/1     Running   0          30s
# graceful-app-5d4f9b7c6-def34    1/1     Running   0          30s

Final Verification: Rolling Update

The most realistic test is a rolling update, which is how deployments happen in practice. With replicas: 2, Kubernetes replaces pods one at a time; traffic continues flowing through the surviving pod while the other drains, which is exactly the behaviour we want to verify.

Open three terminal windows.

Terminal 1 - start the Minikube tunnel and keep it running:

minikube service graceful-app-svc --url
# http://127.0.0.1:51234

The tunnel must stay alive throughout the test. If you close this terminal, curl will return 000, not because of a pod issue, but because the tunnel itself is gone.

Terminal 2 - run continuous traffic:

while true; do
  curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:51234/slow
  sleep 0.5
done

You should see a steady stream of 200 responses:

200
200
200
...

Terminal 3 - trigger a rolling update:

kubectl rollout restart deployment/graceful-app

Keep watching Terminal 2 while the rolling update runs. With the full implementation in place (SIGTERM handler in Node.js, preStop sleep, and readinessProbe), every request should continue returning 200. No errors, no dropped connections, despite pods being replaced underneath the running traffic.
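If you would rather get a tally than watch the stream by eye, the same check can be scripted in Node.js. This is a rough sketch: probe and allOk are hypothetical helpers, the fetchImpl parameter exists only so the logic can be tried without a cluster, and it relies on the global fetch available in current Node.js versions.

```javascript
// Hypothetical load checker: sends `count` sequential requests and tallies
// the results by status code. Network failures are counted separately.
async function probe(url, count, fetchImpl = fetch) {
  const tally = {};
  for (let i = 0; i < count; i += 1) {
    let key;
    try {
      const res = await fetchImpl(url);
      key = String(res.status);
    } catch {
      key = 'network-error'; // Connection reset, refused, or tunnel down
    }
    tally[key] = (tally[key] ?? 0) + 1;
  }
  return tally;
}

// Returns true only if every recorded request came back with HTTP 200.
function allOk(tally) {
  const keys = Object.keys(tally);
  return keys.length === 1 && keys[0] === '200';
}

console.log(allOk({ 200: 40 }), allOk({ 200: 38, 'network-error': 2 })); // true false
```

To use it against the tunnel, start something like probe('http://127.0.0.1:51234/slow', 20).then((tally) => console.log(tally, allOk(tally))) in one terminal and trigger the rollout restart in another; substitute the URL printed by minikube service.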

Clean Up

kubectl delete -f k8s/
minikube stop

Wrap Up

In this article, we walked through a complete Graceful Shutdown implementation for a Node.js + Hono application running on Kubernetes: from reproducing the problem with a broken app, to fixing it at the application level, to completing the picture with the right Kubernetes configuration.

As we saw, a correct implementation requires changes on both sides. Neither alone is sufficient.

Layer | What to implement
------|------------------
Node.js + Hono | Handle SIGTERM, call server.close() on the @hono/node-server instance, add a force-exit timeout
Dockerfile | Use exec form CMD ["node", "index.js"] so Node.js receives signals
Kubernetes | terminationGracePeriodSeconds, readinessProbe, preStop sleep

The pattern shown here applies to any Node.js service running in Kubernetes. Feel free to adjust the timeout values to match your application’s actual drain time, especially for batch or long-running processes, where terminationGracePeriodSeconds may need to be significantly longer than the default 30 seconds.

I hope this article is helpful for you. Keep on building.

