How to Implement and Test Graceful Shutdown: Node.js + Hono + Kubernetes
Published: 2026-03-13
When building an application on Kubernetes, implementing Graceful Shutdown is absolutely essential. In this article, I'll walk you through simple Graceful Shutdown implementation examples using a Node.js application with Hono, deployed on a local Kubernetes cluster with Minikube.
Prerequisites
- macOS with Colima running
- Docker installed
- kubectl installed
- minikube installed
- Node.js 24.14.0+
What Is Graceful Shutdown?
When Kubernetes needs to stop a pod (during a deployment, a scaling event, or node maintenance), it sends a SIGTERM signal to the process inside the container. What happens next depends entirely on your application.
Without Graceful Shutdown: The process exits immediately. Every in-flight request is dropped. Every open connection is reset. Clients see an error.
With Graceful Shutdown: The process catches SIGTERM, stops accepting new work, finishes what it is already doing, releases its resources, and exits cleanly. From a user's perspective, nothing happened.
The difference is a few lines of code in your application, but the impact is significant in any environment where pods restart regularly.
Why It Matters: Two Types of Processes
Graceful Shutdown is often discussed in the context of HTTP servers, but the requirement applies equally to any long-running process on Kubernetes. The consequences, however, are different depending on the type of workload.
For a REST API, a dropped request means one user receives an error. It is visible, momentary, and often retryable.
For a batch or workflow process, the consequences are harder to detect and more damaging. A process killed mid-run may leave data in a partially written state: a billing job that charged some customers but not others, a data import that completed 8,000 of 10,000 records, a file transformation that produced corrupt output. These failures do not produce an immediate error spike. They surface later, in a data audit or a customer complaint, long after the pod restart that caused them.
Both workload types require Graceful Shutdown. The implementation details differ (an HTTP server needs to drain open connections, while a batch job needs to finish its current unit of work and checkpoint its progress), but the foundation is the same: handle SIGTERM, finish cleanly, then exit.
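For the batch case, "finish the current unit of work and checkpoint" can be sketched as a loop that checks a shutdown flag between items. The names runBatch, processItem, and saveCheckpoint are hypothetical and do not come from any library; treat this as a minimal illustration under that assumption, not a complete job runner.

```javascript
// Hypothetical batch runner: stops at the next safe boundary after SIGTERM.
let shutdownRequested = false;
process.on('SIGTERM', () => {
  shutdownRequested = true; // only set a flag; the loop decides when to stop
});

// processItem and saveCheckpoint are placeholders supplied by the caller.
async function runBatch(
  items,
  processItem,
  saveCheckpoint,
  { startIndex = 0, shouldStop = () => shutdownRequested } = {}
) {
  for (let i = startIndex; i < items.length; i++) {
    if (shouldStop()) {
      // Stop between items, never in the middle of one
      return { completed: false, nextIndex: i };
    }
    await processItem(items[i]); // finish the current unit of work
    await saveCheckpoint(i + 1); // record progress so a restart can resume
  }
  return { completed: true, nextIndex: items.length };
}
```

A restarted pod could then call runBatch again with startIndex set to the last saved checkpoint, so that no item is skipped or processed twice.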
Why Kubernetes Makes This Non-Negotiable
In a traditional server environment, a process might run for weeks without restarting. In Kubernetes, pod restarts are routine. They happen on every deployment, config change, node scale-down, and rolling update. If you are running on a managed service like AWS EKS Auto Mode, pods are evicted even more frequently: Auto Mode continuously replaces and consolidates nodes for cost efficiency, and uses Spot Instances that can be reclaimed with only a 2-minute warning.
Every one of those restarts is a potential source of dropped requests or corrupted state if Graceful Shutdown is not implemented.
The Kubernetes termination sequence is:
- The pod is marked Terminating and removed from Service endpoints
- SIGTERM is sent to PID 1 in the container (at the same time as the previous step)
- Kubernetes waits up to terminationGracePeriodSeconds (default: 30s)
- If the process has not exited, SIGKILL is sent, with no further negotiation
One important subtlety: the endpoint removal and SIGTERM happen at the same time, but iptables rules across cluster nodes update asynchronously. In practice, new requests can still be routed to a terminating pod for a few seconds after SIGTERM arrives. A correct Graceful Shutdown implementation must account for this gap, which we will cover in the Kubernetes configuration section.
Project Structure
graceful-shutdown-hono-kubernetes/
├── app/
│   ├── index.js
│   ├── package.json
│   └── Dockerfile
└── k8s/
    ├── deployment.yaml
    └── service.yaml
Part 1: Demonstrating the Problem
Before implementing a fix, we first make the problem visible. The application below has no signal handling: this is the "before" state.
The Hono App
// app/index.js
import { Hono } from 'hono';
import { serve } from '@hono/node-server';

const app = new Hono();

// Simulate slow work (e.g. a DB query or external API call)
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

app.get('/health', (c) => c.json({ status: 'ok' }));

app.get('/slow', async (c) => {
  console.log(`[${new Date().toISOString()}] /slow received - starting 5s work...`);
  await sleep(5000); // Simulate 5 seconds of work
  console.log(`[${new Date().toISOString()}] /slow completed`);
  return c.json({ message: 'Work done!' });
});

// No SIGTERM handler: Node.js exits immediately when the pod is killed
// Any in-flight request to /slow will be dropped with no response
const server = serve({ fetch: app.fetch, port: 3000 }, () => {
  console.log(`Server started on port 3000`);
});
// app/package.json
{
  "name": "graceful-shutdown-hono-kubernetes",
  "version": "1.0.0",
  "license": "MIT",
  "type": "module",
  "main": "index.js",
  "scripts": {
    "start": "node index.js"
  },
  "dependencies": {
    "@hono/node-server": "^1.19.11",
    "hono": "^4.12.7"
  }
}
The Dockerfile
Use the exec form of CMD, ["node", "index.js"], not the shell form. This ensures Node.js runs as PID 1 and receives OS signals directly.
Shell form (CMD node index.js) launches Node.js via /bin/sh -c, making the shell PID 1. Signals are sent to the shell, not Node.js, and SIGTERM never reaches your application.
Exec form (CMD ["node", "index.js"]) makes Node.js PID 1. Signals arrive directly. Always use the exec form.
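A quick way to confirm which form is in effect is to log the process ID at startup. The helper below is a hypothetical sketch (describeSignalPath is not part of Node.js or Hono); it only reports what process.pid implies and does not change signal handling.

```javascript
// Startup check: report whether this process is PID 1. A different PID usually
// means a shell or wrapper sits between the container runtime and Node.js.
function describeSignalPath(pid) {
  return pid === 1
    ? 'Node.js is PID 1: signals from the container runtime arrive directly'
    : `Node.js is PID ${pid}: signals may be delivered to a parent process instead`;
}

console.log(describeSignalPath(process.pid));
```

Outside a container the PID will not be 1, so the second message is expected when running locally.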
# app/Dockerfile
# ---- Build Stage ----
FROM node:24.14.0-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install --omit=dev
# ---- Runtime Stage ----
FROM node:24.14.0-alpine
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY . .
RUN chown -R appuser:appgroup /app
USER appuser
EXPOSE 3000
# Exec form: Node.js is PID 1 and receives SIGTERM directly
CMD ["node", "index.js"]
Kubernetes Manifests
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: graceful-app
spec:
  replicas: 1 # Single pod: ensures curl hits the terminating pod with no healthy fallback
  selector:
    matchLabels:
      app: graceful-app
  template:
    metadata:
      labels:
        app: graceful-app
    spec:
      containers:
        - name: graceful-app
          image: graceful-app:v1
          imagePullPolicy: Never # Use local Minikube image
          ports:
            - containerPort: 3000
          # No readinessProbe
          # No lifecycle preStop hook
          # No terminationGracePeriodSeconds

# k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: graceful-app-svc
spec:
  selector:
    app: graceful-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: NodePort
Environment Check
Before running any commands, confirm your local environment is ready.
Check Colima is running:
colima status
# INFO[0000] colima is running using QEMU
# INFO[0000] arch: aarch64
# INFO[0000] runtime: docker
# INFO[0000] mountType: sshfs
# INFO[0000] address:
# INFO[0000] docker context: colima
If Colima is not running, start it first:
colima start
Note: colima start --kubernetes installs k3s inside the Colima VM, which is a separate Kubernetes distribution from Minikube. In this tutorial we use Minikube as our Kubernetes environment, so you do not need the --kubernetes flag when starting Colima. Minikube manages its own cluster independently.
Check Minikube is running:
minikube status
# minikube
# type: Control Plane
# host: Running
# kubelet: Running
# apiserver: Running
# kubeconfig: Configured
If Minikube is not running, start it:
minikube start
Check kubectl is connected to Minikube:
kubectl config current-context
# minikube
If the context is not minikube, switch to it:
kubectl config use-context minikube
Once all three are confirmed, proceed with the build and deploy steps below.
Build & Deploy
Build the image inside Minikube's Docker daemon so it is available locally without a registry. All commands below are run from the project root (graceful-shutdown-hono-kubernetes/).
cd graceful-shutdown-hono-kubernetes
# Start Minikube
minikube start
# Point your shell to Minikube's Docker daemon
eval $(minikube docker-env)
# Build the image
docker build -t graceful-app:v1 ./app
# Deploy
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml
# Confirm pods are running
kubectl get pods -w
# NAME READY STATUS RESTARTS AGE
# graceful-app-5d4f9b7c6-abc12 1/1 Running 0 12s
Reproducing the Problem
Open three terminal windows.
Terminal 1 - get the service URL:
minikube service graceful-app-svc --url
# http://127.0.0.1:51234
This command creates a tunnel from your Mac to the Service inside the Minikube VM and prints the local URL. Keep this terminal running while you work in Terminals 2 and 3.
Terminal 2 - send a slow request:
curl http://127.0.0.1:51234/slow
# ... (hanging, waiting for a response) ...
Terminal 3 - while Terminal 2 is waiting, force-delete the pod:
kubectl get pods
# graceful-app-5d4f9b7c6-abc12 1/1 Running
kubectl delete pod graceful-app-5d4f9b7c6-abc12 --grace-period=0 --force
# pod "graceful-app-5d4f9b7c6-abc12" deleted
--grace-period=0 --force bypasses terminationGracePeriodSeconds and sends SIGKILL immediately, making the failure reliably reproducible. This is only used here to demonstrate the problem. In production, Kubernetes always goes through the normal termination sequence with SIGTERM first.
Terminal 2 immediately returns:
curl: (52) Empty reply from server
# or: curl: (56) Recv failure: Connection reset by peer
The request was mid-flight when the pod was killed. The process had no shutdown handling and was given no time to finish, so it stopped immediately. The client received no response.
What Happened: The Termination Sequence
| Step | Event | Result |
|---|---|---|
| 1 | kubectl delete pod triggered | Pod marked Terminating |
| 2 | SIGTERM sent to Node.js (PID 1) | No handler, exits immediately |
| 3 | In-flight /slow requests abandoned | Client receives connection reset |
| 4 | iptables rules update asynchronously | New requests may still reach the dying pod |
Clean Up
kubectl delete -f k8s/
Part 2: Node.js Graceful Shutdown Implementation
Now we fix the application. The goal is to catch SIGTERM, stop accepting new connections, wait for in-flight requests to finish, and then exit.
// app/index.js
import { Hono } from 'hono';
import { serve } from '@hono/node-server';

const app = new Hono();

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

app.get('/health', (c) => c.json({ status: 'ok' }));

app.get('/slow', async (c) => {
  console.log(`[${new Date().toISOString()}] /slow received - starting 5s work...`);
  await sleep(5000);
  console.log(`[${new Date().toISOString()}] /slow completed`);
  return c.json({ message: 'Work done!' });
});

const server = serve({ fetch: app.fetch, port: 3000 }, () => {
  console.log(`[${new Date().toISOString()}] Server started on port 3000`);
});

// Graceful Shutdown handler
const shutdown = (signal) => {
  console.log(`[${new Date().toISOString()}] ${signal} received - starting graceful shutdown`);

  // Step 1: Stop accepting new connections
  // Existing in-flight requests are still handled
  server.close(() => {
    console.log(`[${new Date().toISOString()}] All in-flight requests finished - exiting`);
    // Step 2: Release other resources here if needed
    // e.g. await db.disconnect(), mqConsumer.stop()
    process.exit(0); // Clean exit
  });

  // Step 3: Safety net, force exit if drain takes too long
  // This should be shorter than terminationGracePeriodSeconds in k8s
  setTimeout(() => {
    console.error(`[${new Date().toISOString()}] Forced exit after timeout`);
    process.exit(1);
  }, 25000); // 25s, safely under the default 30s grace period
};

process.on('SIGTERM', () => shutdown('SIGTERM')); // Sent by Kubernetes
process.on('SIGINT', () => shutdown('SIGINT')); // Sent by Ctrl+C locally
Key points:
- serve() from @hono/node-server returns a standard Node.js http.Server instance. Its .close() method behaves identically to Express: it stops accepting new connections and fires the callback once all in-flight requests finish.
- The setTimeout safety net ensures the process does not hang indefinitely if a request is stuck. Set it to a value safely below terminationGracePeriodSeconds.
- SIGINT is handled for local development convenience (Ctrl+C). In Kubernetes, SIGTERM is what matters.
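One possible refinement of the handler is to guard against a second signal arriving while the drain is already in progress, and to cancel the safety timer once the server has closed. The sketch below assumes a hypothetical helper named createShutdown (not part of Hono or Node.js) that takes any object with a close(callback) method; the exit and log options exist only so the logic can be exercised without ending the process.

```javascript
// createShutdown is a hypothetical helper, not a Hono or Node.js API.
// `server` only needs a close(callback) method, like http.Server.
function createShutdown(
  server,
  { timeoutMs = 25000, exit = (code) => process.exit(code), log = console.log } = {}
) {
  let shuttingDown = false;

  return function shutdown(signal) {
    if (shuttingDown) {
      log(`${signal} received again, shutdown already in progress`);
      return false; // ignore repeated signals
    }
    shuttingDown = true;
    log(`${signal} received, starting graceful shutdown`);

    // Safety net: force exit if draining takes too long
    const timer = setTimeout(() => {
      log('Forced exit after timeout');
      exit(1);
    }, timeoutMs);
    timer.unref(); // the timer alone should not keep the process alive

    // Stop accepting new connections; the callback runs once the server has closed
    server.close(() => {
      clearTimeout(timer);
      exit(0);
    });
    return true;
  };
}

// Possible wiring with the server from this article:
// const shutdown = createShutdown(server);
// process.on('SIGTERM', () => shutdown('SIGTERM'));
// process.on('SIGINT', () => shutdown('SIGINT'));
```

Keeping the timeout as a parameter makes it easier to keep the value in step with terminationGracePeriodSeconds.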
Rebuild and Redeploy
eval $(minikube docker-env)
docker build -t graceful-app:v2 ./app
Update deployment.yaml to use v2:
image: graceful-app:v2
# Recreates both the Deployment and Service deleted during Part 1 clean up
kubectl apply -f k8s/
Verify It Works
We need four terminal windows this time.
Terminal 1 - start the Minikube tunnel and keep it running:
minikube service graceful-app-svc --url
# http://127.0.0.1:51234
The tunnel was stopped when you closed Terminal 1 in the Part 1 clean up, so it needs to be restarted here. Keep this terminal running for the duration of the test.
Terminal 2 - stream the pod logs so we can observe the shutdown sequence in real time:
kubectl logs -f $(kubectl get pod -l app=graceful-app -o jsonpath='{.items[0].metadata.name}')
-f follows the log stream. kubectl get pod -l app=graceful-app selects the pod by label so you don't need to copy the exact pod name.
Terminal 3 - send a slow request:
curl http://127.0.0.1:51234/slow
Terminal 4 - delete the pod while it is processing:
kubectl delete pod graceful-app-5d4f9b7c6-abc12
Terminal 3 - this time, it completes:
{"message":"Work done!"}
Terminal 2 - pod logs confirm the clean shutdown sequence:
[2024-01-15T10:00:00.000Z] Server started on port 3000
[2024-01-15T10:00:01.000Z] /slow received - starting 5s work...
[2024-01-15T10:00:01.500Z] SIGTERM received - starting graceful shutdown
[2024-01-15T10:00:06.000Z] /slow completed
[2024-01-15T10:00:06.001Z] All in-flight requests finished - exiting
The application caught SIGTERM mid-request, waited for /slow to finish, and exited cleanly.
Part 3: Kubernetes Configuration
The Node.js implementation alone is not sufficient. Kubernetes must also be configured to give the application enough time to drain, and to stop routing new traffic before shutdown begins.
The Full Deployment Configuration
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: graceful-app
spec:
  replicas: 2 # Scale up to 2: required for a meaningful rolling update test
  selector:
    matchLabels:
      app: graceful-app
  template:
    metadata:
      labels:
        app: graceful-app
    spec:
      terminationGracePeriodSeconds: 30 # Must be longer than your app's drain timeout
      containers:
        - name: graceful-app
          image: graceful-app:v2
          imagePullPolicy: Never
          ports:
            - containerPort: 3000
          # readinessProbe: stops routing traffic to this pod before shutdown
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 1
          # preStop hook: adds a delay before SIGTERM to handle async iptables propagation
          lifecycle:
            preStop:
              exec:
                command: ['sh', '-c', 'sleep 5']
Why Each Setting Matters
terminationGracePeriodSeconds: 30
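This is the total time Kubernetes waits from the start of termination until it force-kills the container with SIGKILL. The countdown starts before the preStop hook runs, so the hook's duration counts against it. It must be longer than the maximum time your application needs to finish draining. The relationship is:
terminationGracePeriodSeconds > preStop sleep + app drain timeout
In our case, 5s (preStop) + 25s (app timeout) = 30s, which equals the grace period and leaves no margin, so the strict inequality is not actually satisfied. In a real deployment, either lower the app timeout (for example to 20s) or raise terminationGracePeriodSeconds (for example to 40) so that a few seconds remain.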
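The inequality is easy to check mechanically. The helper below is a hypothetical sketch (checkShutdownBudget is not a Kubernetes or Node.js API) that adds up the preStop sleep and the app timeout and compares the sum with the grace period. With the values used in this article the sum comes to exactly 30s, so there is no margin left.

```javascript
// checkShutdownBudget is a hypothetical helper for sanity-checking the timing.
function checkShutdownBudget({ gracePeriodSeconds, preStopSeconds, drainTimeoutSeconds }) {
  const needed = preStopSeconds + drainTimeoutSeconds;
  return {
    needed,
    marginSeconds: gracePeriodSeconds - needed,
    ok: gracePeriodSeconds > needed, // strict: some margin must remain
  };
}

// Values used in this article: 5s preStop + 25s app timeout, 30s grace period
console.log(checkShutdownBudget({ gracePeriodSeconds: 30, preStopSeconds: 5, drainTimeoutSeconds: 25 }));
// { needed: 30, marginSeconds: 0, ok: false }

// Lowering the app timeout to 20s leaves a 5s margin
console.log(checkShutdownBudget({ gracePeriodSeconds: 30, preStopSeconds: 5, drainTimeoutSeconds: 20 }));
// { needed: 25, marginSeconds: 5, ok: true }
```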
readinessProbe
The readiness probe tells Kubernetes whether a pod is ready to receive traffic. During a rolling update, a new pod is only added to the Service endpoints once /health succeeds, so requests are not routed to a pod that is still starting. On shutdown, Kubernetes removes a terminating pod from the endpoints on its own; the probe adds to that only if /health starts failing once shutdown begins, which the handler shown in Part 2 does not do.
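The /health route in this article always returns ok, so the probe never fails while the process is alive. One way to make readiness reflect shutdown state is to flip a flag when the signal arrives and return 503 from then on. The names markShuttingDown and healthResponse are hypothetical, and the Hono wiring is shown only as a comment; this is a sketch of the idea, not the article's implementation.

```javascript
// Hypothetical readiness state, flipped once shutdown begins.
let shuttingDown = false;

function markShuttingDown() {
  shuttingDown = true;
}

// Returns the HTTP status and body a /health handler could send.
function healthResponse() {
  return shuttingDown
    ? { status: 503, body: { status: 'shutting_down' } }
    : { status: 200, body: { status: 'ok' } };
}

// Possible wiring in the Hono app from this article (not executed here):
// app.get('/health', (c) => {
//   const { status, body } = healthResponse();
//   return c.json(body, status);
// });
// Call markShuttingDown() at the top of the shutdown handler.
```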
lifecycle.preStop: sleep 5
This is the key to handling the async iptables propagation gap mentioned in Part 1. The preStop hook runs before SIGTERM is sent. By sleeping for 5 seconds, we give the cluster time to propagate the endpoint removal to all nodes before our application starts shutting down. Without this sleep, the app may begin rejecting new connections while the cluster is still routing traffic to it.
The complete shutdown sequence with these settings looks like this:
| Time | Event |
|---|---|
| T+0s | Pod marked Terminating, removed from endpoints |
| T+0s | preStop hook starts (sleep 5) |
| T+0-5s | iptables rules propagate across cluster nodes |
| T+5s | preStop completes, SIGTERM sent to Node.js |
| T+5s | Node.js stops accepting new connections, drains in-flight requests |
| T+5-30s | In-flight requests complete |
| T+30s | SIGKILL sent if process has not exited (safety net) |
Deploy
Apply the updated deployment.yaml. This applies the new Kubernetes configuration (terminationGracePeriodSeconds, readinessProbe, preStop) and scales from 1 to 2 replicas at the same time:
kubectl apply -f k8s/deployment.yaml
# Confirm two pods are running before proceeding
kubectl get pods -w
# NAME READY STATUS RESTARTS AGE
# graceful-app-5d4f9b7c6-abc12 1/1 Running 0 30s
# graceful-app-5d4f9b7c6-def34 1/1 Running 0 30s
Final Verification β Rolling Update
The most realistic test is a rolling update, which is how deployments happen in practice. With replicas: 2, Kubernetes replaces pods one at a time: traffic continues flowing through the surviving pod while the other drains, which is exactly the behaviour we want to verify.
Open three terminal windows.
Terminal 1 - start the Minikube tunnel and keep it running:
minikube service graceful-app-svc --url
# http://127.0.0.1:51234
The tunnel must stay alive throughout the test. If you close this terminal, curl will return 000, not because of a pod issue, but because the tunnel itself is gone.
Terminal 2 - run continuous traffic:
while true; do
curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:51234/slow
sleep 0.5
done
You should see a steady stream of 200 responses:
200
200
200
...
Terminal 3 - trigger a rolling update:
kubectl rollout restart deployment/graceful-app
Keep watching Terminal 2 while the rolling update runs. With the full implementation in place (SIGTERM handler in Node.js, preStop sleep, and readinessProbe), every request should continue returning 200. No errors, no dropped connections, despite pods being replaced underneath the running traffic.
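If you would rather count results than watch them scroll by, the same check can be sketched in Node.js. The probe and summarize functions below are hypothetical helpers, and the URL in the usage comment is the example tunnel address from this article, so yours will differ. The sketch relies on the global fetch available in recent Node.js versions.

```javascript
// Counts outcomes by status code; 'error' marks requests that failed at the network level.
function summarize(results) {
  const counts = {};
  for (const r of results) {
    counts[r] = (counts[r] ?? 0) + 1;
  }
  return counts;
}

// Sends `count` sequential requests and records the outcome of each.
async function probe(url, count) {
  const results = [];
  for (let i = 0; i < count; i++) {
    try {
      const res = await fetch(url);
      await res.arrayBuffer(); // read the full body before recording the status
      results.push(res.status);
    } catch {
      results.push('error'); // e.g. connection reset or refused
    }
  }
  return summarize(results);
}

// Example usage (run while the rolling update is in progress):
// probe('http://127.0.0.1:51234/slow', 20).then(console.log);
```

A result containing only the key 200 corresponds to the steady stream of 200 responses described above; any error entry points to a dropped connection.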
Clean Up
kubectl delete -f k8s/
minikube stop
Wrap Up
In this article, we walked through a complete Graceful Shutdown implementation for a Node.js + Hono application running on Kubernetes: from reproducing the problem with a broken app, to fixing it at the application level, to completing the picture with the right Kubernetes configuration.
As we saw, a correct implementation requires changes on both sides. Neither alone is sufficient.
| Layer | What to implement |
|---|---|
| Node.js + Hono | Handle SIGTERM, call server.close() on the @hono/node-server instance, add a force-exit timeout |
| Dockerfile | Use exec form CMD ["node", "index.js"] so Node.js receives signals |
| Kubernetes | terminationGracePeriodSeconds, readinessProbe, preStop sleep |
The pattern shown here applies to any Node.js service running in Kubernetes. Feel free to adjust the timeout values to match your application's actual drain time, especially for batch or long-running processes, where terminationGracePeriodSeconds may need to be significantly longer than the default 30 seconds.
I hope this article is helpful for you. Keep on building.