# Deployment

## Helm Chart Structure

All services are deployed from a single shared Helm chart at `helm/appointment-service/`. Values are layered at deploy time using Helm's last-value-wins precedence:

```text
values.yaml                     # Base defaults for all services
└── values-{service}.yaml       # Per-service overrides
    └── values-{env}.yaml       # Per-environment overrides (production, staging)
        └── --set flags         # CI-injected: image tag, digest, strategy
```
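A CI deploy step applying this layering might look like the following sketch (service and environment names are chosen for illustration; the base `values.yaml` is loaded by Helm automatically and the later `-f` files and `--set` flags override it):

```shell
helm upgrade --install booking-service helm/appointment-service \
  -f helm/appointment-service/values-booking.yaml \
  -f helm/appointment-service/values-production.yaml \
  --set image.tag="${GIT_SHA}" \
  --set image.digest="${IMAGE_DIGEST}"
```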
### Chart Templates

| Template | Purpose |
|---|---|
| `deployment.yaml` | Deployment with rolling/canary strategy, probes, security context |
| `hpa.yaml` | HorizontalPodAutoscaler — CPU 70% + memory 80% targets |
| `service.yaml` | ClusterIP service |
| `ingress.yaml` | Optional ingress (disabled by default; enabled per service) |
| `pdb.yaml` | PodDisruptionBudget — `minAvailable: 1` (overridden to 2 in production) |
| `serviceaccount.yaml` | ServiceAccount with optional IRSA annotation |
### Per-Service Values Files

| File | Key Overrides |
|---|---|
| `values-booking.yaml` | 3 replicas min, HPA 3–20, canary strategy, IRSA annotation for RDS + MSK |
| `values-search.yaml` | `nodeSelector: workload=search`, tolerations for the search taint, HPA 3–30, IRSA for OpenSearch SigV4 |
| `values-availability.yaml` | Canary strategy |
| `values-patient.yaml` | Canary strategy |
| `values-notification.yaml` | Rolling strategy |
| `values-practitioner.yaml` | Rolling strategy |
| `values-production.yaml` | `minReplicas: 3`, `minAvailable: 2` PDB, scale-down stabilisation 600s |
| `values-staging.yaml` | Reduced replicas and resource limits for cost |
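As an illustration, a per-service file such as `values-booking.yaml` might carry its overrides along these lines (the key names and the role ARN are assumptions based on common chart conventions, not taken from the chart itself):

```yaml
replicaCount: 3
strategy: canary
hpa:
  minReplicas: 3
  maxReplicas: 20
serviceAccount:
  annotations:
    # IRSA role granting RDS + MSK access (placeholder ARN)
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/booking-service
```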
## Deployment Strategies
Services are classified by criticality:
| Service | Strategy | Rationale |
|---|---|---|
| Booking Service | Canary (10% initial) | Revenue-critical write path |
| Availability Service | Canary (10% initial) | Feeds the booking pre-check |
| Patient Service | Canary (10% initial) | User-facing profile data |
| Search Service | Rolling | Read-only; stateless; fast rollback acceptable |
| Notification Service | Rolling | Async consumer; no synchronous user impact |
| Practitioner Service | Rolling | Low-traffic CRUD |
### Rolling Update
New pods are added one at a time before old pods are terminated. At no point is capacity reduced below the requested replica count.
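In Deployment terms, that behaviour corresponds to a rolling-update spec like this sketch (one surge pod, zero unavailable):

```yaml
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1        # add one new pod at a time
    maxUnavailable: 0  # never drop below the requested replica count
```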
### Canary Deployment Flow

The canary flow runs entirely within the production GitHub Actions job:

1. A separate Helm release (`{service}-canary`) is created with a reduced replica count and a `canary.weight` annotation routing N% of traffic to it via the ingress/service-mesh layer.
2. A 5-minute observation window (`sleep 300`) allows error-rate monitoring.
3. If no manual rollback is triggered, the stable release is upgraded to the new image.
4. The canary release is deleted with `helm uninstall {service}-canary`.
```shell
# Step 1 — deploy canary (--install creates the separate release on first run)
helm upgrade --install booking-service-canary helm/appointment-service \
  --set canary.enabled=true \
  --set canary.weight=10 \
  --set image.tag="${SHA}" \
  --atomic --timeout 10m

# Step 2 — observation window for error-rate monitoring
sleep 300

# Step 3 — promote to stable
helm upgrade booking-service helm/appointment-service \
  --set canary.enabled=false \
  --set image.tag="${SHA}" \
  --atomic --timeout 15m

# Step 4 — clean up
helm uninstall booking-service-canary
```
## Pod Security Configuration
Security defaults are set in values.yaml and apply to all services. Per-service files may only tighten, not loosen, these settings.
```yaml
podSecurityContext:
  runAsNonRoot: true
  runAsUser: 1000     # matches appuser UID in the Dockerfile
  runAsGroup: 1000
  fsGroup: 1000
  seccompProfile:
    type: RuntimeDefault

securityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop:
      - ALL
```
Two emptyDir volumes (/tmp and /app/.cache) are mounted to satisfy services that need writable scratch space while keeping the root filesystem read-only.
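A minimal sketch of how those scratch mounts can be wired up (volume names assumed; `volumes` belongs to the pod spec and `volumeMounts` to the container spec):

```yaml
volumes:
  - name: tmp
    emptyDir: {}
  - name: app-cache
    emptyDir: {}

volumeMounts:
  - name: tmp
    mountPath: /tmp
  - name: app-cache
    mountPath: /app/.cache
```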
## Horizontal Pod Autoscaler
HPA is enabled for all services with dual-metric scaling:
| Metric | Target |
|---|---|
| CPU utilization | 70% |
| Memory utilization | 80% |
Scale-up: 2 pods per 60-second window (fast response to traffic spikes). Scale-down: 1 pod per 60-second window, with a 300-second stabilisation window (600s in production) to prevent flapping.
No CPU limit is set — CPU throttling causes latency spikes. The HPA handles horizontal scaling before a node becomes CPU-saturated. Memory limits are enforced to prevent OOM cascades.
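An `autoscaling/v2` HPA expressing these targets and windows might look like this sketch (resource names are hypothetical):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: booking-service            # hypothetical target
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: booking-service
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleUp:
      policies:
        - type: Pods
          value: 2                 # up to 2 pods per window
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300   # 600 in production
      policies:
        - type: Pods
          value: 1                 # at most 1 pod removed per window
          periodSeconds: 60
```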
## Docker Build Strategy

All services share `docker/Dockerfile`. The build accepts a `SERVICE_DIR` argument so the same file serves the entire monorepo.
### Stages

| Stage | Base | Purpose |
|---|---|---|
| `base` | `node:22-bookworm-slim` | OS security patches, tini, `appuser` (UID 1000) |
| `deps` | `base` | `npm ci --ignore-scripts` — cached until `package-lock.json` changes |
| `builder` | `deps` | TypeScript compile (`npm run build`); prune devDependencies |
| `release` | `base` | Copy only `dist/`, production `node_modules`, `package.json` — ~50 MB final image |
### Key Practices

- `tini` as PID 1 — ensures correct SIGTERM forwarding and zombie-process reaping
- `--ignore-scripts` during `npm ci` — prevents malicious postinstall hooks in CI
- No shell or package manager in the final image (a distroless alternative is documented in a comment in the Dockerfile)
- OCI image labels — `org.opencontainers.image.*` with build date, git commit, branch
- Digest pinning — CI passes `--set image.digest` so Kubernetes pulls by digest, not by mutable tag
- SBOM + provenance — `docker/build-push-action` generates attestations on every push to ECR
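A condensed sketch of what such a shared multi-stage Dockerfile could look like (stage wiring only; the copy paths and entrypoint command are assumptions, and the real file carries additional hardening):

```dockerfile
FROM node:22-bookworm-slim AS base
RUN apt-get update && apt-get upgrade -y && apt-get install -y --no-install-recommends tini \
    && useradd --uid 1000 --create-home appuser

FROM base AS deps
ARG SERVICE_DIR
WORKDIR /app
COPY ${SERVICE_DIR}/package*.json ./
RUN npm ci --ignore-scripts            # no postinstall hooks run in CI

FROM deps AS builder
ARG SERVICE_DIR
COPY ${SERVICE_DIR}/ .
RUN npm run build && npm prune --omit=dev

FROM base AS release
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
USER appuser
ENTRYPOINT ["tini", "--"]
CMD ["node", "dist/main.js"]           # entrypoint path assumed
```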
## EKS Node Group Layout

| Node Group | Instance Type | Taint | Workloads |
|---|---|---|---|
| `general` | `m6i.xlarge` | None | Booking, Availability, Patient, Practitioner, Notification |
| `search` | `r6i.xlarge` | `workload=search:NoSchedule` | Search Service only |
Pod anti-affinity (preferred) spreads pods across AZs. A topology spread constraint (`maxSkew: 1` on `kubernetes.io/hostname`) prevents all replicas from landing on the same node.
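The placement rules described above might be expressed in a pod spec like this sketch (label values assumed; the `nodeSelector` and `tolerations` apply to the Search Service only):

```yaml
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: search-service          # hypothetical label

# Search Service only: pin to the tainted r6i.xlarge node group
nodeSelector:
  workload: search
tolerations:
  - key: workload
    operator: Equal
    value: search
    effect: NoSchedule
```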