
Deployment

Helm Chart Structure

All services are deployed from a single shared Helm chart at helm/appointment-service/. Values are layered at deploy time using Helm's last-value-wins precedence:

values.yaml                     # Base defaults for all services
  └── values-{service}.yaml     # Per-service overrides
        └── values-{env}.yaml   # Per-environment overrides (production, staging)
              └── --set flags   # CI-injected: image tag, digest, strategy
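In CI this layering might translate into an invocation like the following (a sketch: the release name and flag values are illustrative; later `-f` files and `--set` flags win over earlier ones):

```shell
helm upgrade --install booking-service helm/appointment-service \
  -f helm/appointment-service/values-booking.yaml \
  -f helm/appointment-service/values-production.yaml \
  --set image.tag="${SHA}" \
  --set image.digest="${DIGEST}"
```

The base values.yaml is loaded implicitly by Helm, so only the per-service and per-environment files need to be passed explicitly.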

Chart Templates

Template Purpose
deployment.yaml Deployment with rolling/canary strategy, probes, security context
hpa.yaml HorizontalPodAutoscaler — CPU 70% + memory 80% targets
service.yaml ClusterIP service
ingress.yaml Optional ingress (disabled by default; enabled per service)
pdb.yaml PodDisruptionBudget — minAvailable: 1 (overridden to 2 in production)
serviceaccount.yaml ServiceAccount with optional IRSA annotation

Per-Service Values Files

File Key Overrides
values-booking.yaml 3 replicas min, HPA 3–20, canary strategy, IRSA annotation for RDS + MSK
values-search.yaml nodeSelector: workload=search, tolerations for the search taint, HPA 3–30, IRSA for OpenSearch SigV4
values-availability.yaml Canary strategy
values-patient.yaml Canary strategy
values-notification.yaml Rolling strategy
values-practitioner.yaml Rolling strategy
values-production.yaml minReplicas: 3, minAvailable: 2 PDB, scale-down stabilisation 600s
values-staging.yaml Reduced replicas and resource limits for cost
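As a sketch, the node pinning described for values-search.yaml could look like this, assuming the chart maps these keys straight onto the pod spec (the toleration mirrors the workload=search:NoSchedule taint in the node group table later on this page):

```yaml
nodeSelector:
  workload: search

tolerations:
  - key: workload
    operator: Equal
    value: search
    effect: NoSchedule
```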

Deployment Strategies

Services are classified by criticality:

Service Strategy Rationale
Booking Service Canary (10% initial) Revenue-critical write path
Availability Service Canary (10% initial) Feeds the booking pre-check
Patient Service Canary (10% initial) User-facing profile data
Search Service Rolling Read-only; stateless; fast rollback acceptable
Notification Service Rolling Async consumer; no synchronous user impact
Practitioner Service Rolling Low-traffic CRUD

Rolling Update

rollingUpdate:
  maxSurge: 1
  maxUnavailable: 0

New pods are added one at a time before old pods are terminated. At no point is capacity reduced below the requested replica count.
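Rendered into the Deployment spec, this corresponds to the stanza:

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # at most one extra pod is created during the rollout
      maxUnavailable: 0  # never drop below the desired replica count
```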

Canary Deployment Flow

The canary flow runs entirely within the production GitHub Actions job:

  1. A separate Helm release ({service}-canary) is created with a reduced replica count and a canary.weight annotation routing N% of traffic to it via the ingress/service-mesh layer.
  2. A 5-minute observation window (sleep 300) allows error rate monitoring.
  3. If no manual rollback is triggered, the stable release is upgraded to the new image.
  4. The canary release is deleted with helm uninstall {service}-canary.

# Step 1 — deploy canary
helm upgrade booking-service-canary helm/appointment-service \
  --set canary.enabled=true \
  --set canary.weight=10 \
  --set image.tag="${SHA}" \
  --atomic --timeout 10m

# Step 2 — observation window for error-rate monitoring
sleep 300

# Step 3 — promote to stable
helm upgrade booking-service helm/appointment-service \
  --set canary.enabled=false \
  --set image.tag="${SHA}" \
  --atomic --timeout 15m

# Step 4 — clean up
helm uninstall booking-service-canary

Pod Security Configuration

Security defaults are set in values.yaml and apply to all services. Per-service files may only tighten, not loosen, these settings.

podSecurityContext:
  runAsNonRoot: true
  runAsUser: 1000      # matches appuser UID in the Dockerfile
  runAsGroup: 1000
  fsGroup: 1000
  seccompProfile:
    type: RuntimeDefault

securityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop:
      - ALL

Two emptyDir volumes (/tmp and /app/.cache) are mounted to satisfy services that need writable scratch space while keeping the root filesystem read-only.
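A sketch of how those scratch volumes could be wired up (the volume names are illustrative, not taken from the chart):

```yaml
volumes:
  - name: tmp
    emptyDir: {}
  - name: app-cache
    emptyDir: {}

# In the container spec:
volumeMounts:
  - name: tmp
    mountPath: /tmp
  - name: app-cache
    mountPath: /app/.cache
```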


Horizontal Pod Autoscaler

HPA is enabled for all services with dual-metric scaling:

Metric Target
CPU utilization 70%
Memory utilization 80%

Scale-up: 2 pods per 60-second window (fast response to traffic spikes). Scale-down: 1 pod per 60-second window, with a 300-second stabilisation window (600s in production) to prevent flapping.
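In autoscaling/v2 terms, the scaling policy above might be expressed as follows (values taken from the prose; production overrides the stabilisation window to 600):

```yaml
behavior:
  scaleUp:
    policies:
      - type: Pods
        value: 2           # add up to 2 pods…
        periodSeconds: 60  # …per 60-second window
  scaleDown:
    stabilizationWindowSeconds: 300  # 600 in production
    policies:
      - type: Pods
        value: 1
        periodSeconds: 60
```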

No CPU limit is set — CPU throttling causes latency spikes. The HPA handles horizontal scaling before a node becomes CPU-saturated. Memory limits are enforced to prevent OOM cascades.
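A resources block following this policy might look like the sketch below (the request and limit numbers are illustrative, not taken from the chart):

```yaml
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    memory: 512Mi   # memory limit only; no cpu limit, to avoid throttling
```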


Docker Build Strategy

All services share docker/Dockerfile. The build accepts a SERVICE_DIR argument so the same file serves the entire monorepo.

Stages

Stage Base Purpose
base node:22-bookworm-slim OS security patches, tini, appuser (UID 1000)
deps base npm ci --ignore-scripts — cached until package-lock.json changes
builder deps TypeScript compile (npm run build); prune devDependencies
release base Copy only dist/, production node_modules, package.json — ~50 MB final image
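The four stages could be sketched roughly as follows — an illustration of the table above, not the actual docker/Dockerfile; the entrypoint path, copy layout, and exact commands are assumptions:

```dockerfile
FROM node:22-bookworm-slim AS base
# OS security patches, tini as init, non-root user
RUN apt-get update && apt-get upgrade -y \
 && apt-get install -y --no-install-recommends tini \
 && rm -rf /var/lib/apt/lists/* \
 && useradd --create-home --uid 1000 appuser

FROM base AS deps
ARG SERVICE_DIR
WORKDIR /app
COPY ${SERVICE_DIR}/package.json ${SERVICE_DIR}/package-lock.json ./
# Layer is cached until package-lock.json changes
RUN npm ci --ignore-scripts

FROM deps AS builder
ARG SERVICE_DIR
COPY ${SERVICE_DIR}/ ./
RUN npm run build && npm prune --omit=dev

FROM base AS release
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
USER appuser
ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["node", "dist/main.js"]
```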

Key Practices

  • tini as PID 1 — ensures correct SIGTERM forwarding and zombie process reaping
  • --ignore-scripts during npm ci — prevents malicious postinstall hooks in CI
  • No shell or package manager in the final image (a distroless alternative is documented in a comment in the Dockerfile)
  • OCI image labels — org.opencontainers.image.* with build date, git commit, branch
  • Digest pinning — CI passes --set image.digest so Kubernetes pulls by digest, not by mutable tag
  • SBOM + provenance — docker/build-push-action generates attestations on every push to ECR

EKS Node Group Layout

Node Group Instance Type Taint Workloads
general m6i.xlarge None Booking, Availability, Patient, Practitioner, Notification
search r6i.xlarge workload=search:NoSchedule Search Service only

Pod anti-affinity (preferred) spreads pods across AZs. A topology spread constraint (maxSkew: 1 on kubernetes.io/hostname) prevents all replicas from landing on the same node.
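The spread behaviour described here might map onto the pod spec like this (the label selector is illustrative):

```yaml
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app.kubernetes.io/name: booking-service
```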