ADR-004: DB Unique Constraint over Distributed Locks for Double-Booking Prevention¶
Status: Accepted
Context¶
The system must guarantee that a practitioner's time slot cannot be double-booked, even under concurrent load (90 bookings/second peak, with bursts). Two patients attempting to book the same (practitioner_id, slot_start) pair must result in exactly one confirmation and one conflict error.
The approaches evaluated were:
- PostgreSQL unique partial index — let the database engine enforce the constraint
- Redis distributed lock (SETNX / Redlock) — acquire a lock per
(practitioner_id, slot_start)before writing to the DB - Optimistic concurrency control — version column on the slot; retry on conflict
- Application-level serialisation — single-writer queue per practitioner
Decision¶
Enforce double-booking prevention with a PostgreSQL unique partial index on the bookings table:
CREATE UNIQUE INDEX idx_no_double_book
ON bookings (practitioner_id, slot_start)
WHERE status = 'confirmed';
This index makes it physically impossible for two rows with the same (practitioner_id, slot_start) to both have status = 'confirmed'. The partial index (WHERE status = 'confirmed') means cancelled bookings do not consume a slot — the same slot can be rebooked after a cancellation without any index cleanup.
A Redis pre-check is used as a fast path to reduce contention under load, but it is not the correctness gate:
1. Redis GETBIT availability:{practitioner_id}:{date} {slot_index}
→ If bit is 0 (slot taken), reject early — no DB round-trip needed
2. INSERT INTO bookings ... ON CONFLICT DO NOTHING RETURNING id
→ Unique index enforces correctness; zero rows returned = conflict
Consequences¶
Positive:
- The correctness guarantee lives entirely within the database — no additional distributed infrastructure to deploy, monitor, or reason about.
- No lock expiry edge cases. A distributed lock (Redlock) requires careful TTL tuning: too short and the lock expires during a slow DB write; too long and a crashed lock holder blocks the slot until the TTL expires.
- The partial index is transparent in the query plan and auditable via
EXPLAIN. - Cancellations are handled correctly by design:
UPDATE SET status = 'cancelled'removes the row from the partial index, immediately freeing the slot for rebooking. - The Redis pre-check is additive — it reduces DB contention but the system is correct without it (fallback is automatic if Redis is unavailable).
Negative:
- Under high contention on a popular slot, multiple concurrent
INSERTattempts will compete at the DB level. The unique index resolves conflicts with a serialisation error, but all losing transactions have already consumed a DB round-trip. The Redis pre-check reduces (but does not eliminate) this wasted work. - The constraint is enforced at the DB layer, so application code must handle
UniqueViolation(error code 23505) and translate it into a user-facing "slot no longer available" response.
Why not Redlock: Redis distributed locks introduce a failure mode that the unique index avoids entirely: if the Redis node holding a lock key fails before the TTL expires, no other process can acquire the lock until the TTL elapses. The unique index has no equivalent concept of a "stuck lock". Additionally, Redlock correctness under network partitions is a known area of debate — the unique index correctness is not.
Why not optimistic concurrency: An optimistic version column on the slot would require a read-modify-write cycle for every booking attempt, and retry logic in the application. The unique index provides equivalent safety with a single INSERT — no read is required before the write.
Trade-off accepted: The unique index is the simplest, most auditable, and most operationally robust approach at the projected throughput of 90 bookings/second. At significantly higher write rates (thousands per second to the same hot slot), a pre-serialisation layer would be warranted, but this is not a current requirement.