FamilySafe

The product

FamilySafe is a zero-knowledge, end-to-end encrypted digital vault for personal records — finance, legal, identity, health, digital assets. Users store cards organised by category, share them selectively with family, guests and professional advisors, and nominate executors who can unlock a "probate" subset of the vault only after a verified death — and only if a configurable threshold of executors collaborate.

The product's defining promise is uncompromising: the FamilySafe servers cannot decrypt vault contents under any circumstance — not with a database dump, not under court order. The release rules are enforced cryptographically, not by policy.

That promise is also what made FamilySafe a high-risk build. If the cryptographic design didn't work end-to-end in the browser, in production, the product had no business case.

This is how we took it from idea to live product across our three engagement stages.

With the riskiest feature de-risked, Build turned the roadmap into a product, one feature at a time, in front of the client.

How it ran day to day

Every feature followed the same loop:

Scope the next slice from the Prove roadmap.
Build in real production-grade code.
Push to the live preview environment — a private URL running 24/7 that the client and their stakeholders could use the moment a feature landed.
Sign off in the preview, with the product board updated live.
Move to the next slice.

The client never had to ask "what are you working on?" or "when can I see it?" — both questions were answered in real time by the preview URL and the live board.

The feature slices we shipped

The full FamilySafe platform was built as a sequence of slices, each shippable and reviewable independently:

Cryptographic core

Browser-side key derivation (Argon2id → K0), random VMK generation, asymmetric keypair for sharing.
Per-card K1 encryption, category-level KGroup, vault-wide KGlobal, probate-level KProbate.
Client-side storage policy enforced: K0 lives in RAM only, the VMK in SessionStorage, and the JWT in HttpOnly cookies, and no secret is ever written to a different store.
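
To make the first and third points concrete, here is a minimal sketch rather than the production code. The real derivation runs in the browser with Argon2id; Python's standard library has no Argon2id, so PBKDF2-HMAC-SHA256 is swapped in purely as a stand-in. The function names, the 32-byte key sizes, the iteration count and the location labels in STORAGE_POLICY are all assumptions made for illustration.

```python
import hashlib
import secrets

# Assumed location labels; the source only names RAM, SessionStorage and HttpOnly cookies.
STORAGE_POLICY = {
    "K0": "memory",
    "VMK": "sessionStorage",
    "JWT": "httpOnlyCookie",
}


def derive_k0(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte key from the password.

    Stand-in only: FamilySafe uses Argon2id; PBKDF2 is used here because
    it is available in the standard library.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


def new_vmk() -> bytes:
    """Generate a random 32-byte vault master key (size is an assumption)."""
    return secrets.token_bytes(32)


def storage_allowed(secret_name: str, location: str) -> bool:
    """Return True only if the secret is being placed in its one permitted store."""
    return STORAGE_POLICY.get(secret_name) == location
```

The policy check is deliberately an exact match: anything not listed for a secret is refused, so a new storage location has to be added on purpose.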

Authentication and 2FA

Two-stage login: password validates, then a separate factor unlocks. Vault decryption gated behind both.
Four real factors shipped: Passkey (recommended), TOTP, Email OTP, Recovery Codes.
DB-backed endpoint authorisation — every API route loaded into an EndpointCache at boot; routes not seeded are blocked at the middleware.
Non-live DEV_PASSWORD_ONLY bypass for testing, gated to localhost/preview hosts and a config flag, so it cannot activate on live.
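
The default-deny behaviour of the endpoint cache and the double gate on the bypass can be sketched roughly as follows. This is an illustration under assumptions, not the shipped middleware: the EndpointRule fields, the status codes returned, and the host names are hypothetical. Only EndpointCache and DEV_PASSWORD_ONLY come from the build itself.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class EndpointRule:
    method: str
    path: str
    requires_2fa: bool  # hypothetical column, for illustration only


class EndpointCache:
    """Loaded once at boot from the seeded rows; anything absent is blocked."""

    def __init__(self, seeded_rules: Iterable[EndpointRule]) -> None:
        self._rules = {(r.method.upper(), r.path): r for r in seeded_rules}

    def authorise(self, method: str, path: str, has_2fa: bool) -> int:
        rule = self._rules.get((method.upper(), path))
        if rule is None:
            return 403  # route was never seeded: default deny (status code assumed)
        if rule.requires_2fa and not has_2fa:
            return 401  # second factor not yet completed (status code assumed)
        return 200


# Hypothetical host allowlist for the non-live bypass.
BYPASS_HOSTS = frozenset({"localhost", "preview.example.test"})


def dev_bypass_allowed(host: str, dev_password_only: bool) -> bool:
    """Both the config flag and the host check must pass."""
    return dev_password_only and host in BYPASS_HOSTS
```

Because the bypass needs the flag and an allowlisted host together, a flag left on by mistake still does nothing on a live hostname.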

Vault and sharing

Categories, cards, fields, attachments, reminders, profile photos.
Sharing model with three table primitives (key_templates / key_user_defaults / key_share) covering single-card, category, and global shares.
Family member, guest, and advisor invite flows — including the asynchronous case where a recipient is invited before they have an account, with the owner finalising the encryption on next login (cipher/iv/tag populated then).
Delegated VMK ("elderly assist") — a deliberate exception for write-delegation, designed and labelled as such.

Probate and executor management

Executor management screen with add/remove/resend, threshold configuration (k of n), and Generate Keys regeneration with old-share revocation.
Executor portal — case list, case detail, my-share retrieval, assembly session, audit log view.
Death-report submission with admin verification queue.
Hard-blocked /kc endpoint — the gate that enforces "cannot decrypt before death is verified" at the network layer, not just the UI.
Rekey on every executor change — re-generate PK, re-encrypt probate cards, re-split shares, revoke old, rotate kc_enc.
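
For readers unfamiliar with k-of-n thresholds, one common way to achieve them is Shamir's secret sharing, sketched below. This write-up does not state which scheme FamilySafe uses, so treat the choice of Shamir, the prime field, and the function names as assumptions. The point is only to show how any k shares rebuild a secret while fewer than k do not.

```python
import secrets

# 2**521 - 1 is a Mersenne prime, comfortably larger than any 32-byte secret.
PRIME = 2**521 - 1


def split_secret(secret: int, k: int, n: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares; any k of them can reconstruct it."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range for the field")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

Under this scheme, a rekey on executor change amounts to generating a fresh secret and calling split_secret again with the new k and n; shares from the old polynomial reveal nothing about the new one.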

Billing and payments

GoCardless integration with a deliberate division of responsibility: FamilySafe is the source of truth and GoCardless is only the collection rail. No recurring schedules live in GoCardless; every charge is anchored to a billing_anchor_at that never drifts.
Daily billing CronJob on Kubernetes, with double-billing protection via a last_billed_at check.
Webhook-driven payment lifecycle: payments.confirmed, failed, cancelled, paid_out, plus mandate lifecycle. Signature validation done correctly: HMAC-SHA256 over the raw body, lowercase hex, constant-time compare.
One-off payments via Billing Requests + hosted page for upgrades like Lifetime.
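
The signature check described above follows a well-known pattern, shown here as a minimal sketch. The function names are illustrative, and how the secret and the received signature reach the handler is assumed.

```python
import hashlib
import hmac


def compute_signature(webhook_secret: bytes, raw_body: bytes) -> str:
    """HMAC-SHA256 over the raw request body, as lowercase hex."""
    return hmac.new(webhook_secret, raw_body, hashlib.sha256).hexdigest()


def is_valid_webhook(
    webhook_secret: bytes, raw_body: bytes, received_signature: str
) -> bool:
    """Compare in constant time so timing does not leak how many characters matched."""
    expected = compute_signature(webhook_secret, raw_body)
    return hmac.compare_digest(
        expected.encode("ascii"), received_signature.encode("utf-8")
    )
```

The body must be the exact bytes received. Parsing the JSON and serialising it again before hashing can change whitespace or key order and make a genuine webhook fail validation.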

Email infrastructure

Database-backed queue with a stateless worker pod model: workers carry no DB credentials and call internal-only API endpoints to claim work, render templates with Scriban, send via the SMTP relay (Brevo), and report results.
Crash-safe processing — rows stuck in Processing past LOCK_TIMEOUT_MINUTES are requeued by sp_email_queue_requeue_stuck and may be claimed by another worker.
Templates ship inside the worker image, versioned with releases — prevents the "what version of the template went out yesterday?" problem.
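
The crash-safety rule can be pictured with a small in-memory model. This is a sketch of the idea only: in the real system the queue lives in the database and the requeue is done by sp_email_queue_requeue_stuck. The row fields, the "Pending" status name and the 10-minute timeout value are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

LOCK_TIMEOUT_MINUTES = 10  # assumed value; the real setting is configuration


@dataclass
class EmailRow:
    id: int
    status: str = "Pending"
    locked_at: Optional[datetime] = None
    locked_by: Optional[str] = None


def claim_next(rows: list[EmailRow], worker_id: str, now: datetime) -> Optional[EmailRow]:
    """Hand the first pending row to a worker and stamp the lock time."""
    for row in rows:
        if row.status == "Pending":
            row.status = "Processing"
            row.locked_at = now
            row.locked_by = worker_id
            return row
    return None


def requeue_stuck(rows: list[EmailRow], now: datetime) -> int:
    """Return rows locked longer than the timeout to Pending; report how many."""
    cutoff = now - timedelta(minutes=LOCK_TIMEOUT_MINUTES)
    requeued = 0
    for row in rows:
        if row.status == "Processing" and row.locked_at is not None and row.locked_at < cutoff:
            row.status = "Pending"
            row.locked_at = None
            row.locked_by = None
            requeued += 1
    return requeued
```

If a worker pod dies mid-send, its row simply ages past the timeout and becomes claimable again, so no worker needs to know that another one crashed.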

Backups and disaster recovery

Kubernetes CronJob that uses mysqldump --single-transaction --hex-blob (the --hex-blob flag is mandatory for FamilySafe's BINARY(16) UUIDs and encrypted VARBINARY columns; without it, restored binary data can be corrupted).
Output gzipped, uploaded to Wasabi S3 (eu-west-3), retained 14 days. Live cadence: every 2 hours, full database.
Read-only backup_user MySQL account; Wasabi credentials are write-only; both in a single Kubernetes Secret.
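
As a rough sketch of the job's two moving parts, the snippet below builds the dump command and works out which stored backups fall outside the 14-day window. The two mysqldump flags and the backup_user account are from the build; the helper names, the host and database arguments, and the idea that retention is computed in code (rather than, say, by a bucket lifecycle rule) are assumptions.

```python
from datetime import datetime, timedelta

RETENTION_DAYS = 14


def build_dump_command(database: str, host: str, user: str = "backup_user") -> list[str]:
    """Argument list for a consistent, binary-safe dump (password supplied separately)."""
    return [
        "mysqldump",
        "--single-transaction",  # consistent snapshot without locking InnoDB tables
        "--hex-blob",            # write BINARY/VARBINARY/BLOB values as hex literals
        f"--host={host}",
        f"--user={user}",
        database,
    ]


def expired_backups(objects: dict[str, datetime], now: datetime) -> list[str]:
    """Return the keys of backups older than the retention window, sorted by name."""
    cutoff = now - timedelta(days=RETENTION_DAYS)
    return sorted(key for key, uploaded_at in objects.items() if uploaded_at < cutoff)
```

Returning the command as a list rather than a shell string avoids quoting problems when it is passed to a process runner, and the output can then be piped through gzip before upload.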

Admin and operations

Vault administration screen with family/advisor/pending-invite tabs.
User management with admin-controlled MFA disable (audited via sp_add_auth_security_event).
Activity log + event-log viewer.
Profile administration with full self-service 2FA management.

Transparency artifacts the client kept throughout Build

Live preview URL — every feature visible the moment it merged.
Live product board — done, in progress, queued, planned. No status meetings.
Wiki of architecture decisions — owned by the client, written as the work happened.
Endpoint-by-endpoint cross-reference — every screen mapped to its API calls and stored procedures, so the client could audit complexity in any area without reading code.

Outcome

A production-deployable FamilySafe — fully encrypted, fully tested, fully auditable — handed off to a Kubernetes cluster on Civo with Traefik v2 ingress, Cert-Manager managing Let's Encrypt certs, and DNS pointing at a single LoadBalancer IP serving four domains.

The client knew, the day Build ended, exactly what was running and exactly how it worked — because they had been watching it grow for months.

FamilySafe is live. We run it. Same feature-led philosophy — small features, short cycles, continuous progress — alongside everything that keeps the product alive and growing.

What we run, every day

Infrastructure

The full Kubernetes cluster on our managed estate — namespace, ingress, certificates, deployments for API, vault frontend, marketing website, email worker, and MySQL.
Cert-Manager + Let's Encrypt rotation for www.familysafe.co.uk, api.familysafe.co.uk, vault.familysafe.co.uk, and the company brand domain.
Wasabi-backed offsite backups on a 2-hour live cadence (full database), with a documented restore-to-test procedure run on a schedule.

Operational hygiene

Continuous security patching (base images, .NET runtime, Node, MySQL, kubectl, the lot).
MySQL service-account token rotation for ops access.
Brevo SMTP credential rotation tied to ops events.
Webhook-secret rotation between sandbox and live.

Continuous product growth — feature-led, same as Build

Recent and in-flight slices since launch:

AI Chatbot — Phase 1 — an in-product FamilySafe Assistant that answers user questions grounded in their own vault context, with the same zero-knowledge respect as the rest of the product (no plaintext leaves the user's cryptographic boundary).
Endpoint caching — recent merge to reduce p95 latency on hot routes.
2FA mail copy refinement — small, frequent UX improvements driven by real user feedback from live.

Each one ships to the preview environment first, gets signed off, then promotes to live — same loop as Build, just continuous.

Monitoring and dashboards

Pod health, ingress error rates, MySQL slow-query surfacing.
Backup-success monitoring with paged alerts on missed runs.
Email queue age and failure-rate dashboards (oldest pending, retry rates by category).
Webhook delivery health (signature failures, replay attempts, duplicate handling).

The chatbot we own

Prompt tuning, content updates, escalation flows, and conversation review — all part of the retainer, no separate "AI work" line item.
New AI features added as feature slices when they make sense (the chatbot is Phase 1; phases beyond are scoped from real usage data, not speculation).

Where Grow is heading next for FamilySafe

AI marketing agents — lead generation, outreach, content drafting, social scheduling — drawing on the same anonymised, aggregated product data we already monitor.
AI-tuned business dashboards — package conversion, churn signals, cost per active vault, agent performance.
AWS cost engineering as we migrate select workloads.

What this case study proves

Prove works because we build the riskiest thing first.

FamilySafe's riskiest thing — a server that genuinely cannot decrypt its own customer data — is also the thing that makes it a viable product. Building that first meant the client never paid to discover, six months in, that the foundation didn't hold.

Build works because the client never has to ask "where is it?"

Every feature went to the preview environment the day it merged. Every status was on the board, in real time. The client has the entire product wiki on their own infrastructure. There were no surprises — and no $50k bills for "discovery work" that never materialised.

Grow works because the team that built it is the team that runs it.

There was no handover gap, no "we'll need to ramp up on the codebase" delay. Day one of Grow looked exactly like day N of Build: small features, short cycles, shipped to preview, signed off, promoted. The product gets faster, smarter, and bigger every month, and the client builds none of it.

Stage scope summary, against this product

Stage	What we delivered for FamilySafe
Prove	End-to-end working death-unlock crypto in production code; full architecture; data model; risk analysis; shippable roadmap; client-owned spec.
Build	Full vault platform — auth/2FA, vault encryption, sharing, probate, executor management, billing, email infrastructure, backups, admin tools — built feature by feature, visible in preview throughout.
Grow	Live infrastructure on our estate; continuous security and patching; ongoing feature slices (AI chatbot, performance, UX); monitoring and dashboards; chatbot ownership; planned AI marketing agents and AWS cost engineering.

Investment shape

Prove

Up to £15,000. Fixed, scoped before work starts.

Build

Scoped from the Prove output, fixed before work starts.

Grow

Monthly retainer, sized to the product.

For FamilySafe-scale products — multiple cryptographic primitives, payment integration, asynchronous worker infrastructure, multi-tenant sharing, a probate workflow with regulatory implications, live infrastructure on the partner's estate — Grow is sized at a level that costs less than hiring a single mid-level engineer and delivers an entire team plus the platform they run on.

Want to talk?

If your product has a riskiest feature you'd rather not discover the truth about in month nine — that's a Prove engagement.

If you've got the proof and you want to ship without status meetings or ten-page weekly reports — that's a Build engagement.

If you want a team that builds the thing, runs the thing, and grows the thing — without you ever owning a Kubernetes cluster — that's Grow.

Start a conversation
