Shipped 2026-05-10
cf-for-saas / caddy decommission · production cutover

/ Customer custom domains
moved off Caddy. CF edge takes over.

Three hours from git checkout -b to live: cf-saas-decommission shipped end-to-end on production. Caddy GCE VM stopped, customer hostnames now terminate TLS at Cloudflare edge with auto-issued Google CA certs, and the dispatcher worker (already serving *.vibehost.space) gained a custom-domain branch with Hard‑Rule‑#8 access‑gate enforcement.

Two real customer hostnames, lacoste.projectd.cc and cpbl.ladykaren.org, now serve real content (they previously 404'd because the API middleware was never wired in prod). Plus, a fresh-bind E2E proves the same flow works for new prod users: add hostname → set 1 CNAME + 1 TXT → click verify → cert active in ~1 min.

Phases shipped: 7 / 7 (P1–P7, plus 2 follow-up issues)
Customer hostnames live: 3 (2 backfilled + 1 fresh-bind E2E)
Caddy GCE VMs running: 0 (caddy-proxy: TERMINATED)
Cert provisioning: ~1–3 min (Google CA · auto issue / renew)
· · · · ·

01 Before / After

The two existing prod custom domains were silently broken — Caddy was on the path but the API custom-domain-proxy middleware was never enabled in production (CUSTOM_DOMAIN_PROXY_ENABLED defaulted to false). After cutover, requests terminate at CF edge with an auto-issued cert and route through the dispatcher Worker.
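The failure mode above can be reproduced in miniature. The sketch below assumes the flag is read as a string environment variable and counts as enabled only when it equals "true"; that parsing rule and the `startupWarnings` helper are hypothetical and not taken from the API's code. It shows one way a startup guard could surface the mismatch between verified domains and a disabled proxy.

```typescript
// Hypothetical env shape; only the name CUSTOM_DOMAIN_PROXY_ENABLED comes from this brief.
type Env = Record<string, string | undefined>;

// Assumed parsing rule: unset (or anything other than "true") means disabled,
// which matches "defaulted to false" in production.
function isCustomDomainProxyEnabled(env: Env): boolean {
  return env["CUSTOM_DOMAIN_PROXY_ENABLED"] === "true";
}

// Hypothetical startup guard: verified domains combined with a disabled proxy
// is the state that produced "dashboard says verified, site returns 404".
function startupWarnings(env: Env, verifiedDomainCount: number): string[] {
  const warnings: string[] = [];
  if (verifiedDomainCount > 0 && !isCustomDomainProxyEnabled(env)) {
    warnings.push(
      `${verifiedDomainCount} verified custom domain(s) but CUSTOM_DOMAIN_PROXY_ENABLED is not "true"`
    );
  }
  return warnings;
}
```

Called with an empty environment and two verified domains, `startupWarnings` returns a single warning; with the flag set to "true" it returns none.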

Before · Caddy era

HTTP/2 404 — middleware unwired

$ curl -sI https://lacoste.projectd.cc/
HTTP/2 404
via: 2.0 Caddy
via: 1.1 google
x-powered-by: Express
content-type: text/html

$ curl -s ... | head -3
<!DOCTYPE html>
<pre>Cannot GET /</pre>

Customer DNS pointed at cname.vibehost.host → Caddy GCE IPs → API pod → Express default not-found. The customer thought their site was up because the dashboard showed verified.

After · CF edge

HTTP/2 200 — real customer content

$ curl -sI https://lacoste.projectd.cc/
HTTP/2 200
server: cloudflare
cf-ray: 9f97af3cccc6e60e-SIN
strict-transport-security: max-age=31536000

$ openssl s_client ...
subject=CN = lacoste.projectd.cc
issuer=Google Trust Services, CN = WE1
notAfter=Aug 8 04:19:57 2026 GMT

CF edge terminates TLS with a cert auto-issued by Google CA. Dispatcher worker reads custom-domain:<host> from KV → resolves to the app's tenant subdomain → /authz/check AND-composes visibility / password / share-link gates → R2 serves customer content.
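The lookup-then-gate flow described above could be sketched as follows. This is a minimal illustration, not the dispatcher's actual code: the function names, the plain-object stand-in for Workers KV, the synchronous `authzCheck` callback standing in for /authz/check, and the 404 / 403 status choices are all assumptions. Only the `custom-domain:<host>` key shape and the AND-composition of the three gates come from this brief.

```typescript
// All names are hypothetical except the KV key prefix "custom-domain:".
type GateResults = { visibility: boolean; password: boolean; shareLink: boolean };

// A plain object stands in for Workers KV so the sketch runs anywhere.
type KvStore = Record<string, string | undefined>;

type Decision =
  | { status: 404 }                       // no mapping for this hostname (assumed status)
  | { status: 403 }                       // at least one gate refused (assumed status)
  | { status: 200; tenantHost: string };  // serve content for the tenant subdomain

// AND-composition: a request is allowed only if every gate allows it.
function allGatesPass(g: GateResults): boolean {
  return g.visibility && g.password && g.shareLink;
}

function dispatchCustomDomain(
  kv: KvStore,
  host: string,
  authzCheck: (tenantHost: string) => GateResults // stands in for /authz/check
): Decision {
  const tenantHost = kv[`custom-domain:${host.toLowerCase()}`];
  if (tenantHost === undefined) return { status: 404 };
  if (!allGatesPass(authzCheck(tenantHost))) return { status: 403 };
  return { status: 200, tenantHost };
}
```

Because the gates are AND-composed, a custom domain cannot widen access: a request that would be refused on the tenant subdomain is refused on the customer hostname too.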

[Real customer screenshot: https://lacoste.projectd.cc/ · 200 OK · LACOSTE Dcard Keyword Dashboard · CF edge · Google CA cert]
[Real customer screenshot: https://cpbl.ladykaren.org/ · 200 OK · 2026 CPBL 開幕戰觀察報告 (2026 CPBL opening-game observation report) · CF edge · Google CA cert]
"Three hours ago it was 404 Cannot GET / from a broken Caddy; now it is the real LACOSTE / CPBL content from CF + dispatcher worker."
— session log, 2026-05-10 12:48
· · · · ·

02 Fresh-bind E2E · what new prod users will experience

Most of the migration value comes from new domains, not the 2 backfilled ones. So I bound a fresh hostname end-to-end via the live API to prove the experience.

01
vibehost domain add cf-saas-brief cf-saas-test.vibehost.cc → API returns DNS instructions: 1 CNAME (apex → cname.vibehost.host) + 1 ownership TXT.
~150 ms
02
User adds the CNAME and the _vibehost.<host> TXT record at their DNS provider. This E2E used vibehost.cc (a testing zone); the flow is the same on any customer-owned zone.
~30 sec
03
vibehost domain verify cf-saas-brief cf-saas-test.vibehost.cc → API checks DNS, marks verifiedAt, calls cf.create(), writes KV (custom-domain mapping + ownership token), and tags the row cf_sync_state='ok'.
~1.2 sec
04
CF edge runs ACME HTTP-01 against the dispatcher worker, picks up the ownership challenge, issues cert. No customer interaction — fully automated. Dispatcher self-answers /.well-known/cf-custom-hostname-challenge/<id> from KV.
~1 min
05
https://cf-saas-test.vibehost.cc/ → HTTP/2 200 with real app content. Same edge / cert pipeline as vibehost.com. Caddy is entirely out of the picture.
live
[Real screenshot: https://cf-saas-test.vibehost.cc/ · 200 OK · fresh-bind E2E serving the cf-saas-brief app · 1 CNAME + 1 TXT + click verify → cert active in ~1 min]
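Steps 01 and 03 above can be sketched as two small functions. Everything here is an illustrative assumption: the type names, the idea that the TXT value is an opaque token, the synchronous dependencies, and the plain-object KV stand-in are not taken from the API's code. The sketch also leaves out the CNAME check and the ownership-token KV write, because this brief does not describe their shape.

```typescript
// Hypothetical shapes: the brief names the steps, not the API's types.
type DnsRecord = { type: "CNAME" | "TXT"; name: string; value: string };

type DomainRow = {
  host: string;
  verifiedAt: Date | null;
  cfSyncState: "pending" | "ok";
};

interface VerifyDeps {
  lookupTxt(name: string): string[];        // DNS TXT lookup (synchronous for brevity)
  cfCreate(host: string): void;             // stands in for cf.create()
  kv: Record<string, string | undefined>;   // stands in for Workers KV
}

// Step 01: the two records the user is asked to create.
function dnsInstructions(host: string, token: string): DnsRecord[] {
  return [
    { type: "CNAME", name: host, value: "cname.vibehost.host" },
    { type: "TXT", name: `_vibehost.${host}`, value: token },
  ];
}

// Step 03: check DNS, register with Cloudflare, write the KV mapping, mark the row.
function verifyDomain(
  row: DomainRow,
  tenantHost: string,
  token: string,
  deps: VerifyDeps,
  now: Date
): DomainRow {
  const seen = deps.lookupTxt(`_vibehost.${row.host}`);
  if (seen.indexOf(token) === -1) {
    return row; // TXT not visible yet: leave the row untouched
  }
  deps.cfCreate(row.host);
  deps.kv[`custom-domain:${row.host}`] = tenantHost;
  return { ...row, verifiedAt: now, cfSyncState: "ok" };
}
```

Returning the row unchanged when the TXT record is missing keeps verify safe to retry, which fits a flow where the user clicks verify while DNS is still propagating.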
· · · · ·

03 Architecture · before vs after

BEFORE
  customer DNS:  blog.example.com  CNAME → cname.vibehost.host
                                      │
  cname.vibehost.host  A 35.187.148.54 (Caddy GCE)
                      A 34.81.53.121  (Caddy GCE)
                                      │
                                      ▼
                          ┌─────────────────────┐
                          │ Caddy on-demand TLS │
                          │ (Let's Encrypt)     │
                          └──────────┬──────────┘
                                     │ X-Forwarded-Host
                                     ▼
                          ┌─────────────────────┐
                          │ apps/api  (GKE)     │
                          │ middleware          │ ← CUSTOM_DOMAIN_PROXY_ENABLED
                          │ never wired         │   defaulted false in prod
                          └──────────┬──────────┘
                                     ▼
                                 404 Cannot GET /


AFTER (shipped)
  customer DNS:  blog.example.com  CNAME → cname.vibehost.host  (unchanged)
                                      │
  cname.vibehost.host  CNAME → cf-saas-origin.vibehost.com proxied
                                      │
                                      ▼
                          ┌───────────────────────────┐
                          │ Cloudflare edge           │
                          │ TLS terminate (Google CA) │
                          │ CF for SaaS fallback      │
                          └─────────────┬─────────────┘
                                        │ per-host worker route
                                        ▼
                          ┌───────────────────────────┐
                          │ apps/dispatcher Worker    │
                          │ KV custom-domain:<host>   │
                          │ → tenant subdomain        │
                          │ /authz/check (3 gates)    │
                          └─────────────┬─────────────┘
                                        ▼
                                  R2 / SSR Worker
                                  200 — real customer content
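The two paths leave different fingerprints in response headers: `via: 2.0 Caddy` before the cutover, `server: cloudflare` after it, as in the curl output in section 01. A small classifier over raw `curl -sI` output, sketched below, is one way to check which path a hostname is on. The function name and the rule that Cloudflare wins when both markers appear are assumptions made for this illustration.

```typescript
type EdgePath = "caddy" | "cf-edge" | "unknown";

// Classifies raw response headers (for example the output of `curl -sI`).
// Assumed precedence: a Cloudflare server header wins over a Caddy via header.
function classifyEdgePath(rawHeaders: string): EdgePath {
  let sawCaddy = false;
  let sawCloudflare = false;
  for (const line of rawHeaders.split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx < 0) continue; // status line or blank line
    const name = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim().toLowerCase();
    if (name === "via" && value.indexOf("caddy") !== -1) sawCaddy = true;
    if (name === "server" && value === "cloudflare") sawCloudflare = true;
  }
  if (sawCloudflare) return "cf-edge";
  if (sawCaddy) return "caddy";
  return "unknown";
}
```

Run against each customer hostname after a cutover, a check like this would flag any domain whose DNS still resolves to the old Caddy IPs.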
· · · · ·

04 What shipped — PRs & ops

PR #306
feat: cloudflare for saas — dispatcher accepts custom domains (P1–P5)
7 phase commits · 7 review rounds · score 9.5 → merged · base for P6
PR #331
fix(cli): vibehost update fails on macOS — BSD grep regex
caught live during this session · 9.5/10 review · merged
issue #304
drizzle migrate:generate drifts from manually-numbered prod migrations
tracked · hand-wrote 0019_custom_domains_cf_columns.sql for this PR
issue #333
CF for SaaS service uses wrong zone (tenant zone instead of provider zone)
discovered by fresh-bind E2E · ~1h fix · workaround: per-host manual migration

P6 / P7 — manual ops on prod

· · · · ·

05 The session — abridged timeline