WiseChef platform (paying clients, names withheld)
Multi-tenant AI agent SaaS — live production
Multiple paying tenants, each on an isolated agent stack with dedicated infrastructure. Per-tenant VPS, per-tenant tunnel, per-tenant API key. Running in production since March 2026.
The situation
We needed to prove the Framework product could run more than one tenant without the stacks interfering with each other — the common SaaS failure mode where an outage for one customer cascades into all of them. The internal test was: can multiple paying tenants run simultaneously, each on their own infrastructure, with one person maintaining the fleet?
What we did
- Built an automated provisioning pipeline: checkout → cloud VPS creation → cloud-init bootstrap → edge tunnel + DNS records → per-tenant LLM key → welcome email
- Each tenant gets a dedicated subdomain, a dedicated container, a dedicated tunnel, and a dedicated LLM budget
- Wrote the fleet-management tooling so adding the next tenant is a single webhook away
- Enforced budget caps on per-tenant LLM keys so a runaway cost on one tenant cannot affect the others
- Set a hard tenant cap for this generation of the platform, with new capacity tiers planned before the cap is reached
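The provisioning steps above can be sketched as a single ordered pipeline. This is an illustrative sketch only: the function names, the `Tenant` record, and the stub bodies are hypothetical, not the actual WiseChef code.

```python
# Hypothetical sketch of the tenant provisioning pipeline.
# Every name here is illustrative; the real pipeline calls cloud,
# tunnel, and LLM-provider APIs at each step.

from dataclasses import dataclass, field

@dataclass
class Tenant:
    name: str
    subdomain: str = ""
    vps_id: str = ""
    tunnel_id: str = ""
    llm_key: str = ""
    log: list = field(default_factory=list)

# Stub step implementations (placeholders for real API calls).
def create_vps(t):      t.vps_id = f"vps-{t.name}"           # cloud VPS creation
def bootstrap(t):       pass                                  # cloud-init bootstrap
def create_tunnel(t):                                         # edge tunnel + DNS records
    t.subdomain = f"{t.name}.example.com"
    t.tunnel_id = f"tun-{t.name}"
def issue_llm_key(t):   t.llm_key = f"key-{t.name}"          # per-tenant LLM key, budget-capped
def send_welcome(t):    pass                                  # welcome email

def provision(tenant: Tenant) -> Tenant:
    """Run each provisioning step in order; any failure aborts the rest."""
    for step in (create_vps, bootstrap, create_tunnel, issue_llm_key, send_welcome):
        step(tenant)
        tenant.log.append(step.__name__)
    return tenant
```

Because each step runs only after the previous one succeeds, a failed step aborts the pipeline and a half-provisioned tenant never receives a welcome email.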
Timeline: the pipeline was built in two weeks; the first paying tenant was onboarded within 24 hours of going live.
What changed
The platform has been running paying tenants in production since March. Each tenant is isolated at the container, tunnel, and budget layer. When one has an incident it does not affect the others. The fleet is operated by one person.
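The budget-layer isolation amounts to giving every tenant its own independently capped spend counter. A minimal sketch of that idea, assuming an application-level tracker (class and names are hypothetical; the source describes caps enforced on the per-tenant LLM keys themselves):

```python
# Illustrative sketch of per-tenant budget caps, not production code.
# Each tenant has its own cap; exhausting one tenant's budget never
# draws down another's.

class BudgetExceeded(Exception):
    pass

class TenantBudget:
    def __init__(self, cap_usd: float):
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd: float) -> None:
        """Record spend, refusing any charge that would exceed the cap."""
        if self.spent_usd + cost_usd > self.cap_usd:
            raise BudgetExceeded(f"cap of ${self.cap_usd} would be exceeded")
        self.spent_usd += cost_usd

# One independent budget per tenant: a runaway on one key is contained.
budgets = {"tenant_a": TenantBudget(50.0), "tenant_b": TenantBudget(50.0)}
```

A runaway agent on `tenant_a` hits `BudgetExceeded` at its own cap while `tenant_b`'s budget remains untouched, which is the containment property described above.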
The architecture survived an enterprise on-prem engagement and concurrent SaaS tenants without structural changes. The same provisioning pipeline is now sold as the Framework product.
Relevant context
The architecture that runs this platform is what the Framework product installs for new customers. We dogfood what we sell. Infrastructure cost stays well inside the tenant subscription price at current scale.