The Compute Coffer
A funding pattern for LLM-fronted apps that extracts from no one
An LLM app costs the operator money on every call. The default playbook makes the visitor pay — directly through a subscription, indirectly through ads. The Compute Coffer is a third path: a small running pot, funded by past visitors and visible in real time, covers the next call. When the pot empties, the app falls back to a free tier and keeps working. The visitor pays nothing. The donor's contribution doesn't go to the operator. It pays for the next stranger's reading.
The pattern, in five parts
Per-call cost transparency. Every LLM call has a cost. The pattern surfaces it rather than hiding it inside a subscription or an ad impression. The visitor sees what the next call costs in the same way they see the price on a menu.
Public running balance. The coffer is a number. Anyone can read it, anytime, without an account. It goes up when someone donates, down when someone uses the app. The mechanism is not behind a dashboard.
Donations fund the next user, not the operator. Money entering the coffer pays the LLM provider directly. The operator never withdraws. A donor's contribution is a gift to the next stranger who shows up, not to the person who built the thing.
Graceful fallback to a free tier. When the coffer drops below threshold, or the daily cap is hit, the app switches engines mid-stride. The output the visitor gets is structurally the same; only the LLM voice rendering it back changes. Service degrades. It does not break.
Hard caps. Daily and monthly ceilings sit above the coffer as a runaway safety net. The pattern does not assume the operator can absorb unlimited overflow. The caps are protection, not targets.
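To make the "price on a menu" idea in the first part concrete, here is a minimal sketch of how a per-call cost might be computed and labeled. The type names, function names, and prices are all hypothetical placeholders, not any provider's actual rates and not taken from the Oracle's source. Rounding up to the next whole cent is also an assumption, chosen so the coffer is never under-charged.

```typescript
// Hypothetical prices in cents per million tokens (placeholders only).
interface Pricing {
  inputCentsPerMTok: number;
  outputCentsPerMTok: number;
}

// Token counts for a single call.
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Cost of one call in whole cents, rounded up (an assumed policy).
function callCostCents(usage: Usage, pricing: Pricing): number {
  const raw =
    (usage.inputTokens * pricing.inputCentsPerMTok +
      usage.outputTokens * pricing.outputCentsPerMTok) /
    1e6;
  return Math.ceil(raw);
}

// Format a cent amount as a menu-style price label.
function priceLabel(cents: number): string {
  return `$${(cents / 100).toFixed(2)}`;
}

// Example with made-up prices: 2,000 input tokens and 500 output tokens.
const examplePricing: Pricing = { inputCentsPerMTok: 300, outputCentsPerMTok: 1500 };
const exampleCost = callCostCents({ inputTokens: 2000, outputTokens: 500 }, examplePricing);
console.log(priceLabel(exampleCost));
```

Keeping every amount as an integer number of cents avoids floating-point drift once costs are subtracted from a running balance.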
What it isn't
Not a subscription — there is no commitment and no gating. Not pay-what-you-can — there is no transactional ask at the point of use. Not a tip jar — the money does not go to the person who built the thing. Not charity — the donor's gift is to the next stranger, specifically, not to a cause. Not freemium — the free tier is the fallback, not a degraded version designed to upsell.
Closest precedents
Pay-it-forward suspended coffee — the Italian caffè sospeso tradition. Same generosity-toward-strangers spirit. But: offline, per-purchase, no aggregation, no public ledger. You buy two coffees and trust the bar to keep one for whoever asks.
Open Collective — public ledger and transparent finances. But: the granularity is project-level. Donors fund a project's monthly bills, not specifically the next person's API call. Closer in transparency, further in immediacy.
Twitch tip feeds — real-time and visible. But: money goes to a person, not to the next viewer's experience. The thing the donor funds is not the thing the next viewer also gets.
What appears to be new is the combination of the first four parts: per-call cost transparency, a public running balance, donations that fund the next stranger's call, and a graceful free-tier fallback so the app never breaks. As far as I can tell, no one is doing this yet.
Why it might work
The intuition is that when the funding mechanism is opaque, donating feels like charity, and the visitor's relationship with the operator is one of obligation. When the mechanism is visible, donating feels like contribution, and the relationship is one of curiosity. People who understand a system are more likely to support it.
This is an empirical question, not a settled one. The Oracle is the first test. If the pattern holds, the coffer stays roughly stable across cycles of donation and use. If it doesn't, either the operator ends up paying the gap, or the coffer trends to zero and the app spends most of its time in free-tier mode. Neither failure mode is catastrophic. The app keeps working either way.
Reference implementation
The Oracle (qav2-oracle.netlify.app) implements this pattern. The stack:
Frontend. Static, served from Netlify. The status pill at the bottom of the chat shows which engine is active (Claude or Llama) and how much the coffer holds. Clicking the engine name jumps to a plain-language explanation of the model.
Worker. A Cloudflare Worker holds the LLM provider's API key as a secret, calls the LLM, and decrements the coffer after each call. Worker source is public on GitHub.
Coffer. A single integer in Cloudflare KV: the balance in cents. A /balance endpoint exposes it for the frontend to poll. Pre-call, the worker reads the balance and decides which engine to route to. A second public endpoint, /stats, surfaces aggregate usage: requests served this month, the Claude/Llama split, total spend. Requests served, transparently: the pool is finite and visible, and so is its use.
Donations. Currently routed through Buy Me a Coffee, with manual top-ups from the operator to the worker's KV balance. Until fiscal hosting goes live, the operator is technically in the chain of custody, and the pattern's integrity rests on a stated commitment (no draw of revenue) rather than structural separation. The designed destination is Open Source Collective: contributions would land in the project's collective balance and Anthropic invoices would be paid from there directly, making "the operator never sees the money" structurally true rather than aspirational. The OSC application is on hold pending demonstrable usage to clear their eligibility bar; the structural claim follows once OSC hosts.
Free fallback. Cloudflare Workers AI (Llama 3.1 70B) takes over when the coffer drops below 50 cents or the daily cap is hit. The Oracle's geometric diagnosis is computed locally in either case — only the LLM-voiced reflection layer changes engines.
Hard caps. $5 per day, $100 per month, set as protection against runaway cost. The Oracle has yet to approach either.
Verification. Before a browser can talk to the worker, it passes a brief Cloudflare Turnstile check. Most visitors never see anything; flagged visitors get a quick checkbox. The worker mints a short-lived signed token (HMAC-SHA256, 30 min) the browser sends on each call, kept in memory only. This is the entry point of the defense layer below. Without it the coffer is trivially DoS-able. Cloudflare may set a cf_clearance cookie during verification per their privacy policy; Oracle itself stores no personal data.
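The Coffer, Free fallback, and Hard caps items above can be read together as one routing step: read the state, pick an engine, and debit the coffer after a paid call. What follows is a rough sketch under stated assumptions, not the Oracle's actual worker code. The type and function names are hypothetical, the state is a plain object rather than a real Cloudflare KV binding, and treating the monthly cap as a fallback trigger (like the daily cap) is an assumption. The numeric limits are the ones stated above: 50 cents, $5 per day, $100 per month.

```typescript
type Engine = "claude" | "llama";

// Hypothetical shape for the values the worker would read before a call.
interface CofferState {
  balanceCents: number;    // the single integer the coffer stores
  spentTodayCents: number; // paid spend so far today
  spentMonthCents: number; // paid spend so far this month
}

interface Limits {
  fallbackThresholdCents: number;
  dailyCapCents: number;
  monthlyCapCents: number;
}

// Limits as described for the Oracle, expressed in cents.
const ORACLE_LIMITS: Limits = {
  fallbackThresholdCents: 50,
  dailyCapCents: 500,
  monthlyCapCents: 10000,
};

// Pre-call decision: fall back to the free engine when the coffer is low
// or a cap has been reached (monthly-cap fallback is an assumption).
function chooseEngine(state: CofferState, limits: Limits): Engine {
  if (state.balanceCents < limits.fallbackThresholdCents) return "llama";
  if (state.spentTodayCents >= limits.dailyCapCents) return "llama";
  if (state.spentMonthCents >= limits.monthlyCapCents) return "llama";
  return "claude";
}

// Post-call bookkeeping for a paid call; the balance never goes below zero.
function recordPaidCall(state: CofferState, costCents: number): CofferState {
  return {
    balanceCents: Math.max(0, state.balanceCents - costCents),
    spentTodayCents: state.spentTodayCents + costCents,
    spentMonthCents: state.spentMonthCents + costCents,
  };
}

// Example: a healthy coffer routes to the paid engine, then is debited.
let coffer: CofferState = { balanceCents: 320, spentTodayCents: 40, spentMonthCents: 900 };
const engine = chooseEngine(coffer, ORACLE_LIMITS);
if (engine === "claude") coffer = recordPaidCall(coffer, 2);
console.log(engine, coffer.balanceCents);
```

In a real worker the read and the debit would go through the KV store, which raises consistency questions (two concurrent calls reading the same balance) that this sketch does not address.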
Open questions
The public ledger. The Oracle currently shows a running balance but not transaction history. A real-time feed — who gave (if they choose to be named), when, how much — is the next feature. Platform decision is settled: Open Source Collective hosts the donation side because contributions land in the project's collective balance rather than the operator's bank, and OSC's public ledger is built-in. Application is on hold while the project accumulates the usage signals OSC's eligibility criteria look for; integration design is ready to ship once hosting approves.
Scale. The pattern hasn't been tested at volume. If a thousand visitors arrive in an hour, the pot drains; the free fallback handles it, but the UX of mid-session engine switches at scale is untested.
How to defend a coffer. A naive coffer is DoS-able for the cost of a single daily cap: one attacker drains the pot, and every honest visitor falls to the free fallback for the rest of the day. The reference defense layer (native rate limit + datacenter ASN block + Turnstile-gated signed sessions + per-session budget shares) is specified in THREAT_MODEL_AND_DEFENSE.md and travels with the pattern. All four layers are in production on the Oracle. The headline primitive is per-session shares: dividing the daily cap across signed sessions so a single attacker can only burn their own slice while other visitors keep Claude access.
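The per-session share idea can be illustrated with a small sketch. How the Oracle actually sizes each slice is specified in THREAT_MODEL_AND_DEFENSE.md and is not reproduced here. This example assumes the simplest possible rule, an equal split of the daily cap across a fixed expected number of sessions. The names SessionLedger, sessionShareCents, and the figure of 25 expected sessions are all hypothetical.

```typescript
// Equal split of the daily cap across an assumed number of sessions.
function sessionShareCents(dailyCapCents: number, expectedSessions: number): number {
  if (expectedSessions < 1) throw new RangeError("expectedSessions must be >= 1");
  return Math.floor(dailyCapCents / expectedSessions);
}

// In-memory stand-in for per-session spend tracking (hypothetical).
class SessionLedger {
  private shareCents: number;
  private spent = new Map<string, number>();

  constructor(shareCents: number) {
    this.shareCents = shareCents;
  }

  // Charge a paid call to a session if it still fits inside that
  // session's slice; otherwise refuse so the caller can fall back.
  tryCharge(sessionId: string, costCents: number): boolean {
    const used = this.spent.get(sessionId) ?? 0;
    if (used + costCents > this.shareCents) return false;
    this.spent.set(sessionId, used + costCents);
    return true;
  }

  spentBy(sessionId: string): number {
    return this.spent.get(sessionId) ?? 0;
  }
}

// Example: a $5 daily cap split across 25 sessions gives 20 cents each.
const ledger = new SessionLedger(sessionShareCents(500, 25));

// One session hammers the endpoint with 100 calls at 2 cents each.
let paidCalls = 0;
for (let i = 0; i < 100; i++) {
  if (ledger.tryCharge("attacker", 2)) paidCalls++;
}

// A different session still has its full slice available.
const visitorServed = ledger.tryCharge("visitor", 2);
console.log(paidCalls, ledger.spentBy("attacker"), visitorServed);
```

Under this rule the abusive session's paid spend is bounded by its own slice, and every refused call would be routed to the free fallback rather than rejected outright.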
Adoption. The pattern currently has exactly one instance. Whether other developers find it worth adopting, whether it becomes a thing other people build with rather than a thing one app does, is the open question that matters most.
Adopting it
A future phase will package the worker, the KV schema, and the status pill as a drop-in scaffold for any LLM-fronted project. Until that exists, the Oracle's worker is the reference: read the source, fork the structure, swap the LLM call for whichever provider you use, point the donate link at your own platform. The five-part recipe above is the contract.
The phrase that named the model: visible, finite, donated forward. The thing it makes possible: an app that doesn't ask the visitor to be a customer.