Where Vibe Coding Hits Its Limits: An Honest Look at Production Readiness
A measured framing to start: AI app builders are excellent at a specific class of work and unsuitable for another. Most content treats them as universally capable. Most resulting prototypes hit a wall when a real customer arrives.
This article walks through the four failure modes that consistently appear when vibe-coded apps cross from internal experiment to external product, with cross-references to OWASP’s Top 10 for LLM Applications.
The honest scope of these tools
From the Industry Rockstar teaching session that prompted this series, the host stated the position plainly: “We’re not replacing enterprise scaled backends. […] We’re not releasing tools and go across to millions and millions of users through this. And we also have to consider the security protocols on them because they’re not overly secure.”
That is a useful baseline. Lovable, Bolt, Cursor and the rest are exceptional at internal tools, MVPs, dashboards, marketing pages and lightweight CRUD apps. They are not currently a replacement for a senior engineering team building software that needs to serve hundreds of thousands of paying users with regulatory exposure. The opportunity is the long tail of small tools that previously never got built at all.
Failure mode 1: secrets in client-side code
The most common production failure: API keys end up in the React bundle that ships to the browser. The AI writes a call to OpenAI directly from the front-end component because that is the shortest path to a working demo. Anyone with browser dev tools can extract the key from the network tab within seconds.
OWASP’s 2025 LLM Top 10 names this risk directly under LLM02 Sensitive Information Disclosure, which covers PII, security credentials and confidential business data leaking through LLM-adjacent code paths. The published consequences include unauthorised data access, privacy violations and intellectual property exposure.
The fix is structural: secrets live on the server, the front end calls your server, the server calls the third party. AI builders increasingly default to this pattern when connected to Supabase or similar backends, but they still require the operator to enforce the rule. Audit every external API call in the generated code before you ship.
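A minimal sketch of that structure, assuming a Python server sitting between the browser and the third party. The endpoint URL, the `THIRD_PARTY_API_KEY` variable name, the `prompt`/`text` fields and both function names are hypothetical placeholders, not any vendor's real API. The point it illustrates is that the key is read from the server environment and never appears in what goes back to the browser.

```python
import json
import os
import urllib.request

# Placeholder endpoint for illustration; substitute the real third-party URL.
API_URL = "https://api.example.com/v1/complete"


def build_upstream_request(prompt: str) -> urllib.request.Request:
    """Build the server-to-third-party request. The key comes from the
    server environment, so it is never bundled into front-end code."""
    api_key = os.environ["THIRD_PARTY_API_KEY"]  # hypothetical variable name
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def _send_over_network(request: urllib.request.Request) -> dict:
    """Default transport: perform the HTTP call and decode the JSON reply."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def handle_client_request(payload: dict, send=_send_over_network) -> dict:
    """What your own endpoint does when the browser calls it.

    The browser sends only the prompt. The reply is reduced to the one
    field the front end needs, so nothing else from upstream leaks through.
    """
    prompt = str(payload.get("prompt", ""))
    upstream = send(build_upstream_request(prompt))
    return {"text": upstream.get("text", "")}
```

When auditing generated code, the thing to look for is the inverse of this: any front-end file that contains a key string or calls the third-party URL directly.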
Failure mode 2: missing or weak authentication
AI builders happily generate a useful internal dashboard that has no login at all. The URL alone grants access. This is fine when the dashboard is internal-only on a private network. It becomes a data breach the moment a teammate forwards the URL to a contractor’s personal Gmail.
Two practical rules:
- Any app that touches customer data needs a login from day one. “Internal only” is a posture, not a security control.
- If your AI builder offers a one-click auth integration with Supabase, Clerk or similar, enable it before you add the first record. Retrofitting auth into a working app is twice the work of starting with it.
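The first rule can be sketched as a deny-by-default guard around every handler that touches data. This is an illustration, not a replacement for a provider integration: `verify_session` stands in for whatever check Supabase, Clerk or a similar service performs, and the request shape (a dict with a `session_token` field) is an assumption made for the example.

```python
from typing import Callable, Optional


def require_login(verify_session: Callable[[str], Optional[str]]):
    """Wrap a handler so it only runs for a verified session.

    `verify_session` is a hypothetical hook: it takes a session token and
    returns a user id, or None when the token is missing or invalid.
    """
    def decorator(handler):
        def wrapped(request: dict) -> dict:
            token = request.get("session_token")
            user_id = verify_session(token) if token else None
            if user_id is None:
                # Knowing the URL is not enough; no session means no data.
                return {"status": 401, "body": "login required"}
            return handler(request, user_id)
        return wrapped
    return decorator
```

Applying the guard to every data-bearing route from the first record onward is what avoids the retrofit described above.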
Failure mode 3: unbounded cost spikes
An AI builder generates code that calls a language model. The code works. Then a user submits a 50-page PDF, and a single call processes the entire document as input and generates 10,000 tokens of output. Multiply by every form submission and your monthly bill jumps an order of magnitude.
The fix is operational. Set a spending limit on every API key or project. OpenAI, Anthropic and Google each offer budget or quota controls, but confirm whether yours is a hard stop or only an alert. Cap input length before it reaches the model. Log token usage per user so you can identify abuse before the bill arrives.
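The input cap and the per-user log can be sketched in a few lines. The limit value and all names here are assumptions chosen for illustration. Character count is used as a rough stand-in for tokens, and the token counts passed to `record_usage` are assumed to come from the usage figures your model provider reports for each call.

```python
from collections import defaultdict

# Assumed limit for illustration; tune it to your own use case and budget.
MAX_INPUT_CHARS = 20_000

# Running token totals per user, kept in memory for the sketch.
# A real deployment would persist this somewhere durable.
usage_by_user = defaultdict(int)


class InputTooLarge(ValueError):
    """Raised when a submission exceeds the input cap."""


def check_input(text: str) -> str:
    """Reject oversized input before any model call is made."""
    if len(text) > MAX_INPUT_CHARS:
        raise InputTooLarge(
            f"input is {len(text)} characters; limit is {MAX_INPUT_CHARS}"
        )
    return text


def record_usage(user_id: str, input_tokens: int, output_tokens: int) -> int:
    """Add one call's token counts to the user's total and return the total."""
    usage_by_user[user_id] += input_tokens + output_tokens
    return usage_by_user[user_id]


def heavy_users(threshold: int) -> list:
    """Return the ids of users whose total usage exceeds the threshold."""
    return sorted(u for u, total in usage_by_user.items() if total > threshold)
```

Rejecting the request before the model call is the important design choice: a cap applied after the call has already been billed.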
Failure mode 4: schema drift
Each iteration with the AI may quietly change your database schema. The AI adds a column, renames a field, switches a type from integer to string. The app continues to work because the AI also updates the calling code. Six iterations later, you have no migration history and no record of how the schema evolved.
For internal tools this is fine. For anything that persists customer data across releases, schema drift becomes a recovery nightmare the day something breaks. The fix: enable Supabase migrations or your platform’s equivalent, commit each schema change as a migration file, and treat the database as a code asset rather than something the AI manages invisibly.
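To make the idea of a migration history concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not how Supabase implements migrations; the table name `schema_migrations`, the migration ids and the example `customers` schema are all assumptions for illustration. Each schema change is an ordered, named step, and the database itself records which steps have already run.

```python
import sqlite3

# Ordered schema changes. In a real project each entry would be its own
# committed file; the ids and SQL below are illustrative only.
MIGRATIONS = [
    ("0001_create_customers",
     "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    ("0002_add_customer_email",
     "ALTER TABLE customers ADD COLUMN email TEXT"),
]


def apply_migrations(conn: sqlite3.Connection, migrations=MIGRATIONS) -> list:
    """Apply any migrations not yet recorded and return their ids in order."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        "id TEXT PRIMARY KEY, applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    applied = {row[0] for row in conn.execute("SELECT id FROM schema_migrations")}
    newly_applied = []
    for migration_id, sql in migrations:
        if migration_id in applied:
            continue  # already run against this database; skip it
        conn.execute(sql)
        # Record the step so the history survives the next AI iteration.
        conn.execute(
            "INSERT INTO schema_migrations (id) VALUES (?)", (migration_id,)
        )
        conn.commit()
        newly_applied.append(migration_id)
    return newly_applied
```

Running `apply_migrations` a second time against the same database applies nothing, and the `schema_migrations` table answers the question the drift scenario leaves open: which changes were made, and in what order.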
When to call a developer
A useful threshold list based on what we see across client work:
- The app handles payments, PII, or regulated data (health, finance, minors).
- You expect more than 1,000 concurrent users in the first six months.
- The app integrates with three or more external systems on the back end.
- Downtime would cost real money or reputation.
- You need an audit trail of changes for compliance.
Any one of these calls for a developer review before launch. Two or more, and the AI builder should be your prototyping tool, not your production stack.
The right pattern for the moment
The pattern that has emerged across teams shipping real revenue with this stack:
- Build the prototype in Lovable or Bolt within a week.
- Run it as an internal tool for thirty days. Collect real usage data.
- If the tool earns its keep, hand the codebase to a developer for a security review, an auth pass, and a migration to production-grade infrastructure.
- Keep iterating on features in the AI builder. Keep the developer involved for releases that touch sensitive paths.
This is the same shape as the host’s framing in the original session: AI builders are a way to do fast experimentation. The experimentation produces signal about what is worth investing engineering hours in. The error is treating the experiment as the product.
The closing measure
Vibe coding has not made software cheap. It has made the first version of a small piece of software nearly free. That distinction is where the value sits: in building things that previously did not get built, not in replacing things that previously needed proper engineering.
Article co-written by Viktoriia Didur and Elis. OWASP LLM Top 10 references verified 2026-05-12 at owasp.org. This article concludes a 4-part series on vibe coding for entrepreneurs.