Back to The News

Where Vibe Coding Hits Its Limits: An Honest Look at Production Readiness

AI app builders like Lovable, Bolt and Cursor are excellent for internal tools and MVPs. They are not yet a replacement for senior engineering on production software. Four failure modes consistently break vibe-coded apps when they cross to external users: leaking secrets, missing auth, cost spikes, schema drift. Know the line between prototype and product before you ship.

The honest scope of these tools

A measured framing to start: AI app builders are excellent at a specific class of work and unsuitable for another. Most content treats them as universally capable. Most resulting prototypes hit a wall when a real customer arrives.

We’re not replacing enterprise scaled backends. […] We’re not releasing tools and go across to millions and millions of users through this. And we also have to consider the security protocols on them because they’re not overly secure.

That is a useful baseline. Lovable, Bolt, Cursor and the rest are exceptional at internal tools, MVPs, internal dashboards, marketing pages, lightweight CRUD apps. They are not currently a replacement for a senior engineering team building software that needs to serve hundreds of thousands of paying users with regulatory exposure. The opportunity is the long tail of small tools that previously never got built at all.

Failure mode 1: secrets in client-side code

The most common production failure: API keys end up in the React bundle that ships to the browser. The AI writes a call to OpenAI directly from the front-end component because that is the shortest path to a working demo. Anyone with browser dev tools can extract the key from the network tab within seconds.

OWASP’s 2025 LLM Top 10 names this risk directly under LLM02 Sensitive Information Disclosure, which covers PII, security credentials and confidential business data leaking through LLM-adjacent code paths. The published consequences include unauthorised data access, privacy violations and intellectual property exposure.

The fix is structural: secrets live on the server, the front end calls your server, the server calls the third party. AI builders increasingly default to this pattern when connected to Supabase or similar backends, but they still require the operator to enforce the rule. Audit every external API call in the generated code before you ship.

Failure mode 2: missing or weak authentication

AI builders happily generate a useful internal dashboard that has no login at all. The URL alone grants access. This is fine when the dashboard is internal-only on a private network. It becomes a data breach the moment a teammate forwards the URL to a contractor’s personal Gmail.

Two practical rules:

  • Any app that touches customer data needs a login from day one. Internal only is a posture, not a security control.
  • If your AI builder offers a one-click auth integration with Supabase, Clerk or similar, enable it before you add the first record. Retrofitting auth into a working app is twice the work of starting with it.

Failure mode 3: unbounded cost spikes

An AI builder generates code that calls a language model. The code works. A user submits a 50-page PDF and the call generates 10,000 tokens of output. Multiply by every form submission and your monthly bill jumps an order of magnitude.

Operational fix: set spending caps on every API key (OpenAI, Anthropic, Google all offer hard limits). Cap input length before it reaches the model. Log token usage per user so you can identify abuse before the bill arrives.

Failure mode 4: schema drift

Each iteration with the AI may quietly change your database schema. The AI adds a column, renames a field, switches a type from integer to string. The app continues to work because the AI also updates the calling code. Six iterations later, you have no migration history and no record of how the schema evolved.

For internal tools this is fine. For anything that persists customer data across releases, schema drift becomes a recovery nightmare the day something breaks. Enable Supabase migrations or your platform’s equivalent, commit each schema change as a migration file, treat the database as a code asset rather than something the AI manages invisibly.

When to call a developer

A useful threshold list based on what we see across client work:

  • The app handles payments, PII, or regulated data (health, finance, minors).
  • You expect more than 1,000 concurrent users in the first six months.
  • The app integrates with three or more external systems on the back end.
  • Downtime would cost real money or reputation.
  • You need an audit trail of changes for compliance.

Any one of these calls for a developer review before launch. Two or more, and the AI builder should be your prototyping tool, not your production stack.

The right pattern for the moment

The pattern that has emerged across teams shipping real revenue with this stack:

  1. Build the prototype in Lovable or Bolt within a week.
  2. Run it as an internal tool for thirty days. Collect real usage data.
  3. If the tool earns its keep, hand the codebase to a developer for a security review, an auth pass, and a migration to production-grade infrastructure.
  4. Keep iterating on features in the AI builder. Keep the developer involved for releases that touch sensitive paths.

AI builders are a way to do fast experimentation. The experimentation produces signal about what is worth investing engineering hours in. The mistake is mistaking the experiment for the product.

The closing measure

Vibe coding has not made software cheap. It has made the first version of a small piece of software nearly free. That distinction is where the value sits. Building things that previously did not get built. Not replacing things that previously needed proper engineering.

Questions People Ask About This

"is Lovable safe for production"
"vibe coding security risks OWASP"
"AI app builders for enterprise"
"when to hire developer over AI builder"
"API key leak in client-side code fix"
"Supabase migrations vibe coding"

Frequently Asked Questions

Can a vibe-coded app actually handle payments?

Technically yes via Stripe integrations Lovable surfaces natively. But you should run a developer-led security review before going live. Payment flows are a regulated path; mistakes here are expensive. Treat the AI-built version as a prototype to validate the UX, then have a developer ship the production version.

How do I audit a Lovable app for security issues?

Three steps: (1) grep the generated code for any API key strings that match patterns sk-, pk-, sb_, or hardcoded URLs to api.openai.com / api.anthropic.com / supabase.co. None should be in client-side files. (2) Confirm Supabase Row Level Security is enabled on every table. (3) Test that protected routes redirect to login when accessed without a session cookie.

What is the smallest red flag that means I need a developer?

When the AI writes a database migration that drops a column you cannot afford to lose, and you do not notice for three days. If you cannot review your own schema changes confidently, bring in a developer for the data layer at minimum.

Is OWASP LLM Top 10 relevant to small businesses?

Yes. The risks scale down to small apps with small data. LLM02 (Sensitive Information Disclosure) and LLM05 (Improper Output Handling) apply equally whether you serve 50 users or 50,000. A leaked customer email list is a breach regardless of headcount.

Should I use Supabase RLS or write my own auth checks?

Use Supabase Row Level Security. Writing your own checks at the application layer is fragile and easy to bypass via direct database queries. RLS enforces the rules at the database level so even a buggy app cannot leak data across users.

What about Cursor or Claude Code: same limits apply?

Slightly different. Cursor and Claude Code are developer tools, not no-code platforms. They produce code that runs on your infrastructure, so the security depends on your existing setup. The risks shift from the builder to the developer’s discipline.

How do I avoid cost spikes with a public-facing AI feature?

Three layers: (1) hard cap on input length before it reaches the model. Reject anything over a known threshold. (2) per-user rate limiting on the endpoint. (3) hard spend limit on your API key with the provider. Each layer catches a different abuse pattern.

Vibe coding did not make software cheap.

It made the first version of a small piece of software nearly free. Know the line before you ship.

Vimaxus

We help SMBs ship vibe-coded apps that survive their first real customer. Security review, auth hardening, schema migration, production hand-off. The unglamorous work that turns a prototype into a product.

Get a production-readiness review for your vibe-coded app

Written by Viktoriia Didur and Elis

...