You shipped your AI-built SaaS to production. Users are signing up. Revenue is starting to flow. Then, at 2 AM on a Tuesday, everything crashes.
Your Slack is on fire. Users are angry. Your Stripe dashboard shows failed payments. And you're frantically SSHing into a server you barely understand, trying to find logs in a directory structure Cursor generated three weeks ago.
This isn't a hypothetical. It's the reality for 80% of AI-built SaaS apps in their first 30 days of production.
I've seen this pattern repeat with dozens of indie hackers: Cursor/Claude writes beautiful code that works perfectly on localhost. You deploy with confidence. Then production tears it apart.
The problem isn't the AI. It's that AI writes code optimized for "working," not for "surviving production at scale."
This post breaks down the 7 deadly sins that kill AI-built SaaS apps in production—and the exact fixes that keep you alive.
The 7 Deadly Sins of AI SaaS in Production
AI code generators optimize for localhost happiness. Production demands paranoia, redundancy, and graceful failure. Here are the gaps that kill apps:
Sin #1: Missing Error Boundaries
// ❌ What AI generates (works on localhost)
export default function Dashboard() {
  const { data } = useQuery('/api/metrics')
  return <MetricsChart data={data} />
}

// ✅ What production needs (survives real users)
export default function Dashboard() {
  const { data, error, isLoading, refetch } = useQuery('/api/metrics')

  if (error) {
    return <ErrorState message="Failed to load metrics" retry={() => refetch()} />
  }
  if (isLoading) {
    return <SkeletonLoader />
  }
  if (!data || data.length === 0) {
    return <EmptyState />
  }
  return <MetricsChart data={data} />
}
The rule: Every user-facing component needs 4 states: loading, error, empty, success.
Sin #2: Unvalidated User Input
AI assumes inputs are valid. Production users will send you: empty strings, SQL injection attempts, 10MB JSON payloads, emoji-only fields, null values wrapped in strings ("null"), and Unicode edge cases that break your database.
Fix: Validate everything at the API boundary with Zod/Joi/Yup before touching your database:
import { z } from 'zod'

const createUserSchema = z.object({
  email: z.string().email().max(255),
  name: z.string().min(2).max(100).trim(),
  password: z.string().min(8).max(128),
  metadata: z.record(z.unknown()).optional()
})

export async function POST(req: Request) {
  try {
    const body = await req.json()
    const validated = createUserSchema.parse(body)
    // NOW it's safe to use validated data
    const user = await db.users.create(validated)
    return Response.json(user)
  } catch (error) {
    if (error instanceof z.ZodError) {
      return Response.json({ error: error.errors }, { status: 400 })
    }
    return Response.json({ error: 'Internal error' }, { status: 500 })
  }
}
Zod catches 90% of production input bugs before they hit your database.
Sin #3: Infinite Loops in Async Code
This is the #1 killer of AI-generated backends. A useEffect with missing dependencies, a webhook that retries itself, or a background job that never exits.
Real example from an indie hacker: Cursor generated a Stripe webhook handler that re-processed failed payments by calling itself recursively. Within 2 hours, it made 47,000 API calls and hit the rate limit, blocking all legitimate payments.
The Fix: Bounded Retries and Circuit Breakers
// ❌ Dangerous: infinite retry
async function processPayment(userId: string) {
  try {
    await stripe.charges.create({ ... })
  } catch (error) {
    await processPayment(userId) // INFINITE LOOP
  }
}

// ✅ Safe: exponential backoff with max retries
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms))

async function processPayment(
  userId: string,
  retryCount = 0,
  maxRetries = 3
) {
  try {
    await stripe.charges.create({ ... })
  } catch (error) {
    if (retryCount >= maxRetries) {
      await logFailure(userId, error)
      await sendAdminAlert(error)
      throw error // Stop retrying
    }
    const delay = Math.pow(2, retryCount) * 1000 // 1s, 2s, 4s
    await sleep(delay)
    return processPayment(userId, retryCount + 1, maxRetries)
  }
}
Always add retry limits and exponential backoff to async operations.
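Bounded retries stop a single payment from looping forever. A circuit breaker goes one step further: after repeated failures it fails fast instead of hammering a struggling downstream service. Here's a minimal illustrative sketch (the `CircuitBreaker` class, thresholds, and state names are our own, not from any specific library):

```typescript
type BreakerState = 'closed' | 'open' | 'half-open'

class CircuitBreaker {
  private state: BreakerState = 'closed'
  private failures = 0
  private openedAt = 0

  constructor(
    private failureThreshold = 5,   // consecutive failures before opening
    private resetTimeoutMs = 30_000 // how long to fail fast before retrying
  ) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'open') {
      if (Date.now() - this.openedAt >= this.resetTimeoutMs) {
        this.state = 'half-open' // cooldown over: let one trial request through
      } else {
        throw new Error('Circuit open: failing fast')
      }
    }
    try {
      const result = await fn()
      this.failures = 0
      this.state = 'closed' // success resets the breaker
      return result
    } catch (err) {
      this.failures++
      if (this.failures >= this.failureThreshold) {
        this.state = 'open'
        this.openedAt = Date.now()
      }
      throw err
    }
  }

  get currentState() { return this.state }
}
```

Wrap flaky external calls (Stripe, OpenAI, third-party APIs) in `breaker.call(...)` so a dependency outage degrades gracefully instead of cascading.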
Sin #4: Database Connection Leaks
AI generates database queries without connection pooling or proper cleanup. After 100 requests, you run out of connections and your app hangs.
Example: Proper Connection Pooling
// ❌ AI-generated (leaks connections)
import { Pool } from 'pg'

export async function getUser(id: string) {
  const pool = new Pool() // NEW POOL EVERY REQUEST
  const result = await pool.query('SELECT * FROM users WHERE id = $1', [id])
  return result.rows[0]
}

// ✅ Production-ready (reuses connections)
import { Pool } from 'pg'

const pool = new Pool({
  max: 20, // Connection limit
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
})

export async function getUser(id: string) {
  const client = await pool.connect()
  try {
    const result = await client.query('SELECT * FROM users WHERE id = $1', [id])
    return result.rows[0]
  } finally {
    client.release() // CRITICAL: return the client to the pool
  }
}

// Graceful shutdown
process.on('SIGTERM', async () => {
  await pool.end()
  process.exit(0)
})
One pool per app, not per request. Always release clients back to the pool.
Sin #5: Missing Rate Limiting
AI doesn't add rate limiting by default. One malicious user (or bug) can drain your OpenAI credits, overwhelm your database, or trigger a $10,000 Vercel bill overnight.
Minimal Rate Limiting (Works Everywhere)
import rateLimit from 'express-rate-limit'

// Per-IP rate limiting
const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // 100 requests per window
  message: 'Too many requests, please try again later',
  standardHeaders: true,
  legacyHeaders: false,
})

app.use('/api/', limiter)

// Per-user rate limiting for expensive operations
const expensiveLimiter = rateLimit({
  windowMs: 60 * 60 * 1000, // 1 hour
  max: 10, // 10 AI generations per hour
  keyGenerator: (req) => req.user?.id || req.ip,
})

app.post('/api/generate', expensiveLimiter, async (req, res) => {
  // OpenAI call here
})
Add rate limiting before you need it. You'll thank yourself at 3 AM.
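If you're not on Express (serverless handlers, a different framework), the core idea is small enough to sketch yourself. Here's an illustrative in-memory fixed-window counter; the `isAllowed` helper is our own name, and a single-process `Map` only works for one instance (use a shared store like Redis or Upstash when you scale out):

```typescript
// Illustrative fixed-window rate limiter. The `now` parameter
// exists so the logic is deterministic and testable.
const windows = new Map<string, { count: number; resetAt: number }>()

function isAllowed(
  key: string,        // e.g. IP address or user ID
  limit: number,      // max requests per window
  windowMs: number,   // window length in milliseconds
  now = Date.now()
): boolean {
  const entry = windows.get(key)
  if (!entry || now >= entry.resetAt) {
    // Start a fresh window for this key
    windows.set(key, { count: 1, resetAt: now + windowMs })
    return true
  }
  if (entry.count >= limit) return false // over the limit: reject
  entry.count++
  return true
}
```

In a handler, a rejected request should return HTTP 429 so well-behaved clients back off.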
Sin #6: Hardcoded Secrets in Code
Cursor autocompletes API keys directly into your code. Then you commit them to GitHub. Within minutes, bots find them and drain your accounts.
Real story: An indie hacker accidentally committed their OpenAI key. Within 6 hours, bots racked up $2,400 in API charges running crypto mining prompts.
The Fix: Environment Variables
// ❌ NEVER do this
const openai = new OpenAI({
  apiKey: 'sk-proj-abc123...' // EXPOSED IN GIT HISTORY
})

// ✅ Always use environment variables
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
})

// Validate secrets at startup
const requiredEnvVars = [
  'OPENAI_API_KEY',
  'DATABASE_URL',
  'STRIPE_SECRET_KEY'
]

for (const varName of requiredEnvVars) {
  if (!process.env[varName]) {
    throw new Error(`Missing required env var: ${varName}`)
  }
}
If you accidentally commit a secret, rotate it immediately. GitHub history is forever.
Sin #7: No Monitoring or Alerts
Your app crashes at 2 AM. Users are angry on Twitter. You wake up at 9 AM and discover you've been down for 7 hours.
Without monitoring, you're flying blind. You need to know when things break before your users do.
Minimal Monitoring Setup (10 minutes)
// 1. Add Sentry for error tracking
import * as Sentry from "@sentry/node"
Sentry.init({
dsn: process.env.SENTRY_DSN,
environment: process.env.NODE_ENV,
tracesSampleRate: 0.1, // 10% of requests
})
// 2. Add health check endpoint
app.get('/health', async (req, res) => {
try {
await db.query('SELECT 1') // Database check
await redis.ping() // Cache check
res.json({ status: 'healthy' })
} catch (error) {
res.status(503).json({ status: 'unhealthy', error: error.message })
}
})
// 3. Set up UptimeRobot (free) to ping /health every 5 minutes
// If it fails, you get an email/SMS immediately
This simple setup catches 95% of production issues before users complain.
The 15-Minute Production Survival Audit
Run through this checklist right now. If you answer "no" to any of these, you're at risk:
Survival Checklist
- Every user-facing component has error boundaries and loading states
- All API endpoints validate input with a schema library (Zod/Joi)
- Database uses connection pooling with max limits
- Rate limiting enabled on all public endpoints
- No API keys or secrets in code (all in environment variables)
- Error monitoring configured (Sentry/Rollbar/LogRocket)
- Health check endpoint exists and is monitored
- Async operations have retry limits and timeouts
- Background jobs have dead letter queues
- You can roll back a deployment in under 5 minutes
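The timeout half of "retry limits and timeouts" is a few lines with Promise.race. An illustrative sketch (the `withTimeout` name is our own; note it rejects your await but does not cancel the underlying work, so pair it with AbortController for APIs that support cancellation):

```typescript
// Illustrative timeout wrapper: rejects if the operation takes too long.
function withTimeout<T>(
  promise: Promise<T>,
  ms: number,
  label = 'operation'
): Promise<T> {
  let timer!: ReturnType<typeof setTimeout>
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms
    )
  })
  // Whichever settles first wins; always clear the timer afterwards
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer))
}
```

Usage: `await withTimeout(stripe.charges.create({...}), 10_000, 'stripe charge')` turns a silently hung request into a loggable, retryable error.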
When Things Break: The Debug Workflow
Production debugging is different from localhost debugging. Here's the workflow that actually works:
Step 1: Triage (2 minutes)
- Check monitoring dashboard: Is this affecting all users or just one?
- Check health endpoint: Are dependencies (DB, Redis, APIs) responding?
- Check recent deployments: Did you ship something in the last hour?
- Check error logs: What's the most frequent error message?
Step 2: Stop the Bleeding (5 minutes)
- If it's a bad deployment: Roll back immediately (don't try to fix forward)
- If it's a dependency outage: Enable fallback mode or circuit breaker
- If it's one user: Rate limit or block that user temporarily
- If it's database overload: Scale up or enable read replicas
Step 3: Root Cause Analysis (After Stability)
Only after the app is stable, dig into the root cause. Check Sentry traces, database slow query logs, and network timing. Most production bugs are timing issues, race conditions, or resource exhaustion—things that never happen on localhost.
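A lost-update race is easy to reproduce once two requests overlap. This illustrative sketch (the `deductNaive` function and in-memory `balance` are hypothetical stand-ins for a real read-modify-write against a database) shows why the bug never appears with one user:

```typescript
// Two concurrent "handlers" read a value, await a simulated round trip,
// then write back -- one update is silently lost.
let balance = 100
const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms))

async function deductNaive(amount: number) {
  const current = balance    // 1. read
  await sleep(10)            // 2. simulated DB/API round trip
  balance = current - amount // 3. write back a now-stale value
}
```

Running `deductNaive(30)` twice concurrently leaves the balance at 70 instead of 40: both calls read 100 before either writes. The fix is an atomic update (`UPDATE ... SET balance = balance - $1`) or row-level locking, not application-side arithmetic.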
Real Recovery Stories
Three indie hackers who survived production disasters:
Story 1: The Infinite Loop
Sarah's AI chat app had a useEffect that fetched chat history on every render. With 2 users, it was fine. With 50 concurrent users, it made 10,000 API calls in 3 minutes and crashed the database.
The fix: Added a ref to track if the fetch was already in progress, preventing duplicate calls. Also added request deduplication at the API layer.
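The deduplication half of that fix is framework-agnostic: concurrent callers asking for the same key share one pending promise instead of firing duplicate requests. An illustrative sketch (the `dedupe` helper and `inFlight` map are our own names):

```typescript
// In-flight request deduplication.
const inFlight = new Map<string, Promise<unknown>>()

function dedupe<T>(key: string, fn: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key)
  if (existing) return existing as Promise<T> // reuse the pending request

  const promise = fn().finally(() => inFlight.delete(key)) // clear once settled
  inFlight.set(key, promise)
  return promise
}
```

Libraries like React Query and SWR do this for you on the client; the same pattern at the API layer protects your database from render-loop storms.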
Story 2: The Memory Leak
James's AI image generator cached generated images in-memory. After 200 generations, the Node process ran out of RAM and crashed.
The fix: Switched from in-memory cache to Redis with TTL (time-to-live). Old images automatically expire after 24 hours, freeing up memory.
Story 3: The API Key Leak
Mike committed his OpenAI key to GitHub. Bots found it and racked up $1,800 in charges before he noticed.
The fix: Rotated the key immediately, added git-secrets pre-commit hook to prevent future leaks, and set up billing alerts in OpenAI dashboard (alert at $50, hard cap at $100).
Your Production Survival Kit
These tools catch 90% of production issues before they become disasters:
Essential Tools
- Error Tracking: Sentry (free tier) - Catches exceptions and performance issues
- Uptime Monitoring: UptimeRobot (free) - Alerts you within 5 minutes of downtime
- Rate Limiting: express-rate-limit or upstash/ratelimit - Prevents abuse
- Input Validation: Zod - Type-safe validation at API boundaries
- Database Pooling: pg-pool or Prisma - Prevents connection leaks
- Secrets Management: Doppler or Infisical - Never commit secrets again
- Log Aggregation: Axiom or Logtail (free tier) - Searchable logs across all servers
Summary
AI code generators are incredible for building fast. But they optimize for "works on my machine" not "survives real users at 2 AM."
The good news: You don't need a DevOps team to production-harden your AI-built SaaS. You need 3 things:
- Error boundaries everywhere (handle loading, error, empty, success states)
- Input validation at API boundaries (never trust user data)
- Basic monitoring (know when things break before your users do)
These three practices alone will prevent 80% of production disasters.
The other 20%? You'll learn those the hard way. But with error monitoring, you'll catch them fast, fix them once, and move on.
Your app doesn't need to be perfect. It needs to fail gracefully, recover automatically, and alert you when manual intervention is required.
That's the difference between a side project and a real SaaS business.
Drowning in production fires? Get a free async vibe audit—we'll watch your Loom walkthrough, review your repo, and send back a personalized 8–12 minute video + PDF showing exactly where your app will break under real load.