AI is genuinely transformational for non-engineers who need to pull data from databases. PMs, marketers, ops people, and analysts who used to wait hours or days for an engineer to write a query can now self-serve. The catch: AI-generated SQL is wrong often enough that blindly trusting it can produce confidently incorrect numbers. Bad numbers driving bad decisions is worse than no data.
What AI does well for SQL
- Translating natural language to SQL for clear, single-table queries
- Suggesting JOIN structures across well-named tables
- Explaining what an existing query does line-by-line
- Adapting a query you found to your schema
- Generating syntactically valid SQL across dialects (PostgreSQL, MySQL, BigQuery, Snowflake, etc.)
What AI does poorly
- Knowing your specific schema's quirks (which is the active customers table — `customers` with `deleted_at IS NULL`, or the `active_customers` view?)
- Distinguishing similar-looking columns (`created_at` vs `created_at_local`, `revenue` vs `gross_revenue`)
- Catching that two columns can be joined but shouldn't be (a foreign key by data type, not by actual relationship)
- Knowing your soft-delete conventions, partitioning rules, and business filter requirements
- Performance optimization that depends on indexes and data volumes you haven't told it about
A workflow that works
Step 1: feed AI your schema. Don't ask SQL questions without context. Give Claude or GPT your schema (table names, column names, types, brief description of what each table represents). For databases with hundreds of tables, give the relevant subset for the question.
Useful trick: ask AI to summarize the schema first. "Based on this DDL, summarize the data model in 2 paragraphs and call out anything that looks unusual." If AI's summary is wrong, your follow-up queries will be wrong.
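The context doesn't need to be formal DDL. A short annotated schema snippet works well — something like this (every table and column name here is a made-up placeholder; substitute your own):

```
orders: one row per order
  id (int, PK), customer_id (int, FK -> customers.id),
  total_cents (int), status (text: 'paid' | 'refunded'),
  created_at (timestamp, UTC)
customers: one row per customer
  id (int, PK), name (text), deleted_at (timestamp, NULL if active)
```

The one-line "what each table represents" notes matter as much as the column list — they're exactly the context AI can't infer from names alone.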
Step 2: ask in natural language with context. "From the orders table joined to customers, show me the top 10 customers by total spend in Q3 2025, excluding refunded orders. Format the spend as USD currency."
More context = more accurate query. Mention: time periods, business filters ("exclude test accounts"), output format requirements, edge cases you care about.
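Here's the shape of query a prompt like that should produce, demonstrated end-to-end with `sqlite3` so you can run it yourself. The schema and data are made up for illustration; your table and column names will differ.

```python
import sqlite3

# Hypothetical two-table schema with sample data, purely for illustration.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    total_cents INTEGER,
    status TEXT,            -- 'paid' or 'refunded'
    created_at TEXT         -- ISO 8601 date, UTC
);
INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
INSERT INTO orders VALUES
    (10, 1, 50000, 'paid',     '2025-08-01'),
    (11, 1, 20000, 'refunded', '2025-08-15'),  -- excluded: refunded
    (12, 2, 30000, 'paid',     '2025-09-10'),
    (13, 3, 10000, 'paid',     '2025-06-30');  -- excluded: outside Q3
""")

# A well-contextualized prompt should yield explicit date bounds,
# the business filter, and a readable output column:
rows = con.execute("""
    SELECT c.name, SUM(o.total_cents) / 100.0 AS spend_usd
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE o.status != 'refunded'
      AND o.created_at >= '2025-07-01' AND o.created_at < '2025-10-01'
    GROUP BY c.name
    ORDER BY spend_usd DESC
    LIMIT 10
""").fetchall()
print(rows)  # → [('Acme', 500.0), ('Globex', 300.0)]
```

Note how each piece of context in the prompt maps to a clause: the time period to the date bounds, "excluding refunded" to the status filter, "USD currency" to the cents-to-dollars conversion.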
Step 3: read the query before running it. If you don't understand it line-by-line, ask AI to explain. "Walk me through this query in plain English, what each clause does and why." If something doesn't match what you intended, fix the prompt and regenerate.
Step 4: run on a sample first. Add `LIMIT 100` (or `WHERE date > '2025-12-01'`) to test the query on a small slice. If results look reasonable, expand to the full date range.
Step 5: sanity-check the output. Does the count fit your expectation of business volume? Do top-N results have names you recognize? Are there any NULLs or zeros in unexpected places? Trust your business intuition; if a number feels too high or too low, dig into why.
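Steps 4 and 5 can be sketched in a few lines. This is a toy `sqlite3` example with an invented `orders` table; the point is the habit, not the specific checks:

```python
import sqlite3

# Toy table: 500 orders, some with a NULL customer_id planted deliberately.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, total_cents INTEGER)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(i, i % 7 or None, 1000 + i) for i in range(1, 501)])

# Step 4: run the query on a small slice first.
sample = con.execute("SELECT * FROM orders LIMIT 100").fetchall()

# Step 5: cheap sanity checks before trusting the full result.
total_rows = con.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
null_customers = con.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_id IS NULL").fetchone()[0]

print(total_rows)      # does this match your sense of business volume?
print(null_customers)  # NULLs where you expected none are a red flag
```

Here the NULL check would surface 71 orphaned orders — exactly the kind of surprise worth chasing down before the number goes into a deck.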
Common AI SQL failures
- Wrong table for the concept — AI sees a `users` and a `customers` table, picks `users`, but `customers` is what you wanted
- Missing soft-delete filter — AI's query doesn't include `WHERE deleted_at IS NULL` and counts archived records
- Time zone confusion — AI uses UTC when your business uses local time; off-by-one errors on date boundaries
- Wrong aggregation — `COUNT(*)` vs `COUNT(DISTINCT customer_id)` matters, and AI sometimes picks the wrong one
- Inflated joins — AI joins tables in a way that multiplies rows; sums become 3× what they should be
- Hardcoded test data — AI writes the right structure but uses example values from its training data; you run it and get nothing
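The inflated-join failure is worth seeing concretely, because the wrong query looks completely reasonable. A minimal `sqlite3` sketch with an invented orders/shipments schema:

```python
import sqlite3

# Hypothetical schema: one order, three shipment rows pointing at it.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER);
CREATE TABLE shipments (id INTEGER PRIMARY KEY, order_id INTEGER);
INSERT INTO orders VALUES (1, 10000);
INSERT INTO shipments VALUES (1, 1), (2, 1), (3, 1);
""")

# Inflated join: each shipment row repeats the order total,
# so the sum comes out 3x too high.
bad = con.execute("""
    SELECT SUM(o.total_cents) FROM orders o
    JOIN shipments s ON s.order_id = o.id
""").fetchone()[0]

# Fix: aggregate without (or before) the one-to-many join.
good = con.execute("SELECT SUM(total_cents) FROM orders").fetchone()[0]

print(bad, good)  # → 30000 10000
```

This is exactly why a sanity check against business intuition catches what reading the SQL alone often misses — the query is syntactically fine, and the number is plausibly shaped.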
The fix for all of these: read the query, run on a sample, sanity-check.
Building a per-database knowledge file
For any database you query repeatedly, build a markdown file with:
- Schema overview
- Important business rules ("customers with status='trial' should be excluded from revenue queries")
- Common pitfalls you've hit ("the events table has duplicates from reruns; use DISTINCT on event_id")
- Standard filters ("production data only: env='prod'")
- Saved queries you trust
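A sketch of such a file — every name, rule, and date below is a placeholder to adapt:

```markdown
# analytics_db — knowledge file (example)

## Business rules
- Exclude customers with status='trial' from revenue queries
- Production data only: env='prod'

## Pitfalls
- events has duplicates from reruns; use DISTINCT on event_id
- created_at is UTC; reporting uses local time

## Trusted queries
- monthly_active_customers.sql (reviewed by an engineer)
```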
Paste this file as context every time you ask AI for SQL on that database. The accuracy improvement is dramatic.
Tools that help
- Cursor / VS Code with AI — write SQL with autocomplete that knows your schema
- Hex / Mode / Metabase — analytics platforms with AI-assisted SQL built in
- Supabase / Neon SQL editor — built-in AI for the database you're already using
- dbt + AI — for production analytics, AI helps draft transformations you'll review and version
For casual one-off queries, ChatGPT or Claude with schema context is enough. For repeated work, an integrated tool that knows your schema is much faster.
When NOT to AI-generate SQL
Production-affecting queries. Anything that writes (INSERT, UPDATE, DELETE) or that affects production performance. Have an engineer review.
Compliance-relevant data. Queries pulling PII, financial data subject to audit, or anything regulated. The query and its output need an audit trail; AI generation doesn't fit cleanly.
Performance-sensitive queries. If the query will run on millions of rows or in a hot path, it needs index awareness AI doesn't have. Have an engineer optimize.
Critical reporting numbers. Numbers shown to executives or used for paying commissions need to be triple-checked. AI-generated queries are fine for first draft; verification is non-negotiable.
The data quality trap
The danger isn't getting wrong SQL once — that's caught quickly. The danger is getting subtly wrong SQL repeatedly that produces plausible numbers nobody questions. "Customer churn was 4.3% last quarter" feels precise. If the query miscounted, the number is fiction, but it'll be repeated for months.
Develop the habit: any number that drives a decision gets verified by either running a different angle on the same question or asking someone who knows the data well. "Does this look right?" to an analyst friend takes 30 seconds and prevents months of wrong-direction decisions.
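"Running a different angle" can be as simple as computing the same number from two independent tables and checking they agree. A toy `sqlite3` sketch (schema and data invented for illustration):

```python
import sqlite3

# Cross-check: total revenue via order totals vs. via line items.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER);
CREATE TABLE order_items (order_id INTEGER, amount_cents INTEGER);
INSERT INTO orders VALUES (1, 3000), (2, 4500);
INSERT INTO order_items VALUES (1, 1000), (1, 2000), (2, 4500);
""")

via_orders = con.execute("SELECT SUM(total_cents) FROM orders").fetchone()[0]
via_items = con.execute("SELECT SUM(amount_cents) FROM order_items").fetchone()[0]

# If the two angles disagree, dig in before anyone repeats the number.
assert via_orders == via_items, f"mismatch: {via_orders} vs {via_items}"
print(via_orders)  # → 7500
```

When the two angles disagree, you've usually found one of the failure modes above — an inflated join, a missing filter, or the wrong table.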
Decision tree
- One-off curious query, low stakes: AI with schema context, sample-test
- Recurring report you'll rely on: AI for first draft, engineer review for production
- Customer-facing or compliance data: engineer-written, AI for explanation only
- Learning SQL: AI as tutor + write your own queries
Next steps
- Build the schema knowledge file for your most-queried database
- Always read SQL before running it; ask AI to explain anything you don't get
- Bookmark good queries; AI rewrites are easier when you have working examples
- For team work, share schema knowledge files; everyone benefits from one person's investment