

Translate a blog into 3 languages with LLM + spot-check workflow

Don't run blog posts through Google Translate. Here's the workflow that produces translations readers don't notice are translations.

If you write a blog and want to reach readers in other languages, you have three options: hire human translators (expensive, slow, hard at scale), use machine translation directly (fast, cheap, but obviously translated), or use an LLM with a careful workflow that produces translations readers can't tell are translations. The third option is what works in 2026.

The basic recipe

For each post:

  1. Pass the source post through an LLM with a tuned prompt (described below)
  2. Get back a draft translation
  3. Spot-check 2-3 paragraphs by reading the draft
  4. Run a final pass to fix the issues you found
  5. Publish

Total time per post: 15-30 minutes. Cost: under $0.50 per post in API fees for ~2000-word articles. Quality: indistinguishable from a competent human translator on 90% of content.
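The cost figure can be sanity-checked with rough arithmetic. The sketch below is a back-of-the-envelope estimate, not a quote: the tokens-per-word ratio, the output expansion factor, the prompt overhead, and both per-million-token prices are placeholder assumptions to replace with your provider's current numbers. It also treats the fix-up pass as costing the same as the first pass, which is a simplification.

```python
def estimate_cost_usd(
    words,
    languages=3,
    passes=2,                    # draft pass + fix-up pass (assumed equal cost)
    tokens_per_word=1.3,         # assumption: rough heuristic for English text
    output_ratio=1.5,            # assumption: target text uses more tokens than source
    prompt_overhead_tokens=200,  # assumption: instructions + glossary
    input_price_per_mtok=3.0,    # placeholder price in USD per million input tokens
    output_price_per_mtok=15.0,  # placeholder price in USD per million output tokens
):
    """Rough API cost of translating one post into several languages."""
    input_tokens = words * tokens_per_word + prompt_overhead_tokens
    output_tokens = words * tokens_per_word * output_ratio
    per_call = (
        input_tokens * input_price_per_mtok
        + output_tokens * output_price_per_mtok
    ) / 1_000_000
    return per_call * languages * passes


if __name__ == "__main__":
    print(f"${estimate_cost_usd(2000):.2f} for a 2000-word post")
```

Under these assumed numbers a 2000-word post translated into three languages with two passes each stays below the $0.50 mark; with different prices or a longer prompt the result will shift.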

The translation prompt

The prompt structure that consistently produces native-feeling output:

You are translating an article from [source language] to [target language]
for a [audience description] audience.

Guidelines:
- Tone: [match source — casual / professional / opinionated / etc]
- Keep these terms in [original language]: [glossary]
- Don't translate code blocks or technical syntax
- Don't add explanations for terms that would be obvious to the audience
- Match the source paragraph structure
- The audience reads native-level [target language]; do not over-explain
- Output only the translation, no commentary

Source:
[paste article]

The glossary is critical. For Chinese tech writing, you usually want to keep things like "prompt," "agent," "RAG," "context window," "token," model names (Claude, GPT, Gemini), and product names (Cursor, Lovable) in English. Adding "keep these in English" explicitly prevents over-localization.
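If you translate regularly, it helps to fill the template from code so the glossary and tone never get forgotten. This is a minimal sketch: the function name `build_prompt` and its parameters are made up for illustration, and the template text simply mirrors the prompt above.

```python
PROMPT_TEMPLATE = """\
You are translating an article from {source_lang} to {target_lang}
for a {audience} audience.

Guidelines:
- Tone: {tone}
- Keep these terms in {source_lang}: {glossary}
- Don't translate code blocks or technical syntax
- Don't add explanations for terms that would be obvious to the audience
- Match the source paragraph structure
- The audience reads native-level {target_lang}; do not over-explain
- Output only the translation, no commentary

Source:
{article}"""


def build_prompt(article, source_lang, target_lang, audience, tone, glossary):
    """Fill the translation prompt; `glossary` is a list of terms to keep as-is."""
    return PROMPT_TEMPLATE.format(
        source_lang=source_lang,
        target_lang=target_lang,
        audience=audience,
        tone=tone,
        glossary=", ".join(glossary),
        article=article,
    )


if __name__ == "__main__":
    print(build_prompt(
        article="Your prompt is the product.",
        source_lang="English",
        target_lang="Simplified Chinese",
        audience="developer",
        tone="casual, opinionated",
        glossary=["prompt", "agent", "RAG", "context window", "token"],
    ))
```

The resulting string is what you send to whichever model you pick; the API call itself is left out because it differs per provider.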

Picking the right model

Claude Sonnet 4.5 is the default. It produces the most natural output for most language pairs and respects "keep these terms" instructions reliably.

GPT-5 is a close second and slightly cheaper for high volume.

Gemini 2.5 Pro shines specifically on Chinese ↔ English; the multilingual training is dense.

DeepSeek V3 / Qwen 2.5 are the strongest open-source choice and produce great Chinese output. Use these if cost matters or you're translating into Chinese.

DeepL has been losing this market — its translations are stylistically uniform and don't respect glossary or audience context the way LLMs do. Still useful as a backup or sanity check.

What to spot-check

Don't read the entire translation. You'll burn out and stop noticing issues. Instead:

  • Read the title and intro paragraph carefully. First impressions matter most for retention.
  • Read every paragraph that contains a quote, a number, or a name. These are where translation errors hide.
  • Skim section headers. A bad header (overly literal translation) signals trouble below.
  • Read the last paragraph. Endings often degrade as the LLM "runs out of attention."
  • Search for any glossary terms. Verify they were preserved in the right language.

Everything else, trust unless something looks wrong as you scroll.
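The checklist above can be partly automated by picking out the paragraphs worth reading. The sketch below is one possible heuristic, not a complete checker: it assumes paragraphs are separated by blank lines, treats the first two paragraphs as title and intro, and detects names only from a list you supply (the `names` parameter is an assumption; there is no name recognition here).

```python
import re

QUOTE_CHARS = '"“”「」«»'


def paragraphs_to_check(text, names=()):
    """Return (index, paragraph) pairs that deserve a careful read."""
    paras = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    picked = set()
    if paras:
        picked.add(0)               # title
        picked.add(len(paras) - 1)  # ending, where quality tends to degrade
        if len(paras) > 1:
            picked.add(1)           # intro paragraph
    for i, para in enumerate(paras):
        has_number = re.search(r"\d", para) is not None
        has_quote = any(q in para for q in QUOTE_CHARS)
        has_name = any(name in para for name in names)
        if has_number or has_quote or has_name:
            picked.add(i)
    return [(i, paras[i]) for i in sorted(picked)]
```

Pass your glossary terms and product names as `names` so the glossary check and the name check happen in the same read-through.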

Common failure modes

Over-localization. The LLM tries to be helpful by replacing English brand names with localized versions, or replacing American examples with local ones. Counter with: "Don't culturally adapt — translate, don't reframe."

Tone drift. A casual post becomes formal. Counter by including the source tone explicitly: "The original is conversational, opinionated, and uses contractions. Match this."

Numbers and dates getting reformatted. Often desirable (US → European date format) but sometimes wrong. Be specific in the prompt about which format you want.
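A quick way to catch reformatted numbers is to compare the numeric tokens in the source and the translation. The sketch below is a crude check under stated assumptions: it only sees ASCII-style digit groups with `.` or `,` separators, it ignores order (so a swapped day and month in a date will not be flagged), and numbers spelled out as words are invisible to it.

```python
import re
from collections import Counter

NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")


def number_mismatches(source, translation):
    """List numeric tokens that appear in one text but not the other."""
    src = Counter(NUMBER_RE.findall(source))
    dst = Counter(NUMBER_RE.findall(translation))
    return {
        "missing": sorted((src - dst).elements()),  # in source, not in translation
        "added": sorted((dst - src).elements()),    # in translation, not in source
    }
```

A non-empty result is not necessarily an error (a thousands separator may have been localized on purpose), but every entry is a paragraph worth reading.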

Glossary leaks. "Prompt" gets translated to "提示" once or twice in a long article. Search-and-fix afterward.
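The search step is easy to script if you keep a list of the translated forms you do not want. In this sketch the `leak_map` structure (glossary term mapped to its unwanted translations) is an assumption for illustration. Hits are candidates for review rather than certain errors, since a word like 提示 can also appear legitimately in its everyday sense.

```python
def find_glossary_leaks(translation, leak_map):
    """Report unwanted translated forms of glossary terms and how often they occur."""
    leaks = {}
    for term, unwanted_forms in leak_map.items():
        hits = [
            (form, translation.count(form))
            for form in unwanted_forms
            if form in translation
        ]
        if hits:
            leaks[term] = hits
    return leaks


if __name__ == "__main__":
    text = "写好 prompt 很重要。一个好的提示可以节省 token。"
    print(find_glossary_leaks(text, {"prompt": ["提示"], "token": ["令牌"]}))
```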

Awkward calques. English idioms translated literally. The big ones in tech: "out of the box," "low-hanging fruit," "move the needle." If you use these, replace before translation or note them in glossary.
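Scanning the source for known idioms before translation takes a few lines. This is a minimal sketch: the idiom list holds only the three examples named above, and matching is exact apart from letter case, so inflected variants such as "moved the needle" are not caught unless you add them.

```python
import re

IDIOMS = ["out of the box", "low-hanging fruit", "move the needle"]


def find_idioms(source, idioms=IDIOMS):
    """Return (idiom, character offset) pairs in order of appearance."""
    found = []
    for idiom in idioms:
        for match in re.finditer(re.escape(idiom), source, flags=re.IGNORECASE):
            found.append((idiom, match.start()))
    return sorted(found, key=lambda pair: pair[1])
```

Anything it finds can be rewritten in plain language in the source, or added to the glossary with a note on how it should be rendered.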

Multi-language workflow

For 3+ languages, batch:

  1. Translate from source (English) to each target language separately. Don't translate from a translation.
  2. Use the same model for consistency unless another model is dramatically better for one language (Chinese might go through Gemini even if your other languages go through Claude).
  3. Maintain a per-language glossary file. They diverge over time.
  4. Track issues per language. zh-TW and zh-CN have different conventions; what works in one fails in the other.
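The batching steps above can be sketched as a small loop. The `translate` argument is a hypothetical callable that wraps whichever model API you use; it is not a real library function. The point of the sketch is the shape: every target is translated from the same source text, and each language gets its own glossary.

```python
def translate_all(source, targets, glossaries, translate):
    """Translate `source` into each target language independently.

    `glossaries` maps a language code to its list of keep-as-is terms.
    `translate(source, target_lang, glossary)` is supplied by the caller.
    """
    results = {}
    for lang in targets:
        glossary = glossaries.get(lang, [])
        # Always pass the original source, never another translation.
        results[lang] = translate(source, lang, glossary)
    return results


if __name__ == "__main__":
    # Stand-in for a real model call, for demonstration only.
    def fake_translate(source, lang, glossary):
        return f"[{lang}, {len(glossary)} glossary terms] {source}"

    print(translate_all(
        "Hello, readers.",
        ["es", "zh-CN", "zh-TW"],
        {"zh-CN": ["prompt", "token"], "zh-TW": ["prompt", "token"]},
        fake_translate,
    ))
```

To route one language to a different model, pass a `translate` function that dispatches on the language code.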

For Chinese specifically, run zh-TW and zh-CN as separate translations from English source, not as character conversion. The vocabulary differs (软件/軟體, 视频/影片, 默认/預設). LLMs handle this if you tell them which variant.
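A small vocabulary check can flag output that drifted into the other variant. The sketch below uses only the three word pairs mentioned above, so it is illustrative rather than thorough, and a hit is a prompt for review, not proof of an error (影片, for instance, is also used in mainland writing in the sense of "film").

```python
# (zh-CN form, zh-TW form) pairs taken from the examples in this article.
VARIANT_PAIRS = [("软件", "軟體"), ("视频", "影片"), ("默认", "預設")]


def wrong_variant_terms(text, variant):
    """Return terms in `text` that belong to the other Chinese variant."""
    if variant == "zh-CN":
        unwanted = [tw for _, tw in VARIANT_PAIRS]
    elif variant == "zh-TW":
        unwanted = [cn for cn, _ in VARIANT_PAIRS]
    else:
        raise ValueError(f"unsupported variant: {variant}")
    return [term for term in unwanted if term in text]
```

Extending `VARIANT_PAIRS` from your per-language glossary files keeps this check in step with the terms your own posts actually use.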

When NOT to LLM-translate

Marketing copy that's brand-defining. Hero headlines, taglines, ad copy where every word matters. Pay a transcreation specialist or a native copywriter.

Legal text. Terms of service, privacy policies, contracts. The cost of a mistranslation is real legal liability.

Poetry, fiction, or anything where rhythm matters. LLMs translate meaning competently but rhythm and rhyme almost never. Hire a literary translator.

Audio content directly translated to audio. Translating a transcript is fine, and so is running TTS on a translation, but the natural rhythm of speech differs across languages, so a script translated sentence by sentence tends to sound off when read aloud. Re-script for the target-language audio.

A measurable improvement loop

If you publish translations regularly, build feedback:

  • Survey readers in target language. "Did this read like it was originally written in your language?"
  • Track engagement metrics by language. If translations have 50% lower time-on-page than the source, the translation quality is likely hurting you.
  • Get a native reviewer (paid or community) to flag issues quarterly. They'll find the mistakes the LLM keeps repeating.
  • Update your glossary and prompt based on learnings.

Most solo blogs and small teams skip this. It's still worth doing once a quarter.
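The engagement check is simple arithmetic once you export the numbers. In this sketch the input is assumed to be average time-on-page in seconds per language code, taken from whatever analytics tool you use. The 50% threshold comes from the rule of thumb above, and the sample figures in the usage example are invented.

```python
def flag_weak_languages(time_on_page, source_lang="en", max_drop=0.5):
    """Return languages whose time-on-page dropped by `max_drop` or more
    relative to the source language, with the size of the drop."""
    baseline = time_on_page[source_lang]
    if baseline <= 0:
        raise ValueError("source-language time-on-page must be positive")
    flagged = {}
    for lang, seconds in time_on_page.items():
        if lang == source_lang:
            continue
        drop = 1 - seconds / baseline
        if drop >= max_drop:
            flagged[lang] = round(drop, 2)
    return flagged


if __name__ == "__main__":
    # Made-up numbers for illustration.
    print(flag_weak_languages({"en": 200, "zh-CN": 90, "es": 180}))
```

Lower time-on-page can have other causes (traffic source, audience fit), so treat a flagged language as a reason to ask a native reviewer, not as a verdict.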

Decision tree

  • Personal blog, occasional posts, low stakes: direct LLM translate, light spot-check
  • Professional blog, regular posts: LLM + glossary + spot-check workflow
  • High-stakes brand content: LLM as draft + native human editor
  • Legal / contractual: certified human translator, no LLM
  • Poetry / literary: human translator, no LLM

Next steps

  • Build a per-language glossary file you reuse across posts
  • Pick one model and stick with it for a quarter; consistency matters
  • Read about prompt engineering for translation specifically
  • Get one native reader to spot-check your output once a month

Last updated: 2026-04-29
