Jose Cotes | Engineer, Consultant, Podcaster & Creator

Why Translation Memory Still Matters in 2026

If you work in a localization company—or you're just localization‑curious—you've definitely heard people talk about translation memory (TM). It's one of those terms everyone throws around in project kickoffs and quotes:

"We'll leverage your existing TM."
"Your fuzzy match rate is excellent."
"We can reduce costs thanks to TM reuse."

But what does that actually mean in practice today, when machine translation (MT) and AI are everywhere? Is translation memory still worth caring about?

Short answer: yes—but it needs to be managed intentionally.

In this post, we'll walk through:

What translation memory really is (and what it isn't)
How TM works inside a modern CAT tool
How TM, MT, and glossaries work together
The metrics that actually matter (cost, leverage, savings)
Practical guidelines to keep your TM clean and useful


What Translation Memory Really Is

At its core, translation memory is a database of aligned source and target segments (usually sentences or short phrases) that your team has already translated.

Each entry typically stores:

Source segment
Target segment
Language pair (e.g. en-US → de-DE)
Metadata (who translated it, when, status, domain, client, etc.)
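
To make the shape of an entry concrete, here is a minimal sketch of one TM record in Python. The class name `TMEntry`, the field names, and the status values are illustrative assumptions, not the schema of any particular CAT tool or exchange format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TMEntry:
    """One translation unit: an aligned source/target pair plus metadata."""
    source: str
    target: str
    source_lang: str            # e.g. "en-US"
    target_lang: str            # e.g. "de-DE"
    translator: str = "unknown"
    status: str = "draft"       # hypothetical values: "draft", "reviewed", "approved"
    domain: str = ""
    client: str = ""
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


# Example record (all values are made up for illustration).
entry = TMEntry(
    source="Save your changes.",
    target="Speichern Sie Ihre Änderungen.",
    source_lang="en-US",
    target_lang="de-DE",
    translator="linguist_01",
    status="approved",
    domain="ui",
    client="acme",
)
```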

When your translators or linguists work in a CAT (computer-assisted translation) tool, that tool constantly checks whether the current segment has an exact or similar match in the TM. If it does, the tool suggests it, saving time and ensuring consistency.

Translation memory is not:

A full terminology database (that's your glossary/term base)
A full style guide
A replacement for MT engines
A replacement for human linguists

Instead, it's a structured memory of past decisions that should make translation work:

Faster
More consistent
Cheaper (per usable word)

How TM Works Inside a CAT Tool

Let's zoom in on what happens inside a typical CAT tool when you translate a new file.


Segmentation

First, your CAT tool segments the source text:

It splits the content by sentence boundaries, punctuation, and custom rules.
It avoids splitting inside abbreviations, numbers, or code (ideally).

Why this matters: Bad segmentation = bad TM matches. If segments are inconsistent, you'll never fully benefit from TM leverage.
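
As a rough illustration of rule-based segmentation, the sketch below splits on sentence-ending punctuation followed by whitespace, but skips a break when the preceding word is a known abbreviation. The `ABBREVIATIONS` set is a hypothetical stand-in; real CAT tools use much richer, per-language rule sets.

```python
import re

# Hypothetical abbreviation list, for illustration only.
ABBREVIATIONS = {"e.g.", "i.e.", "Dr.", "Mr.", "Mrs.", "vs.", "etc."}

# Candidate break: '.', '!' or '?' followed by whitespace.
_BREAK = re.compile(r"(?<=[.!?])\s+")


def segment(text: str) -> list[str]:
    """Split text into sentence-like segments, skipping known abbreviations."""
    segments: list[str] = []
    start = 0
    for match in _BREAK.finditer(text):
        candidate = text[start:match.start()].strip()
        words = candidate.split()
        if words and words[-1] in ABBREVIATIONS:
            continue  # not a real sentence boundary, keep scanning
        if candidate:
            segments.append(candidate)
        start = match.end()
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments
```

Because a decimal such as 3.5 has no whitespace after the dot, it is never treated as a break. If two tools disagree on rules like these, the same sentence ends up stored as different segments and exact matches are lost.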

Matching Against the TM

For each source segment, the CAT tool runs a similarity check against all segments in the TM:

100% match: the source text is identical to a stored segment
101% / context match: identical text and the same immediate context (e.g. the same neighboring segments)
High fuzzy match: e.g. 95%, 85%, 75% similar
No useful match: nothing similar enough is found

These percentages typically come from string similarity algorithms (Levenshtein edit distance and related methods), and the exact scoring varies from tool to tool, but you don't need to know the math to use them effectively. What matters is:

You define thresholds (e.g. "show matches ≥ 70%")
The tool ranks suggestions by similarity
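
For the curious, here is a minimal sketch of how a match percentage can be derived from Levenshtein distance, with a threshold and ranking on top. It works on characters and normalizes by the longer string, which is an assumption made for simplicity; commercial tools use their own scoring and usually account for words, tags, and placeholders.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Similarity as a percentage, normalized by the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(a, b) / longest)


def find_matches(segment: str, tm: dict[str, str], threshold: float = 70.0):
    """Return (score, source, target) tuples at or above threshold, best first."""
    scored = [(similarity(segment, src), src, tgt) for src, tgt in tm.items()]
    hits = [hit for hit in scored if hit[0] >= threshold]
    return sorted(hits, key=lambda hit: hit[0], reverse=True)
```

Comparing a segment against every TM entry like this is fine for a demo but slow for large memories, which is one reason production tools index their TMs.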

Presenting Suggestions to the Linguist

The CAT tool shows the linguist:

The source segment
One or more TM suggestions with match percentages
Sometimes an MT suggestion (e.g. from a neural MT engine)
Glossary hits and QA warnings

The linguist then:

Accepts and lightly edits a good TM match
Cherry-picks parts of a fuzzy match
Or ignores them and works from MT or from scratch


Storing the Final Translation Back to TM

After the linguist confirms a segment, the CAT tool can:

Save the new source–target pair into the TM
Overwrite an older entry (depending on your configuration)
Store it in a client-specific or project-specific TM

This is how your memory grows over time and becomes more and more valuable.
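
The write-back step can be sketched as a small function with an overwrite switch. The function name `confirm_segment` and the plain dictionary used as a TM are assumptions for illustration; real tools expose this behavior through project settings rather than code.

```python
def confirm_segment(tm: dict[str, str], source: str, target: str,
                    overwrite: bool = True) -> bool:
    """Write a confirmed pair to the TM. Returns True if the TM changed."""
    existing = tm.get(source)
    if existing == target:
        return False   # identical entry already stored
    if existing is not None and not overwrite:
        return False   # configuration says: keep the older translation
    tm[source] = target
    return True


# Usage: one dictionary per client keeps memories separate.
client_tm: dict[str, str] = {}
confirm_segment(client_tm, "Save", "Speichern")
```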

How TM, MT, and Glossaries Work Together

In 2026, most professional workflows are hybrid:

TM as the first layer
- Check for exact and fuzzy matches from past work.
Glossary enforcement
- Highlight mandatory terms, brand names, and forbidden terms.
MT as a fallback
- For segments with no or low-quality TM matches, call an MT engine.
Human review layer
- Linguists edit TM matches and MT output, enforcing style and terminology.

This layered approach gives you:

Consistency (thanks to TM + glossary)
Speed (thanks to MT)
Quality (thanks to human review and QA)
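
The layering described in this section can be sketched in a few lines. Everything here is a simplified assumption: `difflib.SequenceMatcher` from the Python standard library stands in for a CAT tool's match scoring, the 75% threshold is arbitrary, and `mt_engine` is a hypothetical callable rather than a real MT API. The human review layer is what happens to the returned suggestion afterwards.

```python
from difflib import SequenceMatcher
from typing import Callable


def best_tm_match(segment: str, tm: dict[str, str]) -> tuple[float, str]:
    """Return (score in percent, target) for the closest TM source."""
    best_score, best_target = 0.0, ""
    for source, target in tm.items():
        score = 100.0 * SequenceMatcher(None, segment, source).ratio()
        if score > best_score:
            best_score, best_target = score, target
    return best_score, best_target


def suggest(segment: str, tm: dict[str, str], glossary: dict[str, str],
            mt_engine: Callable[[str], str], threshold: float = 75.0) -> dict:
    """TM first, MT as fallback, then a glossary check on the suggestion."""
    score, target = best_tm_match(segment, tm)
    if score >= threshold:
        origin, suggestion = "TM", target
    else:
        origin, suggestion = "MT", mt_engine(segment)
    # Flag glossary terms that occur in the source but whose required
    # target term is missing from the suggestion.
    missing = [
        target_term
        for source_term, target_term in glossary.items()
        if source_term.lower() in segment.lower()
        and target_term.lower() not in suggestion.lower()
    ]
    return {"origin": origin, "score": score,
            "suggestion": suggestion, "missing_terms": missing}
```

The glossary check here only flags problems for the linguist. It does not rewrite the suggestion, which keeps the final decision with a human.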

The Metrics That Actually Matter

Translation memory isn't just a technical feature—it's tightly connected to cost and turnaround time.

Here are the key concepts you'll hear in LSPs (language service providers) and localization teams:

Match Categories

You'll often see match bands like:

100% / 101% / Context matches
95–99%
85–94%
75–84%
New/No match

Each band is usually associated with a discount or weighting factor.

Weighted Word Count

Most LSPs price based on these match categories. A simplified example, with the 75–84% band left out for brevity:

100% matches: counted at 10% of full rate
95–99% matches: 30%
85–94% matches: 60%
New words: 100%

So if your file has:

1,000 new words
500 words in 95–99% matches
500 words in 100% matches

Your weighted word count might be:

1,000 × 1.0 = 1,000
+ 500 × 0.3 = 150
+ 500 × 0.1 = 50
Total: 1,200 weighted words

That's what you'll be quoted on—not the raw 2,000 words.
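
The same arithmetic as a small Python sketch, using the simplified rates from the example above. The band labels and the `weighted_word_count` name are illustrative; actual rate grids vary by vendor and contract.

```python
# Percent of the full rate charged per match band (simplified example rates).
RATE_PERCENT = {"100%": 10, "95-99%": 30, "85-94%": 60, "new": 100}


def weighted_word_count(words_by_band: dict[str, int],
                        rate_percent: dict[str, int] = RATE_PERCENT) -> float:
    """Sum the words in each band, scaled by that band's rate."""
    total = sum(words * rate_percent[band]
                for band, words in words_by_band.items())
    return total / 100


analysis = {"new": 1000, "95-99%": 500, "100%": 500}
weighted = weighted_word_count(analysis)   # 1200.0
raw = sum(analysis.values())               # 2000
```

A band that is missing from the rate table raises a `KeyError`, which is deliberate here: silently pricing an unknown band would hide a configuration mistake.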

Productivity and Turnaround

Throughput depends heavily on how much the TM can contribute:

A linguist might translate 2,000–2,500 new words/day,
but 4,000–5,000+ words/day when many segments are 100% or high fuzzy matches from a good TM.

That's a huge difference in project timelines, especially for large continuous localization programs.
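
A back-of-the-envelope sketch of what that means for a schedule. The 50,000-word volume is a hypothetical figure, and the throughput values are the lower ends of the ranges above; treat the result as a rough planning aid, not a forecast.

```python
def working_days(total_words: int, words_per_day: int) -> float:
    """Days of linguist effort at a given daily throughput."""
    if words_per_day <= 0:
        raise ValueError("words_per_day must be positive")
    return total_words / words_per_day


volume = 50_000                                # hypothetical project size
low_leverage = working_days(volume, 2_000)     # 25.0 days
high_leverage = working_days(volume, 4_000)    # 12.5 days
```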

Common Ways Teams Misuse Translation Memory

TM is powerful, but it's easy to sabotage its value. Here are some classic mistakes.

TM Pollution

TM pollution happens when your memory contains:

Bad translations
Inconsistent terminology
Wrong locales (e.g. mixing pt-BR with pt-PT)
Outdated branding or product names

Once polluted, your TM will:

Suggest bad translations that are hard for linguists to ignore
Propagate mistakes across multiple projects
Lower quality and drive more rework

How to avoid it:

Restrict who can write to the main TM (e.g. only vetted linguists or post-review)
Run periodic TM audits and clean-up sprints
Use client-specific TMs to avoid cross-contamination
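
Part of a TM audit can be automated. The sketch below flags entries stored under the wrong locale or containing outdated names. The dictionary keys (`target`, `target_lang`) and the function name `audit_tm` are assumptions for illustration, and a check like this only catches mechanical issues, not bad translations.

```python
def audit_tm(entries: list[dict], expected_locale: str,
             outdated_terms: list[str]) -> list[tuple[int, str]]:
    """Return (entry index, issue description) for each problem found."""
    issues: list[tuple[int, str]] = []
    for index, entry in enumerate(entries):
        if entry["target_lang"] != expected_locale:
            issues.append((index, f"wrong locale: {entry['target_lang']}"))
        for term in outdated_terms:
            if term.lower() in entry["target"].lower():
                issues.append((index, f"outdated term: {term}"))
    return issues


# Example data (made up): a pt-BR memory with one pt-PT entry mixed in.
entries = [
    {"source": "Sign in", "target": "Iniciar sessão", "target_lang": "pt-PT"},
    {"source": "Sign in", "target": "Fazer login", "target_lang": "pt-BR"},
    {"source": "Open OldBrand", "target": "Abrir OldBrand", "target_lang": "pt-BR"},
]
report = audit_tm(entries, "pt-BR", ["OldBrand"])
```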

Over-Editing Fuzzy Matches

Not every fuzzy match is worth salvaging.

If a linguist spends more time fixing a 75% match than they would translating from scratch, your productivity drops.

Practical tips:

Set clear guidelines: e.g. "Below 80%, you can ignore the match unless it's structurally useful."
Encourage linguists to trust their judgment and not feel forced to use a bad fuzzy match.

Using One TM for Everything

Mixing:

Marketing
Legal
UI strings
Support content

…in the same TM tends to produce messy results, because each content type has its own register and terminology.

Better approach:

Separate TMs by client, product, and content type where it makes sense.
Use TM prioritization in your CAT tool (e.g. "Prefer this product TM over generic TM").
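
Conceptually, TM prioritization is an ordered lookup: the first memory in the list that has a hit wins. This sketch handles exact matches only and uses plain dictionaries as TMs, both simplifying assumptions; CAT tools typically combine priority with match score and penalties.

```python
from typing import Optional


def lookup(segment: str,
           prioritized_tms: list[tuple[str, dict[str, str]]]
           ) -> Optional[tuple[str, str]]:
    """Return (TM name, target) from the highest-priority TM with an exact hit."""
    for name, tm in prioritized_tms:
        if segment in tm:
            return name, tm[segment]
    return None


# Example: the product TM outranks the generic one.
product_tm = {"Start": "Starten"}
generic_tm = {"Start": "Anfang", "Stop": "Stopp"}
order = [("product", product_tm), ("generic", generic_tm)]
```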

No Governance or Documentation

If your team doesn't document:

When to create a new TM
How to clean or merge TMs
How to handle legacy TMs

…you'll end up with a pile of disconnected memories, each half-useful.

A simple TM governance doc goes a long way.

Practical Guidelines for Using TM Effectively

Here's a compact playbook you can adapt to your team.

For Localization Project Managers

During scoping:

Always request existing TMs and glossaries from the client.
Analyze leverage and compute weighted word counts to forecast costs and timelines.

During setup:

Attach the right TM(s) to the project (client-specific, domain-specific).
Make sure write access is limited to vetted linguists or post-editors.
Enable MT only where it makes sense (e.g. no MT for high-risk legal content if the client forbids it).

After delivery:

Run spot checks on the updated TM for obvious issues.
Capture learnings (e.g. "new product name added," "style change") in your knowledge base.

For Translators and Linguists

Treat TM matches as suggestions, not orders.
Always align TM output with:
- Glossary
- Style guide
- Client feedback
Flag obviously wrong TM entries so PMs can clean them.
Don't over-edit good TM matches just to "sound more like you"—consistency matters more than subjective style.

For Localization Engineers / Tooling Owners

Standardize segmentation rules across tools and languages.
Automate TM maintenance where possible:
- Deduplication
- Locale split/merge
- Basic linguistic QA checks
Define and document TM environments:
- Dev TM (experimental)
- Staging TM (client review)
- Production TM (approved, locked)
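
As one example of automated maintenance, here is a minimal deduplication sketch. It assumes each entry is a dictionary with `source`, `target_lang`, and an ISO-formatted `updated` date (hypothetical field names), normalizes whitespace in the source, and keeps the most recently updated entry per source and locale. Whether "newest wins" is the right policy is a governance decision, not a technical one.

```python
def deduplicate(entries: list[dict]) -> list[dict]:
    """Keep the most recently updated entry per (normalized source, locale)."""
    latest: dict[tuple[str, str], dict] = {}
    for entry in entries:
        # Collapse runs of whitespace so "Save  changes" equals "Save changes".
        key = (" ".join(entry["source"].split()), entry["target_lang"])
        # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
        if key not in latest or entry["updated"] > latest[key]["updated"]:
            latest[key] = entry
    return list(latest.values())
```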

Should You Reuse a Legacy TM?

This is a common question when onboarding a new client or starting a product refresh.

Ask yourself:

Has the product or brand changed drastically?
- If yes, maybe create a new TM and selectively migrate good segments.
Are there many known quality issues in the legacy TM?
- If yes, review and clean before reuse.
Is the content type totally different?
- Marketing vs. technical docs may justify separate TMs.

A hybrid strategy often works best:

Use legacy TM as read-only reference.
Build a new, write-enabled TM for the current project.
Gradually merge vetted segments from the old TM into the new one.
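
The gradual merge can be sketched as follows. The legacy TM is only read, never written, and a segment is copied only if a reviewer has put it on the vetted list and the new TM has no entry for it yet. The function name `migrate_vetted` and the dictionary-based TMs are illustrative assumptions.

```python
def migrate_vetted(legacy_tm: dict[str, str], new_tm: dict[str, str],
                   vetted_sources: list[str]) -> int:
    """Copy vetted legacy entries into new_tm. Returns how many were added."""
    migrated = 0
    for source in vetted_sources:
        # Never overwrite work already done in the new TM.
        if source in legacy_tm and source not in new_tm:
            new_tm[source] = legacy_tm[source]
            migrated += 1
    return migrated


# Example with made-up data.
legacy = {"Log in": "Anmelden", "Old feature": "Alte Funktion"}
current = {"Log in": "Einloggen"}
added = migrate_vetted(legacy, current, ["Log in", "Old feature", "Missing"])
```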

Wrapping Up: TM Is a Tool, Not a Magic Bullet

Translation memory is one of the most mature and reliable productivity tools in localization. When used well, it:

Cuts costs
Speeds up delivery
Enforces consistency

But it's not a replacement for good terminology management, style guides, or skilled linguists. Think of it as one layer in a modern localization stack:

TMS/CAT: Orchestration, TM management, workflows
TM: Human-curated memory of previous translations
MT engines: Neural MT for speed and cost efficiency
Glossaries & term bases: Terminology enforcement
QA tools: Automated checks for consistency, formatting, and errors
Human linguists: Final quality, cultural adaptation, creativity

TM is still central—it just doesn't operate alone anymore.

In future posts, we'll dive into related topics like terminology management, QA workflows, and how to design content that maximizes TM reuse from day one.
