The Language Rewrite Question: When Migration Actually Pays Off — and When It Doesn’t

Eleftheria Drosopoulou | March 25th, 2026 | Last Updated: March 26th, 2026

Discord, Figma, Shopify, and Dropbox all documented their rewrites with real outcomes. There is now enough evidence to build a serious decision framework — and to name when a rewrite is an expensive engineering vanity project.

1. The Question That Never Goes Away

Every few years, a team hits a wall with their current stack and someone in the room says the words: “We should just rewrite this in something better.” What follows is usually either one of the best engineering decisions the team ever makes — or a year of missed features and a codebase that somehow ends up with the same problems in a different language.

The frustrating reality is that both outcomes are well-documented and both happen for reasons that were, in retrospect, entirely predictable. Joel Spolsky’s 2000 essay argued that rewriting from scratch is “the single worst strategic mistake that any software company can make,” citing Netscape’s three-year disappearing act while competitors ate their market share. That argument holds real weight and still gets cited in architecture discussions today. But it also predates a generation of carefully documented, scoped, metrics-driven rewrites that succeeded precisely because they did not follow the pattern Spolsky was warning about.

In 2026, there are enough public post-mortems with real numbers to move beyond anecdote. Discord published latency charts with microsecond-level before-and-after comparisons. Figma documented both the wins and the rough edges from their TypeScript-to-Rust migration. Shopify built YJIT, a JIT compiler for Ruby, rather than migrate away from the language, and used it to handle 489 million requests per minute on Black Friday 2025. The evidence is there. What has been missing is a clear framework for interpreting it. This article is that framework.

2. Four Case Studies with Real Outcomes

Table 1 — Four Case Studies with Real Outcomes

Company	Migration	Headline result	Primary motivation	Scope	Outcome
Discord	Go → Rust · 2019	~0 ms GC pauses. Pauses eliminated; average latency moved from milliseconds to microseconds. Rust beat Go on every metric after profiling.	GC pauses every 2 min for users with 1,000+ servers; no in-language fix possible	Read States service only: one bounded component	Clear ROI
Figma	TypeScript → Rust · 2018	10× multiplayer server performance. Single-threaded Node.js was blocking the event loop on large documents.	Node.js single-threaded runtime: a structural constraint, not fixable in-language	Multiplayer hot path only; initial plan to rewrite whole server was dropped	Clear ROI
Shopify	Ruby (stayed) · 2023–25	489M requests/min on Black Friday 2025. Invested in YJIT (a Ruby JIT written in Rust) and pod architecture instead of migrating.	Scaling pressure; chose to invest in current stack rather than migrate away from Rails	No migration; architecture and tooling investment instead	Counter-case
Dropbox	Python → Rust · 2019	“Betting on Rust was one of the best decisions we made.” Correctness and concurrency as primary wins, not just speed.	Correctness: encoding invariants in the type system that Python’s dynamic types could not enforce	File sync engine, bounded with clear interfaces	Clear ROI

3. Discord: Go → Rust — The Clean Success

The Discord rewrite is the most-cited successful language migration in recent engineering history, and it deserves that status — primarily because the problem definition was unusually precise before a single line of Rust was written.

Discord’s Read States service — which tracks which messages each user has read across all their servers — was written in Go. As Discord documented in their own blog post, the problem was specific: Go’s garbage collector needed to scan the entire LRU cache to determine which memory could be freed. As the service scaled to millions of concurrent users with larger and larger caches, this produced GC pauses of 10–50 milliseconds every two minutes. For most users, invisible. For power users with thousands of servers, a perceptible and recurring latency spike.

Crucially, Discord’s engineers had already exhausted in-language solutions. They had tuned GC settings extensively. They could not go further without either changing the problem or changing the language. The result of the Rust rewrite was unambiguous: average response time moved from milliseconds to microseconds. GC pauses were eliminated entirely, because Rust has no garbage collector. CPU and memory usage both improved. And the initial Rust port, written with only basic optimisation effort, already outperformed the hyper-tuned Go version. After further profiling, it beat Go on every single metric.

This case illustrates three conditions that, when present together, make a rewrite compelling: a quantified, specific bottleneck; evidence that the current language cannot solve it within its own paradigm; and a scoped component small enough to rewrite and validate on a short, bounded timeline.
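
The first of those conditions, a quantified bottleneck, is the easiest to check mechanically. The sketch below is one hypothetical way to turn raw latency samples into the kind of one-sentence statement Discord could make about its GC spikes. The function names, the 10 ms spike threshold, and the sample data are all assumptions for illustration, not anything taken from Discord’s tooling.

```python
import statistics


def summarize_latency(samples_ms, spike_threshold_ms):
    """Return p50, p99 and the number of spikes for latency samples in ms."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns 99 cut points: index 49 is p50, index 98 is p99.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    spikes = sum(1 for s in samples_ms if s >= spike_threshold_ms)
    return {
        "count": len(samples_ms),
        "p50_ms": cuts[49],
        "p99_ms": cuts[98],
        "spikes": spikes,
    }


def problem_statement(summary, spike_threshold_ms):
    """Format the summary as a single sentence with numbers attached."""
    return (
        f"{summary['spikes']} of {summary['count']} requests took "
        f">= {spike_threshold_ms} ms "
        f"(p50 {summary['p50_ms']:.1f} ms, p99 {summary['p99_ms']:.1f} ms)"
    )


if __name__ == "__main__":
    # Made-up data: mostly fast requests plus two slow outliers.
    samples = [1.0] * 98 + [40.0, 45.0]
    summary = summarize_latency(samples, spike_threshold_ms=10)
    print(problem_statement(summary, spike_threshold_ms=10))
```

If a team cannot produce a sentence like this from its own metrics, the bottleneck has not yet been quantified.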

The Discord Lesson: The initial port from Go to Rust was completed in approximately six months with a small team. The ROI was immediate and measurable on day one of production deployment. Discord’s own summary: “We don’t think you should rewrite everything in Rust just because.” The qualification is as important as the result.

4. Figma: TypeScript → Rust — The Scoped Win

Figma’s multiplayer server was originally written in TypeScript. It was, as they note in their own post-mortem, “surprisingly good” for years. The problem that eventually forced action was structural to the runtime, not the code: TypeScript runs single-threaded on Node.js. When a slow operation — like encoding a very large Figma document — blocked the event loop, every other document on that worker waited. There was no in-language solution. Single-threaded JavaScript is single-threaded.
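
The same failure mode is easy to reproduce in any single-threaded event loop. The sketch below uses Python’s asyncio as a stand-in for Node.js (an assumption made purely for illustration; Figma’s server was TypeScript). One handler does blocking work, and an unrelated heartbeat task on the same loop is delayed by roughly the length of that work. The names encode_large_document and heartbeat are hypothetical.

```python
import asyncio
import time


async def heartbeat(delays, ticks, interval_s):
    """Record how late each tick fires compared with its schedule."""
    loop = asyncio.get_running_loop()
    for _ in range(ticks):
        scheduled = loop.time() + interval_s
        await asyncio.sleep(interval_s)
        delays.append(loop.time() - scheduled)


async def encode_large_document(block_s):
    """Stand-in for a slow synchronous encode: it never yields to the loop."""
    time.sleep(block_s)  # blocks the whole event loop, not just this task


async def run_demo(block_s=0.2):
    """Return the worst heartbeat delay seen while the encode was running."""
    delays = []
    ticker = asyncio.create_task(heartbeat(delays, ticks=3, interval_s=0.01))
    await asyncio.sleep(0)  # give the heartbeat a chance to start
    await encode_large_document(block_s)
    await ticker
    return max(delays)


if __name__ == "__main__":
    worst = asyncio.run(run_demo())
    print(f"worst heartbeat delay: {worst * 1000:.0f} ms")
```

The heartbeat is scheduled every 10 ms, yet its worst delay tracks the blocking call rather than its own interval. That is the structural constraint: no amount of tidying inside the handler helps while the work stays on the one thread.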

The decision to rewrite in Rust was a deliberate scope limitation. As Evan Wallace, Figma’s co-founder and original author of the multiplayer protocol, wrote directly: “Our multiplayer server is a small amount of performance-critical code with minimal dependencies, so rewriting it in Rust even with the issues that came up was a good tradeoff for us. It enabled us to improve server-side multiplayer editing performance by an order of magnitude.” The key phrase is “small amount of performance-critical code with minimal dependencies.” That sentence describes the rewrite. Not the whole product — one service, one hot path, tight scope.

Figma’s post is notable also for its honesty about what went wrong. Rust’s ecosystem at the time was less mature than today. Two compression libraries they tried had correctness bugs that would have caused data loss. The async API (futures) had ergonomic issues that made them abandon Rust for some network handlers and fall back to C via FFI for compression. These are the kinds of friction costs that rarely make it into celebration blog posts. Furthermore, Figma explicitly dropped their initial plan to rewrite the whole server in Rust, choosing instead to focus solely on the performance-sensitive part.

5. Shopify: The Rewrite Avoided — The Most Important Counter-Case

The Shopify story is arguably more instructive than the Discord or Figma rewrites, precisely because it is the story of what happens when a team resists the rewrite impulse and invests in the existing stack instead.

Shopify runs Ruby on Rails. Their monolith dates to the mid-2000s. For years, the standard engineering consensus has been that Ruby doesn’t scale, a claim that Shopify has been methodically refuting at increasing scale. Rather than migrate away from Ruby, Shopify made several strategic investments. They built YJIT, a Just-in-Time compiler for Ruby written in Rust, which now ships with CRuby and delivered 15%+ throughput improvements for Rails applications globally. They adopted Sorbet, the static type checker for Ruby originally developed at Stripe. They invested in a pod-based sharded deployment architecture that isolates failure domains. They built Ruby LSP for first-class editor intelligence.

The outcome is documented: on Black Friday 2025, Shopify’s Rails monolith processed $14.6 billion in merchant sales, handling peak loads of 489 million requests per minute on the edge and over 53 million database queries per second. Not despite staying on Ruby. Because of the depth of investment they made in that stack instead of migrating away from it.

The Shopify Principle: Shopify’s engineering leadership made a deliberate decision to treat Ruby and Rails as “100-year tools” and invest accordingly. The cost of that decision was saying no to the migration conversation for a decade. The payoff was one of the most resilient, high-throughput Ruby deployments in existence. The principle generalises: before asking “what language should we migrate to,” ask “what would happen if we invested this same engineering budget into making the current stack as good as it could be?”

6. Dropbox: Python → Rust — The Hybrid Route

Dropbox’s file sync engine rewrite is a less-discussed but important data point because the primary stated motivation was not performance — it was correctness and concurrency safety. The Dropbox engineering team’s own conclusion was that “Rust has been a force multiplier for our team, and betting on Rust was one of the best decisions we made. More than performance, its ergonomics and focus on correctness has helped us tame sync’s complexity. We can encode complex invariants about our system in the type system and have the compiler check them for us.”

This represents a third rewrite motivation pattern distinct from both Discord and Figma. Discord rewrote for GC-latency elimination. Figma rewrote for concurrency headroom in a single-threaded runtime. Dropbox rewrote to encode correctness guarantees that Python’s dynamic type system could not provide, in a system where concurrency bugs only appeared under load in production. The Rust ownership model and type system became their bug-prevention infrastructure — a different class of ROI that does not show up in a latency chart.

7. What the Data Actually Shows

Rewrite outcome patterns across documented cases

Categorised outcomes from 12 publicly documented language migrations (2017–2025). Source: engineering blog posts, conference talks, post-mortems. “Partial” = some services migrated; “Full” = complete language replacement; “Hybrid” = Rust/Go embedded in existing stack.

Several patterns emerge clearly from the documented cases. First, the most consistent predictor of a successful rewrite is the narrowness of the initial scope. Every successful case — Discord, Figma, Dropbox, npm’s registry rewrite, Cloudflare’s network-critical services, AWS Firecracker — involved a bounded component with clear before-and-after measurement criteria. Every documented failure or regret involves a broader scope: a full product rewrite, a migration without a specific quantified problem, or a team that spent the rewrite timeline losing product velocity to competitors.

Second, the motivation matters as much as the destination language. Rewrites driven by a specific, measurable problem — GC latency, single-threaded bottlenecks, memory safety bugs in security-critical code — have a significantly better track record than rewrites driven by “the codebase is messy” or “the new language has better ergonomics.” The latter reasons are not invalid, but they do not produce the kind of measurable ROI that justifies the migration cost.

8. The Anatomy of a Vanity Rewrite

The term “engineering vanity project” deserves a precise definition rather than being used as a vague insult. A vanity rewrite has some or all of these characteristics, and it is worth naming them plainly.

Table 2 — Signs the ROI is Real vs. Signs it is a Vanity Project

Signs the ROI is real	Signs it is probably a vanity project
You can state the specific bottleneck in one sentence, with a number attached	The primary motivation is “the new language is cleaner / more modern”
You have already exhausted in-language optimisations and can prove it	Nobody can define what “done” looks like in measurable terms
The scope is a single component or hot path, not the whole system	The scope keeps expanding as work progresses
Success criteria are defined before any migration code is written	The team needs to learn the target language during the migration
The team migrating is already productive in the target language	There is no plan for running old and new in parallel
The current language cannot solve the problem within its own paradigm	The problem could be solved with profiling and targeted refactoring
You can run old and new in parallel and do a measured comparison	The timeline is measured in months before any production validation
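
The last row on the left is the one most teams skip, so it is worth seeing how little machinery it needs. The following is a minimal sketch of a shadow comparison, assuming both implementations can be called as plain functions on the same request. shadow_compare and the two toy implementations are hypothetical names, and a real harness would sample live traffic rather than loop over a list.

```python
import time


def shadow_compare(old_impl, new_impl, requests):
    """Run both implementations on each request; collect mismatches and timings."""
    mismatches = []
    old_seconds = 0.0
    new_seconds = 0.0
    for req in requests:
        t0 = time.perf_counter()
        old_result = old_impl(req)
        t1 = time.perf_counter()
        new_result = new_impl(req)
        t2 = time.perf_counter()
        old_seconds += t1 - t0
        new_seconds += t2 - t1
        # The old result stays authoritative; the new one is only compared.
        if old_result != new_result:
            mismatches.append((req, old_result, new_result))
    return {
        "mismatches": mismatches,
        "old_seconds": old_seconds,
        "new_seconds": new_seconds,
    }


def old_sum(n):
    """Toy 'legacy' implementation: sum of 0..n by iteration."""
    return sum(range(n + 1))


def new_sum(n):
    """Toy 'rewritten' implementation: closed-form sum of 0..n."""
    return n * (n + 1) // 2


if __name__ == "__main__":
    report = shadow_compare(old_sum, new_sum, range(1000))
    print(len(report["mismatches"]), report["old_seconds"], report["new_seconds"])
```

The mismatch list is what protects the edge-case knowledge embedded in the old code; the timings are what let the team state the ROI with a number.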

Notably, the “messy code” motivation that Joel Spolsky targeted in 2000 is still the most common driver of failed rewrites. As he pointed out, and as many later retrospectives have echoed: working code, however messy, contains years of hard-won knowledge about edge cases, weird inputs, and failure modes that the team has encountered and handled. A rewrite does not preserve that knowledge; it discards it. The new codebase starts with clean architecture and rediscovers every one of those edge cases in production.

The Hidden Cost Nobody Budgets For: Development velocity typically drops 30–50% during the first 3–6 months of Rust adoption for teams coming from garbage-collected languages. Senior engineers who are productive in Go or Python find themselves debugging lifetime annotations instead of shipping features. This is not a reason to avoid Rust when it is the right tool — it is a cost that must be explicitly budgeted and communicated to stakeholders before the migration starts, not discovered mid-project.
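
To make that budget concrete, the arithmetic can be sketched in a few lines. This assumes the 30–50% drop and the 3–6 month window quoted above apply uniformly across the whole team, which is a simplification; onboarding_cost and cost_range are illustrative names.

```python
def onboarding_cost(engineers, months, velocity_drop):
    """Engineer-months of output lost while the team ramps up."""
    if not 0.0 <= velocity_drop <= 1.0:
        raise ValueError("velocity_drop must be a fraction between 0 and 1")
    return engineers * months * velocity_drop


def cost_range(engineers):
    """Best and worst case using the 30-50% drop over 3-6 months."""
    low = onboarding_cost(engineers, months=3, velocity_drop=0.30)
    high = onboarding_cost(engineers, months=6, velocity_drop=0.50)
    return low, high


if __name__ == "__main__":
    low, high = cost_range(engineers=5)
    print(f"expected loss: {low:.1f} to {high:.1f} engineer-months")
```

For a five-person team the range works out to 4.5 to 15 engineer-months of lost output (5 × 3 × 0.30 and 5 × 6 × 0.50). That is the figure to put in front of stakeholders before the migration starts.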

9. The Decision Framework

The following framework is derived from the patterns across successful and failed rewrites. Work through the questions in order. The earlier you hit a stopping condition, the more clearly the data suggests the rewrite will not pay off.

Table 3 — Decision Framework: Work Through in Order

#	Question	What to look for	If no / unsure
01	Can you state the specific problem in one sentence with a measurable number?	“GC pauses cause 40 ms spikes every 2 min for users with 1,000+ servers” — not “the code is slow.”	Stop
02	Have you demonstrably exhausted in-language solutions?	Profiling, algorithmic improvements, infrastructure scaling, and GC tuning all tried and failed.	Stop
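03	Is the problem inherent to the language’s model, not just your code?	Go’s runtime is garbage-collected, and the collector cannot realistically be switched off in a long-running service. Node.js runs JavaScript on a single-threaded event loop. CPython’s GIL limits CPU parallelism across threads. These are model constraints; a slow algorithm is not.	Stop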
04	Is the scope a single bounded component?	Discord’s Read States service. Figma’s multiplayer path. Dropbox’s sync engine. Not “the backend” or “the monolith.”	Risky
05	Can you run old and new in parallel and measure the difference?	A/B deployment is the only way to validate ROI on day one. Without it, you cannot prove the migration paid off.	Risky
06	Is the team already productive in the target language, or has learning budget been explicitly allocated?	Rust’s 30–50% velocity drop in the first 3–6 months is real and documented. Budget it before you start or it will look like project failure.	Budget first
07	Is the business stable enough to absorb the feature velocity cost?	Pre-PMF startups should almost never rewrite. The bottleneck you are solving today may be in a component you pivot away from next quarter.	Yes to all → proceed
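
Because the framework is an ordered checklist, it can be written down as a small function. The sketch below is one possible encoding, not an official scoring tool. The field names are hypothetical, and since the table gives no explicit outcome for a "no" on question 07, treating it as a reason to wait is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RewriteCase:
    """Answers to the seven questions; every field name is illustrative."""
    quantified_problem: bool         # 01
    in_language_exhausted: bool      # 02
    language_model_constraint: bool  # 03
    bounded_scope: bool              # 04
    parallel_run_possible: bool      # 05
    team_ready_or_budgeted: bool     # 06
    business_can_absorb_cost: bool   # 07


def evaluate(case):
    """Walk the questions in order and return (verdict, notes)."""
    # Questions 01-03 are hard stops: the first "no" ends the evaluation.
    hard_stops = [
        (case.quantified_problem, "01: no one-sentence problem with a number"),
        (case.in_language_exhausted, "02: in-language fixes not exhausted"),
        (case.language_model_constraint, "03: not a language-model constraint"),
    ]
    for answer, reason in hard_stops:
        if not answer:
            return "stop", [reason]

    # Questions 04-07 do not end the walk; they accumulate warnings.
    warnings = [
        (case.bounded_scope, "04: risky, scope is not one bounded component"),
        (case.parallel_run_possible, "05: risky, no parallel run to measure"),
        (case.team_ready_or_budgeted, "06: budget the learning curve first"),
        # Assumption: the table lists no outcome for a "no" on question 07.
        (case.business_can_absorb_cost, "07: wait, business cannot absorb the cost"),
    ]
    notes = [reason for answer, reason in warnings if not answer]
    return ("proceed" if not notes else "not yet"), notes


if __name__ == "__main__":
    case = RewriteCase(True, True, True, True, True, False, True)
    print(evaluate(case))
```

Encoding the order matters more than the code itself: a "no" on any of the first three questions ends the conversation before scope, staffing, or business timing are even discussed.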

10. Language Fit Guide: Rust vs Go vs TypeScript

Scenario	Best fit	Why	Avoid if
GC-pause latency in hot path	Rust	No GC, deterministic memory release; the pattern Discord proved in production	Team has no Rust experience; budget the 3–6 month onboarding cost first
Python/Ruby service hitting CPU ceiling	Rust or Go	Either language will dramatically improve CPU throughput; Go is lower learning curve	The bottleneck is actually I/O or database — optimise that first
Memory safety bugs in security-critical code	Rust	Ownership model eliminates entire classes of CVEs at compile time; Microsoft, AWS pattern	The codebase is not security-critical — the ROI requires a security threat model to justify
High-concurrency microservice or CLI tool	Go	Goroutines, lower learning curve than Rust, single binary deployment, strong stdlib	You need guaranteed-zero GC pauses; Go’s GC is low-latency but remains part of the runtime
Gradually typing a JavaScript codebase	TypeScript	No language migration — incremental adoption, same runtime, immediate IDE feedback	You want runtime performance gains — TypeScript compiles to JS, no runtime difference
Legacy Ruby/Python — scaling pressure but working product	Stay + invest	Shopify pattern: invest in JIT, type checkers, and architecture before migrating	You’ve exhausted this path and still have a specific, measured language-model constraint
Proof of concept / startup, pre-PMF	Don’t rewrite	Pivot risk makes migration cost unrecoverable; performance rarely limits growth at this stage	You have a specific security or safety requirement that genuinely requires a different language

The rewrite velocity curve: scoped component vs full system

Illustrative development velocity (relative units) during and after a rewrite. Scoped component rewrites recover velocity quickly; full-system rewrites create a prolonged valley. Based on documented case study patterns.

11. What We Have Learned

The language rewrite question has a real answer in 2026 — it is just not the same answer for everyone. Here is the distilled version of what the documented evidence shows:

Scope is everything. Every successful documented rewrite was a bounded component, not a full system. Discord’s Read States service. Figma’s multiplayer hot path. Dropbox’s sync engine. When the scope expands to “the backend,” the failure rate climbs sharply.
The problem must be measurable and language-model-specific. GC pauses, single-threaded event loops, GIL limitations — these are language-model constraints. Slow algorithms and messy code are not. Only the former justify a migration.
The counter-case (Shopify) is as instructive as the success cases. Investing in the existing stack — JIT compilers, type checkers, better architecture — often delivers more ROI than a migration, and it preserves years of accumulated production knowledge.
Rust’s velocity cost is real. A 30–50% productivity drop during the first 3–6 months is documented consistently. It is not a reason to avoid Rust when it is the right tool. It is a cost that must be planned for explicitly.
Rust is the right answer for: GC-pause elimination, memory safety in security-critical paths, and correctness guarantees in complex concurrent systems. Go is the right answer for: high-concurrency services, developer teams that need faster onboarding, and CLI tools requiring a single static binary.
TypeScript is almost never a language migration. It is an incremental typing layer on existing JavaScript. Conflating it with a Rust or Go rewrite overstates both its cost and its benefit.
Joel Spolsky was right about full rewrites and wrong about scoped ones. The distinction he missed — which the 2020s evidence makes clear — is that a rewrite of one bounded component is categorically different from a full-system rewrite. The former can be rational. The latter almost never is.
