
Strategy • Engineering

When Should a Startup Rewrite Its Codebase?

Almost never. But sometimes you genuinely have no choice. Here is how to tell the difference, and how to do it without killing your company.

Mike Tempest • 10 min read

The Rewrite Temptation

Every startup reaches the moment. Deployment takes three hours. A simple feature change breaks two unrelated things. Your engineers are spending more time working around the existing code than building anything new. Someone in a standup says the words: "We should just rewrite this from scratch."

It is a seductive idea. Start fresh. Do it properly this time. Use modern tools, clean architecture, all the things you wish you had done from the beginning. The fantasy is compelling because it is simple. The old system is messy. A new system would be clean. Therefore, rewrite.

The problem is that rewrites are one of the most reliably destructive decisions a startup can make. Joel Spolsky called it "the single worst strategic mistake that any software company can make." Netscape did it and nearly died. Basecamp's founders have written extensively about why they resist the temptation. The graveyard of startups that rewrote their way into oblivion is vast and well populated.

That said, sometimes a rewrite genuinely is the right call. I have seen codebases where refactoring was throwing good money after bad, where the architecture was so fundamentally wrong that no amount of incremental improvement would get you where you needed to go. At Risika, we made hard calls about which parts of the system to rebuild and which to evolve, and getting that distinction right was critical to reaching profitability.

The question is not "should we ever rewrite?" The question is "how do we know when refactoring is enough and when it genuinely is not?"

When Refactoring Is Enough

Most of the time, your codebase does not need a rewrite. It needs discipline.

Refactoring means improving the internal structure of your code without changing what it does. It is less exciting than a rewrite, but it is dramatically less risky. You keep shipping features. You keep your customers happy. You improve things incrementally, and at every step you have a working system.

Refactoring is the right approach when:

The problems are localised

If the pain is concentrated in specific modules or services rather than spread across the entire system, you can fix those modules without touching everything else. Most "our whole codebase is terrible" complaints, when you actually look, turn out to be "three specific areas are terrible and the rest is fine."

The architecture is fundamentally sound

If your system's overall structure makes sense for what you are building, messy code within that structure is a refactoring problem. Bad variable names, duplicated logic, missing tests, inconsistent patterns: these are all fixable without starting over. The architecture is the structure of the building. Bad code is the clutter inside it. You can declutter without demolishing the building.

Your tech stack is still viable

If your language, framework, and core dependencies are still actively maintained with a healthy ecosystem, there is no technology reason to rewrite. Rails, Django, Express, Spring: these are not going anywhere. Your stack being "old" is not a reason to rewrite. Your stack being unmaintained, with no security patches and a shrinking talent pool, might be.

Engineers can still make changes safely

If your team can add features and fix bugs without an unreasonable rate of regressions, the system is still workable. "Unreasonable" is doing real work there. Some bugs are normal. Shipping a feature and breaking three unrelated things every time is not. If changes can still be made with reasonable confidence, refactor.

The technical debt trap that catches most Series A startups is not that they have too much debt. It is that they assume the only way to address it is a dramatic, expensive rewrite when steady refactoring would solve the problem at a fraction of the cost and risk.

The Three Legitimate Reasons to Rewrite

If you have genuinely exhausted refactoring and the problems persist, there are three scenarios where a rewrite becomes defensible.

1. The architecture fundamentally cannot support your next stage

This is the most common legitimate reason. Your system was designed for 100 users and you need to serve 100,000. Your monolith needs to process events in real time but it was built as a batch system. Your data model assumed one market and you are expanding to five with completely different regulatory requirements.

The key word is "fundamentally." If you chose a reasonable tech stack and the architecture was sensible for the stage you were at when you built it, the question is whether that architecture can evolve to meet your new requirements. Sometimes it can. A monolith can often be decomposed incrementally. A relational database can handle more scale than most startups realise. But sometimes the gap between where you are and where you need to be is too wide for incremental change.

At RefME, we had to make exactly this kind of decision as the product scaled from thousands to millions of users. The original architecture was perfectly reasonable for the early stage, but some components simply could not stretch to handle the load patterns we were seeing.

2. The tech stack is genuinely dying

Not "unfashionable." Dying. There is a critical difference. PHP is unfashionable. It also powers a significant percentage of the web and has a thriving ecosystem. ColdFusion is dying. Silverlight is dead. If your system is built on technology where you cannot hire engineers, cannot get security patches, and cannot find libraries for basic functionality, you have a genuine technology problem.

The test is practical: can you hire competent engineers who want to work with this stack at a price you can afford? If the answer is no and the talent pool is shrinking rather than growing, the technology is not just unfashionable. It is a business risk.

3. The codebase has become genuinely unmaintainable

This is the rarest legitimate reason, but it does happen. Usually in codebases that grew for years without any technical leadership, where multiple agencies built different parts with no coordination, or where a revolving door of contractors each brought their own patterns and left without documentation.

The symptom is not "the code is ugly." Ugly code can be refactored. The symptom is that nobody on the team can confidently explain what the system does. Changes that should take a day take a week because the side effects are unpredictable. The test suite, if one exists, provides no confidence because it tests the wrong things. A technical audit reveals that the system's actual behaviour has diverged so far from anyone's understanding of it that refactoring is essentially guesswork.

Even in this scenario, a targeted rewrite of the worst components is almost always better than a full system rewrite. Which brings us to the strangler fig.

The Strangler Fig Pattern: How to Rewrite Without the Risk

If you have decided a rewrite is necessary, how you do it matters more than whether you do it. And the answer, in almost every case, is the strangler fig pattern.

The name comes from a type of fig tree that grows around an existing tree, gradually replacing it until the original tree is gone. In software, the pattern works the same way. Instead of stopping everything to rebuild the system from scratch, you build new components alongside the old system and gradually migrate traffic and functionality from old to new.

Here is how it works in practice:

Step 1: Identify the boundaries

Map your system into components or domains. Identify which ones are causing the most pain and which have the cleanest interfaces with the rest of the system. The components with clear boundaries are the easiest to extract and replace first.

Step 2: Build new alongside old

Build the replacement component as a new service or module that runs alongside the existing system. Route a small percentage of traffic to the new component. Compare outputs. Fix discrepancies. Gradually increase the traffic split until you are confident the new component works correctly.

Step 3: Migrate and retire

Once the new component is handling all traffic reliably, remove the old component. Then move on to the next one. Repeat until the old system is gone, or, more commonly, until the remaining parts of the old system are stable enough that you stop caring about them.
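To make steps 2 and 3 concrete, here is a minimal sketch of the routing and comparison logic in Python. Everything in it is hypothetical and for illustration only: the `StranglerRouter` class, the `new_share` parameter, and the two quote handlers are names I have made up, and the sketch assumes both components are read-only so that calling both for the same request is safe. Components with side effects need a more careful comparison than this.

```python
import random
from typing import Callable, Optional

Handler = Callable[[dict], dict]


class StranglerRouter:
    """Sends a share of requests to the new component and compares outputs.

    Assumption: both handlers are read-only, so running both for the same
    request is safe.
    """

    def __init__(self, old: Handler, new: Handler, new_share: float = 0.0,
                 rng: Optional[random.Random] = None) -> None:
        if not 0.0 <= new_share <= 1.0:
            raise ValueError("new_share must be between 0.0 and 1.0")
        self.old = old
        self.new = new
        self.new_share = new_share
        self.rng = rng if rng is not None else random.Random()
        self.mismatches: list = []  # (request, old_result, new_result)

    def handle(self, request: dict) -> dict:
        old_result = self.old(request)
        # random() is in [0.0, 1.0): a share of 0.0 never picks the new
        # component and a share of 1.0 always does.
        if self.rng.random() >= self.new_share:
            return old_result
        try:
            new_result = self.new(request)
        except Exception as exc:
            # A crash in the new component counts as a discrepancy.
            self.mismatches.append((request, old_result, repr(exc)))
            return old_result
        if new_result != old_result:
            # Record the discrepancy and keep serving the trusted answer.
            self.mismatches.append((request, old_result, new_result))
            return old_result
        return new_result


# Hypothetical handlers. Prices are in pence to avoid float rounding.
def legacy_quote(request: dict) -> dict:
    net = request["net"] - request.get("discount", 0)
    return {"total": net + net * 20 // 100}


def new_quote(request: dict) -> dict:
    # Deliberately misses the discount edge case the old code handled.
    net = request["net"]
    return {"total": net + net * 20 // 100}


router = StranglerRouter(legacy_quote, new_quote, new_share=1.0)
router.handle({"net": 1000})                   # outputs agree
router.handle({"net": 1000, "discount": 100})  # outputs differ
print(len(router.mismatches), "discrepancy to fix before retiring the old code")
```

The design choice worth noting is that the old component stays the source of truth until the mismatch list stays empty at a full traffic share. Only then does removing the old code become a low-risk change.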

The strangler fig approach has three massive advantages over a big bang rewrite. First, you can stop at any point and still have a working system. If priorities change, if funding gets tight, if a competitor forces you to shift focus, you are not stuck halfway through a rewrite with nothing to show for it. Second, you keep shipping features throughout the process. Third, you learn as you go. The first component you migrate teaches you things that make the second migration faster and better.

The big bang rewrite, where you freeze everything, rebuild from scratch, and flip the switch on launch day, sounds cleaner but almost always goes wrong. It takes longer than expected. The new system has different bugs than the old system. Edge cases that were handled by obscure code in the old system are missed in the new one. And during the entire rebuild, your competitors are shipping features while you are rebuilding something you already had.

A Decision Framework: Refactor, Rewrite, or Leave It Alone

Five questions to cut through the noise and make the right call.

1. What is the actual business problem?

"The code is messy" is not a business problem. "We cannot ship the features we need to close enterprise deals" is. "Our deployment process is slow" is not a reason to rewrite. "We are losing customers because we cannot fix critical bugs quickly enough" might be. Start with the business impact, not the engineering complaint. If you cannot articulate a concrete business problem that the rewrite solves, you do not need a rewrite.

2. Have you actually tried refactoring?

Not "talked about refactoring." Actually done it. Dedicated engineering time to improving the worst parts of the system. Added tests. Cleaned up the most problematic modules. Many teams jump to "we need a rewrite" without spending a single sprint on disciplined refactoring. Try it first. Give it a genuine effort, at least a quarter of focused work, and measure whether things improve. You might be surprised.

3. Can you articulate what would be different?

If the answer to "what would the new system look like?" is vague, you are not ready for a rewrite. You need to be specific about which architectural decisions would change and why. If you cannot explain exactly what was wrong with the old architecture and exactly how the new one solves those problems, you will make the same mistakes again with fresher code.

4. Can you afford the true cost?

Not the engineering estimate. The true cost, including slowed feature development, team morale during a long migration, potential customer churn from delayed improvements, and the opportunity cost of everything else you could be building instead. If the rewrite consumes more than 30% of your engineering capacity for more than six months, you need to be very sure the payoff justifies it.

5. Is this the engineers talking or the business?

Engineers have a natural bias towards rewrites. New code is more fun than maintaining old code. New technologies are more exciting than legacy systems. This does not mean engineers are wrong, but it means you should pressure-test the recommendation. If only the engineering team wants the rewrite and the business cannot articulate a concrete benefit, be sceptical.
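If it helps to see the five questions side by side, they can be written down as a simple checklist. The Python sketch below is only one way of encoding them: the `RewriteAssessment` fields and the `open_concerns` function are hypothetical names, and the only thresholds used are the ones stated above (a quarter of refactoring, 30% of capacity, six months). The answers themselves remain judgement calls. The code only makes the unanswered ones visible.

```python
from dataclasses import dataclass


@dataclass
class RewriteAssessment:
    business_problem: str              # Q1: leave empty if none can be stated
    refactoring_quarters_tried: float  # Q2: quarters of focused refactoring
    new_design_is_specific: bool       # Q3: can you say what would change and why?
    capacity_share: float              # Q4: fraction of engineering capacity used
    duration_months: float             # Q4: expected length of the rewrite
    business_sees_benefit: bool        # Q5: is it more than the engineers asking?


def open_concerns(a: RewriteAssessment) -> list:
    """Returns the questions that still argue against a rewrite."""
    concerns = []
    if not a.business_problem.strip():
        concerns.append("No concrete business problem stated")
    if a.refactoring_quarters_tried < 1:
        concerns.append("Less than a quarter of focused refactoring tried")
    if not a.new_design_is_specific:
        concerns.append("New architecture is still vague")
    if a.capacity_share > 0.30 and a.duration_months > 6:
        concerns.append("Over 30% of capacity for over six months")
    if not a.business_sees_benefit:
        concerns.append("Only engineering wants the rewrite")
    return concerns


assessment = RewriteAssessment(
    business_problem="Cannot ship the features needed to close enterprise deals",
    refactoring_quarters_tried=0,
    new_design_is_specific=True,
    capacity_share=0.4,
    duration_months=9,
    business_sees_benefit=True,
)
for concern in open_concerns(assessment):
    print(concern)
```

An empty list does not mean "rewrite". It means none of the five questions has ruled one out yet.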

What a Rewrite Actually Costs

Founders consistently underestimate rewrite costs because they only count engineering time. The real cost is far broader than that.

Direct engineering cost

The most visible cost. Take whatever your engineering team estimates and multiply it by two to three. This is not pessimism. It is the consistent pattern across every rewrite I have seen. The first 80% goes roughly to plan. The last 20%, the edge cases, the data migrations, the integrations, the things nobody remembered, takes as long as the first 80%.

Feature development slowdown

Even with the strangler fig approach, a rewrite consumes engineering capacity that would otherwise go towards features. Your competitors do not pause while you rebuild. The features your sales team is promising to close deals still need to ship. You need to budget for reduced feature velocity and have honest conversations with your commercial team about what that means.

Team morale and attrition

Rewrites sound exciting at the start. Six months in, when you are still migrating data and fixing subtle differences between old and new behaviour, the excitement has worn off. Long rewrites cause engineer burnout and attrition. Losing a key engineer mid-rewrite can set you back months, because the replacement needs to understand both the old system and the new one.

Opportunity cost

This is the cost founders miss most often. Every month your engineers spend on a rewrite is a month they are not spending on the thing that actually grows your business. If you are pre-Series A, that opportunity cost could be the difference between hitting your metrics for fundraising and missing them. If you are post-Series A, it could be the difference between hitting profitability targets and needing a bridge round.

None of this means a rewrite is never worth it. Sometimes the cost of not rewriting is higher: you cannot serve enterprise customers, you cannot enter a new market, you cannot meet regulatory requirements. But you need to be honest about the full cost and compare it honestly against the alternatives.
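The arithmetic behind the direct cost is worth doing explicitly before the conversation with your commercial team. The Python sketch below applies the two-to-three times multiplier from above to an engineering estimate and converts the result into engineer-months diverted from feature work. The function names, the team of ten, and the 30% capacity share are illustrative assumptions, not figures from a real project.

```python
def realistic_duration_months(estimate_months: float) -> tuple:
    """Applies the two-to-three times multiplier to an engineering estimate."""
    if estimate_months <= 0:
        raise ValueError("estimate_months must be positive")
    return (estimate_months * 2, estimate_months * 3)


def diverted_engineer_months(team_size: int, capacity_share: float,
                             duration_months: float) -> float:
    """Engineer-months spent on the rewrite instead of on features."""
    return team_size * capacity_share * duration_months


# Illustrative inputs: a six-month estimate, ten engineers, 30% of capacity.
low, high = realistic_duration_months(6)
print(f"Plan for {low} to {high} months, not 6")
print(f"Diverted effort: {diverted_engineer_months(10, 0.3, low):.0f} "
      f"to {diverted_engineer_months(10, 0.3, high):.0f} engineer-months")
```

This covers only the direct engineering cost. Morale, attrition, and opportunity cost do not reduce to a formula, which is part of why they get missed.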

Before You Decide: Get an Independent Assessment

The worst rewrite decisions are made in a vacuum. Your engineers are too close to the code to be objective. You, as a non-technical founder, cannot assess the code yourself. The result is that you either blindly trust the engineering team's recommendation or blindly resist it. Neither approach is good.

Before committing to a rewrite, get an independent technical audit. Someone who has no emotional investment in the code, no preference for a particular technology, and no incentive to make work for themselves. A good audit will tell you:

What is actually wrong. Not "the code is bad" but specifically which architectural decisions are causing which business problems.

What can be fixed incrementally. Often more than people think.

What genuinely requires a rewrite. Often less than people think.

How to sequence the work. Which components to tackle first for maximum business impact.

As a Fractional CPTO, this is one of the most valuable things I do for founders. Not building the new system or maintaining the old one, but providing an honest, experienced assessment of whether a rewrite is truly necessary and, if so, how to do it without betting the company.

The rewrite question is not really a technical question. It is a business question. How much risk can you afford? How long can you tolerate reduced feature velocity? What happens if it takes twice as long as planned? Answer those questions honestly, and the technical decision usually becomes clear.

Considering a rewrite? Get clarity first.

A technical audit from a Fractional CPTO can tell you whether you need a rewrite, a refactor, or simply better engineering practices. Honest assessment, no agenda.

Frequently Asked Questions

How long does a full codebase rewrite typically take?

Most rewrites take two to three times longer than the original estimate. A system that took six months to build will typically take 12 to 18 months to rewrite properly, because you are rebuilding all the edge cases and business logic that accumulated over time. If someone tells you they can rewrite your system in three months, they are underestimating the complexity hiding in the existing code.

Can we keep shipping features during a rewrite?

This is the central challenge of any rewrite. If you use the strangler fig pattern, yes, because you are replacing the system piece by piece while the old system keeps running. If you attempt a big bang rewrite where you stop everything and rebuild from scratch, you will effectively freeze feature development for months. That is why the strangler fig approach is almost always the right choice for startups that need to keep shipping.

How do I know if our codebase needs a rewrite or just refactoring?

Start with a technical audit. If engineers can still make changes safely, if the architecture supports your current scale, and if the tech stack has an active ecosystem, refactoring is almost certainly the better path. A rewrite is only justified when the fundamental architecture cannot support where you need to go, the tech stack is genuinely dying, or the codebase has become so tangled that every change breaks something unrelated.

What is the strangler fig pattern?

The strangler fig pattern is a migration strategy named and popularised by Martin Fowler. Instead of replacing your entire system at once, you build new functionality in a new system alongside the old one, then gradually redirect traffic and features from old to new. Over time, the new system 'strangles' the old one until nothing depends on it any more. It is lower risk than a big bang rewrite because you can stop at any point and still have a working system.

Should we change our tech stack during a rewrite?

Only if the current tech stack is one of the reasons for the rewrite. Changing your tech stack adds enormous risk because your team needs to learn new tools while simultaneously rebuilding a complex system. If the rewrite is driven by architectural problems rather than technology problems, keep the same stack and focus on getting the architecture right. You can always migrate technologies later once the new architecture is stable.

Mike Tempest


Fractional CPTO

Mike is a Fractional CPTO helping UK startups make better technology decisions. With experience scaling products from zero to millions of users at Risika and RefME, he brings commercial thinking to technical decisions. Book a free day at fcto.uk/free-day.
