
A Stitch In Time: Understanding seams in software system migration

Written by Adam Lammiman | May 19, 2025 2:05:33 PM

In my first article I talked about legacy systems and the problems many businesses face. I compared 'Artifact' orientated systems, which focus on minimising change until a sudden big bang shift is forced, with 'Process' orientated systems, which embrace gradual, continuously managed evolution.

This article is aimed at those who feel they are stuck in the former and want to move to the latter but don’t quite know what steps to take.

We’re going to look at some tools that everyone facing a legacy migration nightmare should have available to them: the Strangler Pattern, identifying value, and seams.

Before we explore those, let’s first make sure we understand the problems inherent in the ‘Big Bang’ approach and what makes it so problematic and risky.

Denying Big Bang Theory

First and foremost is scope. Often half the problem with migrating a legacy system is that no one really knows exactly what it does or how it does it, so what seems straightforward soon becomes a morass of confusion and risk. How do you verify something if no one can remember all of the things it’s supposed to do?

This feeds into the second problem: cost. What was supposed to be a 2 month project balloons into 6 months, then a year, as you’re forced to painfully analyse and reconstruct developer design decisions that are 5, 10 or even 20 years old.

This is why a ‘lift and shift’, while it may seem simple, rarely is. Yeah, let’s just take this 200 line SQL stored procedure with no documentation or testing and recreate it in Spark. What could possibly go wrong?

Faced with this choice, it is easy to think of the exercise as a simple replacement. We know what the old system does, so just build a new system that does exactly the same, but in a new and better technology that is easier to support changes. But we’ve seen this simple-sounding plan go down in flames most of the time. Replacing a serious IT system takes a long time, and the users can’t wait for new features. Replacements seem easy to specify, but often it’s hard to figure out the details of existing behaviour. What’s worse, much of that behaviour is stuff that isn’t really wanted, so building it is a waste.

Martin Fowler — Strangler Fig

There’s also the issue that development often needs to continue on the old system, so while you are building the new you also have to keep chasing shifting goalposts to make sure the new system matches the features of the old. Either that or you freeze development on the old system, which increases the time pressure on delivery of the new.

Finally, risk to the business. A ‘Big Bang’ is supposed to be, by its nature, one and done. However the normal pattern in my experience is more ‘one and fall over’, followed by panicked fixing and then many months of fallout.

This is often because the new system is created without much interaction and feedback from the end user, or much opportunity for testing under load. When it is switched on there is push back from users as they complain about changes, run into bugs and, in the worst case, experience outright failure as it crashes under load and business critical processes become inaccessible.

The key thing with any system, be it internal reports or client facing apps, is trust. If you lose trust then people go elsewhere and it is really hard to win it back.

Meet ‘The Strangler’

Ok, so what is the alternative? This is where the Strangler Pattern comes in. The name is a paraphrase of ‘Strangler Fig Application’, a term the software engineer and ThoughtWorks Chief Scientist Martin Fowler coined after observing the behaviour of strangler figs while on holiday in Australia.

Essentially it is the gradual replacement of an old system with a new one until the old has been completely superseded.

The benefits of this approach are manifold.

Firstly, you are not tackling the whole in one go but breaking it into manageable chunks that are easier to understand and have a more limited impact.

Secondly, you deliver a notable return to the business sooner by bringing noticeable change with each release, as opposed to having to wait until the end.

Thirdly, you are reducing the risk. By breaking the work into smaller chunks you can analyse, test and deploy small units of value (more on value later). If anything does go wrong the impact is smaller and more easily recoverable, and because you release each unit as it is changed you can much more easily test changes against production-level load.

Fourthly, this activity can be carried out in parallel with normal BAU; there is no need to dedicate all resource to it. It should also work on the principle of ‘pick up and put down’: each unit of change should be part of a workable whole, so if at some point you have to pause delivery you’ll still be able to make use of all the bits you have already uplifted, and pick it up again when you are ready.

And fifthly, feedback. By gradually releasing change you can garner responses from your users and use them to adjust trajectory as needed, while acclimatising users to change more gradually. It also opens up the possibility of soft launches to small user sets, or dark launches where you run things in the background, invisible to the user, to confirm production performance.

All of this allows you to shift to process thinking, building a continuous feedback cycle of improvement and adopting the ‘Forth Bridge’ mentality. It’s an ongoing cyclical process with milestones and goals, but the overarching aim is managing the process, not reaching a finish line.

Seams and how to use them

Michael Feathers coined the term “Seam” in his seminal book “Working Effectively with Legacy Code”, where he defined it as:

A seam is a place where you can alter behaviour in your program without editing in that place

Think of it as understanding boundaries, where they are present and where they are missing. In order to start to use the Strangler Pattern to cut away at your legacy code you need to know where to cut, what order you are going to cut in and how to do it safely. Seams show you where those edges are and allow you to contemplate those steps in different ways and at different levels in the software.

One of the things that often architecturally defines an Artifact based legacy system is a blurring or lack of clear boundaries. The key skill in taking something from a legacy artifact to an evolving process is being able to identify the areas that should be separate, separate them, and then isolate and replace those components.

It's about the value, stupid

Seams can be applied at multiple levels (code, application and enterprise) and in different contexts. You could be slicing up a nasty, tightly coupled class or a self referential set of data, splitting up a code monolith, or replacing part of a distributed system.

In all these instances seams will help you identify the areas you need to isolate and then gradually ‘strangle’, but where you apply them differs depending on your reason for making the change.

For me the number one rule is this:

Focus on the value

Me, every day.

A technical change only has meaning at the point it brings value to the business. This value could manifest in many different ways: it could be a reduction in cost or risk, it could be a new feature, or it could be more adaptability to change.

All of these are concrete pieces of value: technical solutions that can be expressed in non-technical terms and that help move your business forward.

Using these terms allows us to clearly express the value of a change to the business, but it also allows us to more easily define where to apply that change and to judge when it is complete.

Once you have that overarching value identified, then you need to identify the subsets of that value which combine to make that whole. Essentially the value itself has seams which you can exploit.

For instance, say we have a specific application that we want to sunset, moving to an entirely new platform. In this instance the value is only reached when that new system is operational, so the value seams would be best applied end to end, cutting horizontally through the whole system.

So identifying a key report, a single page in a web app or an API endpoint, and building out the system to support that, would probably be the first piece of value to tackle.

On the other hand, if the value is reducing risk by removing a redundant storage layer, then your seam is less end to end and more focussed on identifying what is using that storage and migrating each piece of code one at a time.

By focussing on the overall value and subsets of that value you get several useful things. 

Firstly, you break a problem down into smaller, more manageable chunks. Often the most difficult thing in software and data is the enormity of the problems you face; it is easy to become overwhelmed by seemingly insurmountable problems and the pressure for results. By looking at the problems from a value perspective you can break things down into smaller, much more approachable steps. It is freeing, because you don’t have to solve every problem in one go; you can just focus on each individual step and check your progress at each one.

Secondly, you have something that is measurable. This enables you to more easily judge whether your change has achieved its result: if you are switching out one system for another to reduce cost or drive new revenue, then that is something you can benchmark and judge success against.

Did this change really reduce our storage costs by 50% as expected? Or did we achieve a 25% increase in sales? These are questions you can pose and answer by focusing on value, then as you deliver each sliver you can monitor and adjust to new information as it arises.

Examples

To make this clearer, let’s look at some examples at different levels: code, application and enterprise.

At the Code level

The class below is a very simple example of ‘tight coupling’, where the connection logic is embedded directly into the class:

import pyodbc


class PseudoCoupledConnection:
    # The pyodbc dependency is baked directly into the class.
    def connect(self, conn_str):
        return pyodbc.connect(conn_str)

If this code is then repeated across multiple classes in slightly different ways then this is an example of the ‘duplicate code’ and ‘Combinatorial Explosion’ code smells.

Now imagine the scenario where you want to move your storage layer to reduce cost and increase your application’s flexibility, for instance moving to another database like Postgres. You’d have to find every instance of that connection logic, in every class where it appears, and alter it.

However, if we see that connection logic as a seam, a point where we can decouple and break it out, making it consistent, separate and more easily testable, then we can move the connection logic to its own class and inject it through the principle of Inversion of Control:

class PseudoDeCoupledConnection:
    # The connection is injected, so this class no longer depends on pyodbc.
    def __init__(self, connection):
        self.connection = connection

    def connect(self):
        return self.connection.connect()

Encapsulating the connection logic and removing the direct dependency on the pyodbc library makes it much easier to mock the connection and test the behaviour of the class. We can then replace every instance of the old connection logic with the new injected class, one connection at a time, with no breaking change. The neat thing is that we don’t have to do this all in one go; both old and new code can exist in tandem for a time as the migration happens.
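To illustrate, here is a minimal sketch (assuming pytest-style tests and the standard library’s unittest.mock; the test name and fake values are placeholders) of how the injected connection lets us test the class without a real database:

from unittest.mock import MagicMock


def test_connect_uses_injected_connection():
    # No database or pyodbc required: the seam lets us swap in a fake.
    fake_connection = MagicMock()
    fake_connection.connect.return_value = "fake-db-handle"

    wrapper = PseudoDeCoupledConnection(fake_connection)

    assert wrapper.connect() == "fake-db-handle"
    fake_connection.connect.assert_called_once()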

Once we have migrated all of the connections and are satisfied that everything is working as before, we can add a new class with the same interface that handles the new connection, and then switch the dependencies over to it with confidence.
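As a rough sketch of that final step (the class names, connection strings and the choice of psycopg2 for Postgres are illustrative assumptions, not a prescribed design), the old and new connections might look like this:

import pyodbc
import psycopg2  # assuming Postgres is the new target


class PyodbcConnection:
    # Wraps the legacy connection logic behind a simple connect() interface.
    def __init__(self, conn_str):
        self.conn_str = conn_str

    def connect(self):
        return pyodbc.connect(self.conn_str)


class PostgresConnection:
    # Same interface, different storage; callers never need to know.
    def __init__(self, conn_str):
        self.conn_str = conn_str

    def connect(self):
        return psycopg2.connect(self.conn_str)


# Old and new can coexist during the migration; switching a dependency is
# just a change to what gets injected.
legacy = PseudoDeCoupledConnection(PyodbcConnection("DSN=legacy_dw"))
modern = PseudoDeCoupledConnection(PostgresConnection("dbname=app user=app"))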

At the Application level — The Danger of Invisible dependencies

To give a more data specific example, a common issue I have seen when people are just focussed on data is what I term the ‘invisible dependency’ problem. 

This is where a single storage has multiple independent ‘owners’, often with competing priorities and needs. A simple example can look something like this:

This problem can occur for multiple reasons. It can be a short term focus on getting the data you need as quickly as possible, it could be grabbing everything ‘just in case’ you need it, or it could be simply a perception of convenience: this thing is already there and it’s got pretty much what I need, so why go to the effort and expense of spinning up another version?

Unfortunately all of these approaches ignore one question: what happens when something changes?

In the above example all 3 systems are reading and/or writing to the same database. This means that changes to the schema or data by any one system could have adverse side effects on the others, with no easy way to monitor or flag the impact of a change, and a massive increase in complexity and risk.

Once this door is open the problem tends to balloon. Once someone can see multiple systems connecting to one source, that effectively gives permission to keep going, and suddenly there are 4 connections, then 7, and so on. Before long your entire system is deadlocked and every time someone needs to make a change they have to navigate a maze of interconnections.

The other issue with shared databases is that the schema ends up confused, with a mishmash of columns belonging to one system or the other, or worse, the same column having slightly different meanings for each system.

The end result is no clear ownership and everyone afraid to make a change because they have no idea what will break in the process.

So if we look at this from a value perspective again, a potential reason for change is that you want greater resilience in your system, a reduction in the risk and cost of change, and the ability to make that change at a faster rate.

If we look at the diagram again we can clearly see areas where we can begin to separate:

So by logically separating the applications and schemas we can end up somewhere like this:

So we separate out the database and we decouple reporting by interacting with some form of interface on the applications. This could be as simple as a single stored procedure on the database, but it is better to have something like an API call or message queue, as this gives obvious ownership to each application and key points of separation.
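As a rough sketch of what that interface could look like (assuming FastAPI, with a hypothetical endpoint and placeholder figures), reporting calls the application rather than reaching into its database:

from fastapi import FastAPI

app = FastAPI()


@app.get("/reporting/orders/summary")
def order_summary(day: str):
    # Internally this can query whatever storage the application owns today;
    # reporting depends only on the shape of this response, not on the schema.
    return {"day": day, "orders": 1423, "revenue": 85210.50}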

Again, the beauty is that this does not need to happen in one go; you could separate out application 1 but leave application 2 using the old database until you were finished.

That’s a key point about this pattern at every level: you should have choices and steps you can take that allow you to move one part while leaving the old part unaffected. The strangler pattern is ultimately about choice, being able to tackle a large problem in different ways and to ‘chunk’ it according to your need.

At the Enterprise Level — evolving your architecture

This approach gets even more interesting at the enterprise level. Looking at another data related example, this is a simplified view of a typical data platform:

This is obviously grossly simplified, but at its most basic a data pipeline will have a step to download data from a source, some form of processing like a Medallion Architecture or ETL, and then a semantic layer and reporting to allow business users to get insights from the data.

In reality this would be much, much more complicated involving multiple sources and interlaced dependencies, however for our example it is sufficient to show the concept.
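As a minimal sketch of that simplified platform (the function names and the list-of-dicts types are illustrative assumptions), each stage sits behind its own boundary, and every hand-off between stages is a candidate seam:

def ingest(source_name: str) -> list[dict]:
    # Download raw data from a source system (placeholder rows here).
    return [{"source": source_name, "value": 1}]


def process(raw_rows: list[dict]) -> list[dict]:
    # Cleanse and model the data (medallion layers, ETL and so on).
    return raw_rows


def serve(modelled_rows: list[dict]) -> dict:
    # Build the semantic layer / report-ready aggregates.
    return {"row_count": len(modelled_rows)}


# As long as what crosses each boundary keeps the same shape, the
# implementation on either side can be replaced independently.
report_inputs = serve(process(ingest("crm")))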

Where seams are applied in this instance depends on, you guessed it, what value you are trying to reach.

For instance, you might be facing a requirement to change an aspect of this platform based on a technology need: an old technology is being decommissioned, or you want to modernise your processing and storage (migration from a Data Warehouse to a Lakehouse being a classic example). Then you might position your seams like this (not all need to be applied, it could just be one or two):

In this instance there is an obvious hand off at each point where we could make a change without affecting the downstream dependencies, as long as the interface is clear and the information presented by the new system is the same as the old one.

This allows us to have two systems running in parallel again; for instance, if you were shifting your ingestion you could move sources over one at a time while still using the old system as you migrate.

This approach allows you to make changes behind the scenes without your end users being adversely affected, and because the end result stays the same you can migrate a pipeline at a time, with both systems running in parallel, slowly moving from one to the other with no one any the wiser.
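A minimal sketch of that parallel run (the source names and ingestion functions are hypothetical stand-ins) is a simple routing table that decides, per source, whether the old or the new platform handles it:

def ingest_with_legacy_platform(source_name: str) -> None:
    print(f"legacy ingestion for {source_name}")  # stand-in for the old pipeline


def ingest_with_new_platform(source_name: str) -> None:
    print(f"new ingestion for {source_name}")  # stand-in for the new pipeline


# The routing table is the seam: moving a source across is a one-line change.
MIGRATED_SOURCES = {"crm", "billing"}


def run_ingestion(source_name: str) -> None:
    if source_name in MIGRATED_SOURCES:
        ingest_with_new_platform(source_name)
    else:
        ingest_with_legacy_platform(source_name)


for source in ("crm", "billing", "erp", "web_analytics"):
    run_ingestion(source)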

However, you do need to be aware of some of the risks of this approach. If we go back to the potential issues around a ‘lift and shift’ that we discussed earlier, there is a danger that in creating the new system you simply recreate the old problems in a new technology.

This is most often seen with data modelling. If you have an old Data Warehouse that you want to upgrade to a more flexible Lakehouse, and you are forced to replicate the old models so that your analytics layer remains unaffected, it might not be as easy to replicate the behaviour of 20 years of horrible SQL as you think (if anyone even remembers what that behaviour was meant to be).

You do have the option of tidying your models as you go, improving the code that delivers them, but there are limits to how far you can take that because you want to limit the disruption to end users.

There are ways to mitigate that risk. For instance, you could build newer models and then have a temporary ‘translation layer’ (like an Adapter Pattern in software development) which acts as a bridge between the old and new while the migration happens (though you do need to make sure that translation is tested). Then, when your downstream migration is complete, you can focus on the analytics side and gradually uplift your reporting to the new model.
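As a sketch of what such a translation layer might look like (assuming PySpark, with hypothetical table and column names), the new lakehouse model is re-presented in the shape the old warehouse reports expect, so they keep working until they are uplifted:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# New lakehouse model (hypothetical name).
new_model = spark.table("lakehouse.dim_customer_v2")

# Re-present it under the shape the old warehouse reports expect.
legacy_shape = (
    new_model
    .withColumnRenamed("customer_key", "CUST_ID")
    .withColumn("FULL_NAME", F.concat_ws(" ", "first_name", "last_name"))
    .select("CUST_ID", "FULL_NAME", "country_code")
)

# Registering it under the old name keeps downstream reporting untouched
# until it is migrated to the new model.
legacy_shape.createOrReplaceTempView("DIM_CUSTOMER")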

There is another way of applying seams that splits along the horizontal as opposed to the vertical, like this:

In this instance you are slicing the platform a thin sliver at a time, creating a new platform and then moving source by source, or report by report, across to the new one.

This has several benefits: you don’t have to address the technical debt of the old system and you are unencumbered by having to make existing behaviours work in the new. By splitting off one section at a time you can analyse each one, assess the business value and the approach, and really understand and inform your business intelligence.

You can also slowly bring your user base onto the new platform, quickly getting feedback and testing each section a slice at a time. This can be empowering for users, as they can see the new system developing, appreciate how each new piece enables them to do more than they could before, and feel as though they have a say in its development.

The downside is that you will have to run both systems in parallel for a time and your users will have to move between them, which can be hard, especially if things are delayed or prove more difficult than expected. Users can also be resistant to change, even if the status quo is suboptimal, so communication is key.

Conclusion

In summary the key points to focus on with this approach are:

  1. Identify the overall value you want to achieve.
  2. Identify the smallest subset of that value you can break a problem down into.
  3. Identify your seams, the areas where you can surgically split to deliver the identified value.
  4. Slowly remove or replace parts along those value seams ‘strangling’ the old until the overall value is achieved.
  5. Review each change in regard to the overall goal, adjust either value or approach if required.
  6. Rinse and repeat.

In taking this approach and applying it consistently and continuously you can set up a cycle of change that is always in progress, a constantly evolving set of systems always in motion.

Also, if you practise it consistently, this becomes a key way to approach all problems: instead of being overwhelmed by a seemingly insurmountable task you immediately start trying to separate it out and identify the areas you can change. I’ve found it a really useful tool for breaking any problem down.

Whether you're facing a complex legacy system or looking to evolve your architecture, our team can help. Get in touch to start a conversation about how we can support your data engineering journey.