Branch By Abstraction - UML sequence diagrams

Branch By Abstraction?


Yes, by abstraction instead of by branching in source control. And no, that doesn't mean sprinkling conditionals through your source code; it means using an abstraction concept that's idiomatic for the programming language you are using.

Context: Some types of non-functional change take so long to code that someone suggests doing the work on a branch and merging back later. These are the changes that would impede the concurrent development of normal functional deliverables. Where a non-functional change touches many parts of the codebase, there is a risk of repeatedly breaking the build, and where there is a risk of breaking the build, there is a "do it on a branch" suggestion.

Another factor driving that suggestion is a mismatch between the release cadence and the time needed to complete the work. In other words, the change can take many weeks or months, and there will be releases to production before it is complete.

The branch idea would be OK if the resulting merge back could be instantaneous and guaranteed to be pain-free. That is impossible, though, for any high-throughput development team that is also refactoring regularly as it goes. Note too that executives often express their displeasure with non-functional work taking longer than promised by canceling it.

Branch by abstraction is a methodical procedure which has some clear steps:

Introduce an abstraction to methodically chomp away at that time-consuming non-functional change
Start with the single most depended-on and least dependent component
To not jeopardize anyone else's throughput, work in a place in the codebase that is separate from the existing code
Methodically complete the work, temporarily duplicating tens or hundreds of components
Go live, part complete, in a 'dark deployment' mode as many times as needed
Scale up your CI infrastructure to guard old and new implementations
When ready, switch over in production (a toggle flip to end 'dark deployment' mode)
Lastly, delete old implementations and the abstraction itself (essential follow up work)

That last step is the characterizing difference between the methodical branch by abstraction procedure and the ordinary, lasting abstractions in your codebase. The removal of the abstraction (and the old implementation) is the equivalent of the merge back from the branch that you've now avoided.

Here are the steps of the methodical procedure illustrated using UML sequence diagrams


1. Identify the change that will take a while

The classic case discussed for branch by abstraction is the changing of one persistence technology for another. In the original article for this, the example was swapping out Java's 'Hibernate' technology for 'iBatis' (now MyBatis). That was hypothetical for the blog entry. In real life, a few years later, ThoughtWorks' Go team replaced iBatis with Hibernate (life does indeed imitate art, as Oscar Wilde had it).

For the telling of the branch by abstraction story here, let's revisit the time-consuming non-functional chore of swapping persistence technologies, from the 'old' undesired choice to a new one. For the hypothetical app, there are hundreds of components that use the same lib/framework and follow its applicable patterns. Only the business names for those persistable things would last. Let's start with "ShoppingCart" because it is tangible to most readers (but remember that there are hundreds of these to slog through).

2. Introduce an abstraction

Somewhere primordial, you'd introduce a new component: a single place that allows callers to remain ignorant of whether they are using the old or the new ShoppingCart persistence technology. The choice of old or new would be determined only at boot time. Some development teams might make that a compile-time choice instead. Those teams would not be relying on preprocessor directives like #ifdef in C and C++; they would rely on something closer to the build grammar. These are build-time and boot-time toggles, by another name.

It is essential to choose an abstraction mechanism that is idiomatically correct for the language in question. For example, Java's interfaces (and implementations of those) are a smarter choice than making an abstract base class for 'old' and 'new' to extend. C# would be the same. Ruby and Python have mixins, method_missing, and duck-typing to consider. If using Rust, the dev team may consider 'traits'.
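As a minimal sketch of the interface approach in Java: the names ShoppingCartStore, OldShoppingCartStore, NewShoppingCartStore and CheckoutService are hypothetical, and the in-memory bodies merely stand in for real calls into the old and new persistence technologies. The point it tries to show is that the caller compiles against the interface only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction: callers depend on this interface only.
interface ShoppingCartStore {
    void addItem(String cartId, String sku);
    List<String> itemsIn(String cartId);
}

// Stand-in for the implementation backed by the OLD persistence technology.
class OldShoppingCartStore implements ShoppingCartStore {
    private final Map<String, List<String>> itemsByCart = new HashMap<>();

    @Override
    public void addItem(String cartId, String sku) {
        itemsByCart.computeIfAbsent(cartId, id -> new ArrayList<>()).add(sku);
    }

    @Override
    public List<String> itemsIn(String cartId) {
        return new ArrayList<>(itemsByCart.getOrDefault(cartId, Collections.emptyList()));
    }
}

// Stand-in for the implementation backed by the NEW persistence technology.
// It keeps flat (cartId, sku) rows to make the point that internals may differ.
class NewShoppingCartStore implements ShoppingCartStore {
    private final List<String[]> rows = new ArrayList<>();

    @Override
    public void addItem(String cartId, String sku) {
        rows.add(new String[] {cartId, sku});
    }

    @Override
    public List<String> itemsIn(String cartId) {
        List<String> skus = new ArrayList<>();
        for (String[] row : rows) {
            if (row[0].equals(cartId)) {
                skus.add(row[1]);
            }
        }
        return skus;
    }
}

// A caller that stays ignorant of which implementation it was handed.
class CheckoutService {
    private final ShoppingCartStore carts;

    CheckoutService(ShoppingCartStore carts) {
        this.carts = carts;
    }

    int itemCount(String cartId) {
        return carts.itemsIn(cartId).size();
    }
}
```

CheckoutService works unchanged with either store, which is what lets the old implementation be deleted later without touching callers.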

What is unlikely to be needed, though, is a run-time toggle. Run-time toggles/flags can be flipped from old to new (or back again) while the software is in use. They are not impossible to code, just rarely required here.
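For completeness, a run-time toggle could look something like the following sketch. RuntimeToggle is a hypothetical helper, not something from the original article, and it deliberately ignores harder questions such as in-flight work or keeping the data behind both implementations consistent during a flip.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical run-time toggle: chooses an implementation on every call to get().
class RuntimeToggle<T> implements Supplier<T> {
    private final AtomicBoolean useNew = new AtomicBoolean(false); // old is the default
    private final T oldImpl;
    private final T newImpl;

    RuntimeToggle(T oldImpl, T newImpl) {
        this.oldImpl = oldImpl;
        this.newImpl = newImpl;
    }

    void switchToNew() {
        useNew.set(true);
    }

    void switchToOld() {
        useNew.set(false);
    }

    @Override
    public T get() {
        return useNew.get() ? newImpl : oldImpl;
    }
}

class RuntimeToggleDemo {
    public static void main(String[] args) {
        // Strings stand in for the old and new persistence factories.
        RuntimeToggle<String> toggle = new RuntimeToggle<>("old factory", "new factory");
        System.out.println(toggle.get());
        toggle.switchToNew();
        System.out.println(toggle.get());
    }
}
```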

2b. Sensible renames

We should generally feel free to tactically rename/re-package (or namespace) as we go. Hopefully, our refactoring IDE (like IntelliJ IDEA) has our back.

3. Only the developers working on the migration are exposed to the new technology, in a crucial primordial section of code.

This is because there is a single toggle/flag flipped for those developers. And that should be a configuration change that is not checked in.

That could literally be code:

if (useNewPersistenceTech) {
    persistenceFactory = new NewPersistenceFactory();
} else {
    persistenceFactory = new OldPersistenceFactory();
}

But it is more likely to be outside code in a YAML, XML, JSON or properties file (local or remote), like this (or better):

# persistenceFactoryClassName=com.yourcompany.yourapp.OldPersistenceFactory
persistenceFactoryClassName=com.yourcompany.yourapp.NewPersistenceFactory

And code that used that at boot time:

pfClassName = props.getProperty("persistenceFactoryClassName");
persistenceFactory = instantiateFrom(pfClassName);
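The instantiateFrom call above is not defined in the text. One plausible way to write it, assuming the factories share a (hypothetical) PersistenceFactory interface and have no-argument constructors, is plain reflection:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical common type for both factories.
interface PersistenceFactory {
    String name();
}

class OldPersistenceFactory implements PersistenceFactory {
    @Override
    public String name() {
        return "old";
    }
}

class NewPersistenceFactory implements PersistenceFactory {
    @Override
    public String name() {
        return "new";
    }
}

class BootTimeToggle {
    // Assumed helper: loads the named class and calls its no-argument constructor.
    static PersistenceFactory instantiateFrom(String className) {
        try {
            Class<?> type = Class.forName(className);
            return (PersistenceFactory) type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            // Fail fast at boot rather than limping along with no factory.
            throw new IllegalStateException("Cannot create persistence factory: " + className, e);
        }
    }

    public static void main(String[] args) throws IOException {
        // In a real app the properties would come from a file; a string keeps the sketch self-contained.
        Properties props = new Properties();
        props.load(new StringReader(
                "persistenceFactoryClassName=" + NewPersistenceFactory.class.getName() + "\n"));

        String pfClassName = props.getProperty("persistenceFactoryClassName");
        PersistenceFactory persistenceFactory = instantiateFrom(pfClassName);
        System.out.println("Booted with: " + persistenceFactory.name());
    }
}
```

Failing fast on a bad class name is a design choice of this sketch; a team might prefer to fall back to the old factory instead.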

4. Everyone else sees/uses the old tech choice

They need to develop functional deliverables without encountering anything that is transitional, so they continue with the old/legacy implementation. They don't have to do anything to set that up, as it is a default with mere clone/checkout and build from root.

And indeed "everyone else" also means QA/UAT/live deployments too.
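One way to get that "old by default" behavior is to layer an optional, developer-local override on top of the checked-in configuration. In this sketch, ToggleDefaults and the idea of two separate Properties objects are assumptions for illustration; only the property key and class names come from the example above.

```java
import java.util.Properties;

class ToggleDefaults {
    static final String KEY = "persistenceFactoryClassName";
    static final String OLD_FACTORY = "com.yourcompany.yourapp.OldPersistenceFactory";

    // checkedIn: configuration from source control.
    // localOverrides: a developer-local file that is never checked in (may be empty).
    static String factoryClassName(Properties checkedIn, Properties localOverrides) {
        // With nothing configured anywhere, everyone gets the old implementation.
        String fallback = checkedIn.getProperty(KEY, OLD_FACTORY);
        return localOverrides.getProperty(KEY, fallback);
    }

    public static void main(String[] args) {
        Properties checkedIn = new Properties();
        Properties local = new Properties();
        System.out.println("Everyone else: " + factoryClassName(checkedIn, local));

        // Only the migration developers set this, in their uncommitted local file.
        local.setProperty(KEY, "com.yourcompany.yourapp.NewPersistenceFactory");
        System.out.println("Migration devs: " + factoryClassName(checkedIn, local));
    }
}
```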

5. Repeat tens or hundreds of times until there are two of everything

The point was that the "change that was going to take a while" was going to span a few releases and happen concurrently with the development of other business functionality. Getting the first example done using branch by abstraction was a priority. That included the writing of unit and integration tests for it. Best of all would be forking the unit/integration tests from the old implementation, but I'm never surprised to find that there were next to none to clone/fork. 😞
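Where tests do exist, one option is to express them once against the abstraction and run them against both implementations. The sketch below avoids any test framework to stay self-contained; CartStore, its two in-memory implementations and CartStoreContract are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical abstraction shared by the old and new implementations.
interface CartStore {
    void add(String cartId, String sku);
    List<String> items(String cartId);
}

// In-memory stand-in for the old technology.
class OldCartStore implements CartStore {
    private final Map<String, List<String>> carts = new HashMap<>();

    @Override
    public void add(String cartId, String sku) {
        carts.computeIfAbsent(cartId, id -> new ArrayList<>()).add(sku);
    }

    @Override
    public List<String> items(String cartId) {
        return new ArrayList<>(carts.getOrDefault(cartId, new ArrayList<>()));
    }
}

// In-memory stand-in for the new technology, with different internals.
class NewCartStore implements CartStore {
    private final List<String[]> rows = new ArrayList<>();

    @Override
    public void add(String cartId, String sku) {
        rows.add(new String[] {cartId, sku});
    }

    @Override
    public List<String> items(String cartId) {
        List<String> skus = new ArrayList<>();
        for (String[] row : rows) {
            if (row[0].equals(cartId)) {
                skus.add(row[1]);
            }
        }
        return skus;
    }
}

class CartStoreContract {
    // Runs the same expectations against a fresh store; an empty result means it passed.
    static List<String> check(String label, Supplier<CartStore> freshStore) {
        List<String> failures = new ArrayList<>();
        CartStore store = freshStore.get();
        if (!store.items("c1").isEmpty()) {
            failures.add(label + ": an unknown cart should be empty");
        }
        store.add("c1", "sku-1");
        store.add("c2", "sku-9");
        store.add("c1", "sku-2");
        if (!store.items("c1").equals(Arrays.asList("sku-1", "sku-2"))) {
            failures.add(label + ": items should come back in insertion order");
        }
        if (!store.items("c2").equals(Arrays.asList("sku-9"))) {
            failures.add(label + ": carts should be isolated from each other");
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> failures = new ArrayList<>();
        failures.addAll(check("old", OldCartStore::new));
        failures.addAll(check("new", NewCartStore::new));
        if (failures.isEmpty()) {
            System.out.println("Both implementations honor the contract");
        } else {
            failures.forEach(System.out::println);
        }
    }
}
```

Running the same checks against old and new on every commit is roughly what guarding both implementations in CI amounts to.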

It is also worth reminding readers here that this is a methodical process. As the developer of this, you should complete one 'old to new' piece and commit/push. You should code everything with the expectation that it will go live but toggled off in production (or compiled out). You can expect that other developers working on functional deliverables will concurrently create new classes in the old technology, too. You may insist that they write tests if they are adding to your backlog, but otherwise just relax. As long as they are not even close to adding to your backlog faster than you're migrating components to the new system, you're going to be OK. Angular-creator Miško Hevery has a story about that (TODO).

As you're coding/committing/pushing small parts of this work towards completion, you are going to try to bring up the app itself and click through it in an exploratory way. You'll also want to see that the UI functional tests (Selenium, etc.) are still passing. If you are far from completing the migration, these experiments can help you work out which class/component to work on next. The exercise can also help with calculating the percentage complete for the whole piece of work, which non-developers will be pushing you for. Make sure you've already got stakeholder agreement that it is safe to change estimates as you go, maybe even more than once. Reminding managers that this methodical process can be paused and resumed helps too.

6. Remove the abstraction

You will go properly live with a toggle flip, after testing and a signoff in UAT. You'll have practiced flipping that toggle back to "old" in UAT after an initial deployment to "new". You will confirm that both work so that you can assure execs that you can do the same in production if needed. Actually, you will have practiced this a few times (practice makes perfect).

After being live for a couple of weeks, you will remove the old implementation too, and do renames if they make sense.

Maybe immediately after that, but it could also be after another couple of weeks, you'll remove the abstraction itself too (commit/push/go-live), ending up with callers using the new implementation directly.

As previously mentioned, this last step is the characterizing difference from any other use of abstractions in the codebase. Compare it with the terrible decision to do the lengthy non-functional change on a branch and merge back to trunk/master when complete. Those branches are often late and risk being canceled part complete, or they are super stressful at the time of merge.