Abstract Branch by Paul Hammant

Branch By Abstraction?

Yes, by abstraction instead of by branching in source control. And no, that doesn't mean sprinkling conditionals into your source code. It means using an abstraction concept that's idiomatic for the programming language you are using.

Context: Some types of non-functional change take so long to code that someone suggests doing the work on a branch and merging back later. These will be the types of changes that are going to impede the concurrent development of normal functional deliverables. Where the non-functional changes impact many parts of the codebase, there is a risk of repeatedly breaking the build. Where there is a risk of breaking the build, there is a "do it on a branch" suggestion.

Another factor driving that suggestion is a mismatch between the release cadence and the time needed to complete the work. In other words, the change can take many weeks or months, and there will be releases to production before the work in question is complete.

The new branch idea would be OK if the resulting merge back could be instantaneous and guaranteed to be pain-free. That is impossible, though, for any high-throughput development team that is also doing regular refactorings as they go. Note too that executives often express their displeasure with non-functional work taking longer than promised by canceling it.

Branch by abstraction is a methodical procedure which has some steps:

  1. Introduce an abstraction to methodically chomp away at that time-consuming non-functional change
  2. Start with the single most depended-on and least depending component
  3. To avoid jeopardizing anyone else's throughput, work in a place in the codebase that is separate from the existing code
  4. Methodically complete the work, temporarily duplicating tens or hundreds of components
  5. Go live in 'dark deployment' mode, part complete, as many times as needed
  6. Scale up your CI infrastructure to guard old and new implementations
  7. When ready, switch over in production (a toggle flip to end 'dark deployment' mode)
  8. Lastly, delete old implementations and the abstraction itself (essential follow-up work)

That last step is the characterizing difference between the methodical branch by abstraction procedure and the ordinary and lasting abstractions in your codebase. The removal of the abstraction (and old implementation) is the equivalent of the merge back from the branch that you've now avoided.

Here are the steps of the methodical procedure, illustrated using a series of changes in a Java example project.

0. Starting point: the contrived app used on this page

The classic case discussed for branch by abstraction is the changing of one persistence technology for another. In the original article for this, it was swapping out Java's Hibernate technology for iBatis (now MyBatis). That was hypothetical for the blog entry. In real life, a few years later ThoughtWorks' Go team replaced iBatis with Hibernate (life indeed does imitate art, Oscar Wilde).

For the telling of the branch by abstraction story here, we will focus on a contrived but simple web app. One that presents a single JSON endpoint for a pseudo-random 'hair color' over HTTP.

I ran a code generator to make a Jooby web framework application, the core of which was a neat four-line web application in a single Java class. The generated test class was also fairly simple, though unconventional in that it contains a unit test method and a service test method in one class. The first commit in this GitHub repo tells the whole story.

Jooby is a fantastic Java 8 web framework more than ready for productive development and production deployments.

That initial four-line generated web service then morphed into one with a JSON document response that passes one of four hair colors back to the caller. Here's what that looks like in the browser:

Changing that generated web-app to be a hair color endpoint (JSON via HTTP/GET):


    

^ the highlighted lines are the ones that changed

And unit + service tests for that:


    

^ the highlighted lines are the ones that changed

See also the diff between Jooby's generated "Hello World" app and hair color JSON endpoint that replaced it.

1. Introduce an abstraction

It is our hair color implementation that we're going to pretend will take ages to migrate from old to new. Specifically, we will pretend that it would be the mother of all commits if it were done to completion in one go, or that it would take many hundreds of commits to migrate all components to a new technology. In our contrived case, it is just one component - 'hair color' - that is RESTfully accessed over HTTP.

Somewhere primordial, you'd introduce a new component: a single place that allows callers to remain ignorant of whether they are using the old or the new 'hair color' technology. The choice of old or new hair color technology would be determined only at boot time. Some development teams might make that a compile-time fixed choice instead. Those teams would not be relying on preprocessor directives like #ifdef in C and C++. Instead, they would rely on something closer to the build grammar, like 'Maven Profiles' (Maven being Java's standard build technology). Build-time and boot-time toggles, by another name.

It is essential to choose an abstraction mechanism that is idiomatically correct for the language in question. For example, Java's interfaces (and implementations of those) are a smarter choice than making an abstract base class for 'old' and 'new' to extend. C# would be the same. Ruby and Python have mixins, method_missing, and duck-typing to consider. If using Rust, the dev team may consider 'traits'.
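A minimal sketch of what an interface-based abstraction could look like in Java. The names `HairColorSource`, `LegacyHairColorSource` and `HairColorEndpoint`, and the four color strings, are assumptions made for illustration and are not taken from the example project.

```java
import java.util.Random;

// Hypothetical abstraction: callers depend on this interface only.
interface HairColorSource {
    String nextColor();
}

// The existing behavior, moved behind the interface unchanged.
class LegacyHairColorSource implements HairColorSource {
    // Assumed values: the article only says there are four hair colors.
    private static final String[] COLORS = {"black", "brown", "blonde", "red"};
    private final Random random;

    LegacyHairColorSource(Random random) {
        this.random = random;
    }

    @Override
    public String nextColor() {
        return COLORS[random.nextInt(COLORS.length)];
    }
}

// A caller that neither knows nor cares which implementation it was given.
class HairColorEndpoint {
    private final HairColorSource source;

    HairColorEndpoint(HairColorSource source) {
        this.source = source;
    }

    String handleGet() {
        return "{\"color\":\"" + source.nextColor() + "\"}";
    }
}
```

Because the caller holds only the interface, a second implementation can later be added alongside the legacy one without touching `HairColorEndpoint`.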

Three prod sources (two new) in src/main/java/com/mycompany/


    

^ the highlighted lines are the ones that changed



    

^ new, our hair color abstraction in a general purpose interface



    

^ new, the first (and instantly legacy) implementation of that general purpose interface

Some configuration for the new boot-time toggle choice outside of Java:


    

Boot-time toggles are different to run-time ones. The latter are toggles/flags that can be flipped from old to new (or back again) while the software is in use. A run-time toggle is unlikely ever to be needed for a branch by abstraction implementation, though it is not impossible to code.
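As a rough sketch of the boot-time idea, the toggle can be read once from configuration and then never consulted again. The class name `BootToggles` and the property key `hairColor.useNew` are hypothetical, and `Supplier<String>` stands in for whatever abstraction the codebase really uses.

```java
import java.util.Properties;
import java.util.function.Supplier;

// Hypothetical boot-time toggle holder: reads its setting once, at construction.
class BootToggles {
    private final boolean useNewHairColor;

    BootToggles(Properties config) {
        // Assumed property name. A missing property means the old (established) implementation.
        this.useNewHairColor =
                Boolean.parseBoolean(config.getProperty("hairColor.useNew", "false"));
    }

    // Picks one implementation; callers only ever see the Supplier they are handed.
    Supplier<String> hairColor(Supplier<String> oldImpl, Supplier<String> newImpl) {
        return useNewHairColor ? newImpl : oldImpl;
    }
}
```

At boot, the application would load the properties once and keep the chosen `Supplier` for its lifetime. Defaulting to the old implementation means a plain clone/checkout behaves as it always did.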

The test only differs from the previous version in the way that the App is instantiated for the unit test:


    

^ the highlighted line is the only one that changed

See the diff for that commit on GitHub.

Remember that the 'hair color' case is contrived only so that the example is small enough to be listed in a web page.

In real life, you could have hundreds of components, but you would introduce only enough abstraction to cover the first one. Choose the component that best fits "most depended on and least depending" to work on first.

2. Make the new implementation of the abstracted component

One new prod Java source, and all others unchanged:


    
A second configuration file to use at boot-time to set toggle choice for the new hair color implementation:


    
The unit test method cloned for the new hair color implementation:


    

^ besides the cloned method, there's a tactical rename of the old unit test too. The duplicated code here is not the sin you would otherwise think it is.

See the whole commit's diff on GitHub.

Only the developers working on the migration use the new technology. Everyone else uses the old (established) implementation of the abstraction. The old implementation is the default, of course, so the developers working on the new technology should override the boot cycle for the app/service locally for themselves. They must take care not to check that override in accidentally.

The bulk of the developers, assigned to regular functional deliverables, need to develop without encountering anything that is transitional. They continue with the old/legacy implementation and should not have to do anything to set that up. As mentioned before, that should be the default after a mere clone/checkout (and build) from the root of the project.

Note: "everyone else" also means QA/UAT deployments and testing too. That toggle doesn't get flipped to 'new' for everyone until the work is complete and test automation says it is ready. Once it is ready, the plan should also change to push it live sooner rather than later.

2b. New Continuous Integration pipelines

All great development teams have a Continuous Integration server setup to guard the quality of the code in the trunk (master branch for Git and Mercurial). To accommodate branch by abstraction efforts, a simple modification of the CI setup is needed: have one whole CI pipeline for each anticipated permutation of toggles going forward. Add one more pipeline that represents the previous release, but keep that one only for as long as a rollback is a possibility, and only for the part of the release pertinent to the toggle that was activated in production during (or after) a release. We are racing ahead, though.

A more sophisticated solution is to have one CI pipeline process for all the common steps: compile of prod source, compile of tests, execution of pure unit tests. Then that same CI job fans out to many parallel jobs where service tests are going to test larger parts of the web service with toggles explicitly set on (or off). Larger means multi-process and definite socket use (HTTP in the case of our example). Again, only permutations of pipeline that are meaningful for forthcoming releases (and that prior release that is still warm) should be set up in the CI infrastructure.

As well as being more sophisticated, that design is less 'brute force' too. Additionally, though our application is a web service, it could as easily have had a thin UI too, and that would need CI-job-initiated execution of Selenium-WebDriver tests following the service tests. Each team would work out the economics of 'brute force' versus the 'parallelized after a certain point' alternative. One thing is clear though: a build failing during CI of any of the pipelines means a failure for all, and a rollback of the commit.
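To get a feel for how quickly "one pipeline per permutation of toggles" grows, here is a small sketch that enumerates every on/off combination for a list of toggle names. The class `TogglePermutations` is hypothetical; a real team would prune the result down to the permutations that are meaningful for forthcoming releases.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TogglePermutations {
    // Returns every on/off combination of the given toggle names (2^n entries).
    static List<Map<String, Boolean>> all(List<String> toggleNames) {
        List<Map<String, Boolean>> result = new ArrayList<>();
        int n = toggleNames.size();
        for (int mask = 0; mask < (1 << n); mask++) {
            Map<String, Boolean> combo = new LinkedHashMap<>();
            for (int i = 0; i < n; i++) {
                // Bit i of the mask decides whether toggle i is on in this combination.
                combo.put(toggleNames.get(i), (mask & (1 << i)) != 0);
            }
            result.add(combo);
        }
        return result;
    }
}
```

One toggle yields two pipelines and three toggles yield eight, which is why the fan-out design and the pruning of stale permutations matter.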

2c. Repeat tens or hundreds of times until there are two of each component

The point was that the "change that was going to take a while" was going to span a few releases and happen concurrently with the development of other business functionality. Getting the first example done using branch by abstraction was a prerequisite. That included the writing of unit and service tests for it. Best of all would be forking the unit/service tests from the old implementation, but I'm never surprised to find that there were next to none.

It is also worth reminding readers here that this is a methodical process. As the developer of this, you are going to complete one 'old to new' piece and commit/push. You should code everything with the expectation that it will go live but toggled off in production (or compiled out).

You can expect that other developers working on functional deliverables will concurrently create new classes in the old technology, too. You may insist that they write tests if they are adding to your backlog, but otherwise relax about that. As long as they are adding to your backlog far more slowly than you are migrating components to the new system, you are going to be OK.

As you are coding/committing/pushing small parts of this work towards completion, at some point you're going to try to bring up the app itself and click through it in an exploratory way. You will also be wanting to see that the UI functional tests (Selenium, etc) are still passing too. If far from completion of the migration, then these experiments can help you to work out which class/component to work on next. The exercise can also help with the calculation of percentage complete for the whole piece of work. Executives will be pushing you for that. Make sure you've already got stakeholder agreement that it is safe to change estimates as you go. Maybe even more than once. Reminding managers that this methodical procedure can be paused and resumed helps too.

3. Go Live

To go live properly the toggle needs to be flipped in production, but only after a period of testing in UAT followed by a signoff. The team should have practiced flipping that toggle back to "old" in UAT after an initial deployment to "new". This confirms that both toggle settings work in either direction. It assures executives that you can do rollback in production if needed, or go live between binary deployments/releases.

4. Remove the abstraction

After being live for a couple of weeks without incident, you will begin the process of removing the old implementation from the codebase. The same moment is a good one for renames in particular, and refactorings in general.

Maybe immediately after that you will remove the abstraction itself too and commit/push/go-live. Or wait another week before starting that.

After old code and abstraction removal, the App class:


    
The now anemic ReleaseToggles class:


    
The new Color enum (extracted):


    
The Release4 class:


    

^ the Release3 class was deleted of course

Release 4 configuration renamed to default:


    
And the tests:


    

^ The 'old' unit test deleted

As mentioned before, deletion of the abstraction when it has outlived its usefulness is the characterizing difference to any other use of abstractions in a codebase.
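For illustration only, the end state might look something like the following: no interface, no toggle lookup, just the surviving implementation and an enum. The enum name `Color` comes from the text above, but its values and the class name `HairColorPicker` are assumptions, not the example project's code.

```java
import java.util.Random;

// Assumed values: the article only says there are four hair colors.
enum Color { BLACK, BROWN, BLONDE, RED }

// After cleanup, callers use the one remaining implementation directly.
class HairColorPicker {
    private final Random random;

    HairColorPicker(Random random) {
        this.random = random;
    }

    Color next() {
        Color[] all = Color.values();
        return all[random.nextInt(all.length)];
    }
}
```

The point of the sketch is what is absent: the temporary interface and the old implementation are gone, just as a merged branch would be.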

The implementation now uses Jackson to serialize the color enum directly to JSON:
<dependencies>
  <!-- Server -->
  <dependency>
    <groupId>org.jooby</groupId>
    <artifactId>jooby-netty</artifactId>
  </dependency>

  <dependency>
    <groupId>org.jooby</groupId>
    <artifactId>jooby-jackson</artifactId>
  </dependency>
  <!-- fragment: any further dependencies are not shown -->
</dependencies>

See the whole commit's diff on GitHub.

5. Caveats

5.1 The scale of non-functional requirement

The case we describe here is only a representative one. For an example this small, an accomplished developer (with a decent refactoring IDE) would be able to complete the entire migration in a single morning, meaning you would not need branch by abstraction at all. In reality, many large component models with a rework/refactoring agenda cannot be done so quickly. Here is a depiction:

(source for graphic)

The large component model, with each component needing to be taken from old to new, is the reason branch by abstraction was devised, not the really small 'non-functional change' cases.

5.2 One component at a time, but in order

The best way to migrate such a component model using the branch by abstraction procedure is to focus on the most depended-on and least depending component first. And when that one is complete, do the next one by the same definition.
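One possible reading of "most depended-on and least depending" can be sketched as a ranking over a dependency map. This is an assumption about how to score components, not a rule from the procedure: it sorts by number of dependents (descending), then by number of direct dependencies (ascending), then by name. The class `MigrationOrder` is hypothetical, and every component is expected to appear as a key in the map.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class MigrationOrder {
    // dependsOn maps each component to the components it uses directly.
    static List<String> rank(Map<String, Set<String>> dependsOn) {
        // Count how many components depend on each one.
        Map<String, Integer> dependents = new HashMap<>();
        for (Set<String> used : dependsOn.values()) {
            for (String target : used) {
                dependents.merge(target, 1, Integer::sum);
            }
        }
        Comparator<String> mostDependedOn =
                Comparator.comparingInt((String c) -> dependents.getOrDefault(c, 0)).reversed();
        Comparator<String> leastDepending =
                Comparator.comparingInt(c -> dependsOn.get(c).size());
        return dependsOn.keySet().stream()
                .sorted(mostDependedOn
                        .thenComparing(leastDepending)
                        .thenComparing(Comparator.naturalOrder()))
                .collect(Collectors.toList());
    }
}
```

The first element of the returned list is the candidate to migrate first. After migrating it, the ranking can be recomputed over the components that remain.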

5.3 Static mutable state

Another complication occurs when there is a pre-existing use of singletons in the codebase. Singleton as outlined in the Design Patterns book, not the Spring/Guice idiom of the same name. If so, you should methodically migrate to 'Service Locator' first. When that is complete you can start a similar methodical migration to dependency injection (DI), and take out the service locator. Not all languages and frameworks suit DI solutions though.
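The three stages can be sketched side by side. All names here (`Rates`, `RatesSingleton`, `ServiceLocator`, `LocatorPricer`, `InjectedPricer`) are hypothetical, and the rate returned by the singleton is a placeholder rather than real data.

```java
import java.util.HashMap;
import java.util.Map;

interface Rates {
    double rate(String from, String to);
}

// Stage 0: a Design Patterns style singleton; callers reach for static state.
class RatesSingleton implements Rates {
    private static final RatesSingleton INSTANCE = new RatesSingleton();

    private RatesSingleton() {}

    static RatesSingleton getInstance() {
        return INSTANCE;
    }

    @Override
    public double rate(String from, String to) {
        return 1.0; // placeholder value, not a real exchange rate
    }
}

// Stage 1: a service locator; lookups are explicit and replaceable in tests.
class ServiceLocator {
    private final Map<Class<?>, Object> services = new HashMap<>();

    <T> void register(Class<T> type, T instance) {
        services.put(type, instance);
    }

    <T> T lookup(Class<T> type) {
        Object found = services.get(type);
        if (found == null) {
            throw new IllegalStateException("No service registered for " + type.getName());
        }
        return type.cast(found);
    }
}

class LocatorPricer {
    private final ServiceLocator locator;

    LocatorPricer(ServiceLocator locator) {
        this.locator = locator;
    }

    double price(double amount, String from, String to) {
        return amount * locator.lookup(Rates.class).rate(from, to);
    }
}

// Stage 2: dependency injection; the collaborator arrives via the constructor.
class InjectedPricer {
    private final Rates rates;

    InjectedPricer(Rates rates) {
        this.rates = rates;
    }

    double price(double amount, String from, String to) {
        return amount * rates.rate(from, to);
    }
}
```

Each stage removes a little more hidden static state, which is what makes it possible to hand a component either the old or the new implementation.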

There is a "Google Singleton Detector" technology that languishes now. I was the designer and made the first commit in 2007. It made pretty diagrams illustrating where the entanglement was in a Java classbase. That helped you find the most depended-on and least depending components (as mentioned above). I might get around to blowing the dust off that at GitHub.

5.4 Intermediate components

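One more thing to consider is that the thing you introduce an abstraction for would usually not be a single operation, as in the code above. Instead, it is more 'root'-like in feel: a component that offers that operation. Imagine the exchange rate between two currencies, obtained directly from the toggles component: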

rate = releaseToggles.getExchangeRate("USD", "GBP");

Most likely you would engineer something in the primordial abstraction like this:

rate = releaseToggles.getExchange().getRate("USD", "GBP");

If your codebase remained in the service locator style (having extinguished singletons), that code fragment could appear many times. But if you have arrived at a DI nirvana, then that ReleaseToggles component still exists but does not permeate all the places where exchange rates are required. Instead, the intermediate abstracted component CurrencyExchange is injected where it is needed:

public class Pricer {
  private final CurrencyExchange exchg;

  public Pricer(CurrencyExchange exchg) {
    this.exchg = exchg;
  }

  // Return type is assumed to be void here; the original omits it.
  public void priceThingInCart(/* params */) {
    // use the exchg instance (whichever implementation, old or new).
  }
}

This is because you would not want to break the Law of Demeter during the migration any more than you had to.
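A hedged sketch of how that wiring might look when only one place touches the toggles. `ReleaseToggles`, `CurrencyExchange`, `Pricer`, `getExchange` and `getRate` are the names used above; the constructor shapes, the `CompositionRoot` class and the parameters of `priceThingInCart` are assumptions made so that the example is self-contained.

```java
import java.math.BigDecimal;

interface CurrencyExchange {
    BigDecimal getRate(String from, String to);
}

// Assumed shape: the toggle value and both implementations are supplied at boot.
class ReleaseToggles {
    private final boolean useNewExchange;
    private final CurrencyExchange oldExchange;
    private final CurrencyExchange newExchange;

    ReleaseToggles(boolean useNewExchange,
                   CurrencyExchange oldExchange,
                   CurrencyExchange newExchange) {
        this.useNewExchange = useNewExchange;
        this.oldExchange = oldExchange;
        this.newExchange = newExchange;
    }

    CurrencyExchange getExchange() {
        return useNewExchange ? newExchange : oldExchange;
    }
}

class Pricer {
    private final CurrencyExchange exchg;

    Pricer(CurrencyExchange exchg) {
        this.exchg = exchg;
    }

    // Hypothetical parameters; the original leaves them unspecified.
    BigDecimal priceThingInCart(BigDecimal amount, String from, String to) {
        return amount.multiply(exchg.getRate(from, to));
    }
}

// The only place that knows about ReleaseToggles; Pricer never sees it.
class CompositionRoot {
    static Pricer buildPricer(ReleaseToggles toggles) {
        return new Pricer(toggles.getExchange());
    }
}
```

With this arrangement, `Pricer` talks only to its immediate collaborator, and deleting `ReleaseToggles` at the end of the migration touches `CompositionRoot` alone.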