Seapine Labs

Branching Strategies

From Seapine Labs


Seapine Software emphasizes the importance of branching when discussing Surround SCM. The proper use of branches can have an incredibly beneficial impact on a software development project, even if used sparingly.

In Surround SCM, a branch is defined as a clone or copy of a source code repository at a specific point in time or at the current point in time. You might call it a virtual copy because no file copies are made inside the Surround SCM database, but it looks like a copy from the user's point of view.

Branches exist in Surround SCM in a top-down tree hierarchy. At the top is the mainline, also called the root, head, or tip of a codeline. Branches are created under the mainline as children, and the children can have children of their own, to any depth. Parent-child and sibling relationships appear as new branches are created.

Image:Branchtreelist.jpg

The relationships between branches represent visual cues for the different codelines being managed with Surround SCM. Branches that appear lower in the tree represent a greater distance from the mainline. Distance between branches can be good or bad, depending on the type of development environment you are working in.

Different branching strategies can be adopted to manage the different facets of the development process including concurrent development, maintaining multiple product releases, and capturing software configurations. When branching is properly implemented, it is easier to generate metrics and incorporate automated build techniques.

Branching Strategies

KISS and Branching Fundamentals

Keep it short and simple. Branching in Surround SCM is extremely fast and simple. While this is a 'good thing', it can quickly lead to overusing branches and complicating the SCM system. Complexity can be avoided through simple strategies, like branch by purpose.

Merging and Complexity

Complexity in SCM usually is associated with the maintenance of multiple codelines throughout the process. Questions of how to track specific changes and then how to apply those specific changes to multiple release codelines are often the most difficult to answer.

Surround SCM is built to facilitate merging across multiple codelines. In most cases, the server-side 3-way merge utility automatically merges changes between multiple releases. But the merge utility is not a programmer, and only a programmer knows whether a merge is good. Merging can introduce new bugs or require additional code changes, and it can create a lot of overhead to manage.
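The rule behind a 3-way merge can be sketched in a few lines of Python. This is a simplified illustration and not Surround SCM's merge implementation: it assumes all three versions have the same number of lines (edits only change lines in place), and the function name `merge3` and the sample file contents are hypothetical.

```python
from typing import List, Tuple


def merge3(base: List[str], ours: List[str], theirs: List[str]) -> Tuple[List[str], List[int]]:
    """Line-by-line three-way merge.

    Simplifying assumption: all three versions have the same number of
    lines, so edits only change lines in place (no inserts or deletes).
    Returns the merged lines and the indexes of conflicting lines.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")

    merged: List[str] = []
    conflicts: List[int] = []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree, or neither changed the line
            merged.append(o)
        elif o == b:      # only "theirs" changed this line
            merged.append(t)
        elif t == b:      # only "ours" changed this line
            merged.append(o)
        else:             # both changed it differently: a person must decide
            merged.append(f"<<< {o} ||| {t} >>>")
            conflicts.append(i)
    return merged, conflicts


base = ["int a = 1;", "int b = 2;", "int c = 3;"]      # common ancestor
ours = ["int a = 10;", "int b = 2;", "int c = 3;"]     # edit on one codeline
theirs = ["int a = 1;", "int b = 2;", "int c = 30;"]   # edit on another codeline

merged, conflicts = merge3(base, ours, theirs)
print(merged)
print(conflicts)
```

A merge with no conflicts only means the edits did not touch the same lines. It says nothing about whether the combined code still compiles or behaves correctly, which is why a programmer still has to review the result.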

This is the primary reason that it is important to keep the number of branches to a minimum and to try and keep the codelines as close to the mainline as possible.

For example, the following strategy may cause much more overhead...

Image:Oldcode.jpg

This image reflects creating a new branch off the old one for each new revision that is going to be generated. Merging a change between revisions will be cumbersome and tricky as the codelines diverge from each other over time. The relationship between the parent and child branches also becomes more and more estranged as the codelines evolve concurrently in their own directions. It is also necessary to jump from one branch to the next, going up all the 'steps', to apply a patch using merging. Otherwise the changes need to be replicated by duplicating check ins on each codeline.

However, a strategy like the following one keeps that overhead down:

Image:Newercode.jpg

This strategy reflects the tenets of branch by purpose, where the goal is to keep the multiple codelines as close as possible to the mainline. This reduces the overhead of merging common changes between multiple product versions. A change no longer has to be merged up and down a large number of steps, and check ins no longer have to be duplicated. The mainline will have the combined history of all prior branches, which allows any specific changes to be replicated using the Rebase action in any given codeline.
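The difference between the two layouts can be made concrete by counting how many promote or rebase steps a patch needs to travel from one codeline to another. The sketch below is only an illustration: the branch names and both tree layouts are hypothetical stand-ins for the two figures, and `merge_hops` is not a Surround SCM feature.

```python
from typing import Dict, List, Optional

BranchTree = Dict[str, Optional[str]]  # branch name -> parent name (None for the mainline)


def path_to_root(parent: BranchTree, branch: str) -> List[str]:
    """List the branch and each of its ancestors up to the mainline."""
    path = [branch]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def merge_hops(parent: BranchTree, source: str, target: str) -> int:
    """Count parent/child steps needed to carry a change from source to target."""
    up = path_to_root(parent, source)
    down_index = {name: i for i, name in enumerate(path_to_root(parent, target))}
    for steps_up, name in enumerate(up):
        if name in down_index:  # first branch both codelines descend from
            return steps_up + down_index[name]
    raise ValueError("branches do not share a mainline")


# Hypothetical "staircase": each release is branched off the previous one.
staircase: BranchTree = {
    "Mainline": None, "1.0": "Mainline", "2.0": "1.0", "3.0": "2.0", "4.0": "3.0",
}

# Hypothetical branch-by-purpose layout: every baseline sits directly under the mainline.
by_purpose: BranchTree = {
    "Mainline": None, "1.0.x": "Mainline", "2.0.x": "Mainline",
    "3.0.x": "Mainline", "4.0.x": "Mainline",
}

print(merge_hops(staircase, "1.0", "4.0"))
print(merge_hops(by_purpose, "1.0.x", "4.0.x"))
```

In the staircase layout the hop count grows with every new release, while in the flat layout any two baselines stay two steps apart: one promote to the mainline and one rebase down.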

Always remember that merging is a high maintenance aspect of SCM tools--in the same way that a check in can cause a bug so can a merge. The key is to adopt a solution that reduces complexity as much as possible, allowing you to spend more time developing software and less time resolving merge conflicts.

Consider Rebases and Promotes

When designing a branch model, rebases and promotes have to be taken into consideration since these are the methods used to update changes within branches.

With rebases, the only option is to update changes from the immediate parent branch. With promotes, however, you can select branches other than the immediate parent. This can give the impression that promoting changes to branches other than the immediate parent is fine.

When a promote is done from a grandchild to a grandparent, merge conflicts will occur, even if the changes made on the grandchild do not conflict with changes on the grandparent.

Skipping a generation when promoting is not recommended and should only be done in critical and rare situations when there is no other way out. If a branch structure dictates doing this often, then the branch structure should be reconsidered.

The main issue, to put it simply, is that a branch only updates the 'common ancestor' when a promote or rebase comes from one of its immediate relatives: its parent branch or one of its child branches. When a promote is performed from a grandchild to a grandparent, the grandparent has no record of the last time a promote was done from the grandchild.

For example:

Consider the following branch structure -

branch A - grandparent

----branch B - parent

--------branch C - grandchild

all have file '123', version 1

'Common ancestor' is the last version of the file that both branches have in common. This is the common merge point. In the example above, the 'common ancestor' is the version of file 123 when it was added originally to branch C. When a promote is performed from a grandchild, this 'common ancestor' is never updated. There is no code in place to do this.

The first time file 123 is promoted from branch C to branch A, the automerge will probably work. It will look at the changes made on file 123 on branch C, changes on the file on branch A and the 'common ancestor'. It will compare them and merge the files (as long as the changes on both branches are not conflicting).

The next time the file is promoted from branch C to branch A, it does the same thing. But since the 'common ancestor' is not updated, the merge picks up both the changes from branch C that were promoted last time and the new changes. Because the earlier changes are already in the latest version of the file on branch A, the automerge reports a conflict: it thinks there are changes on the same lines of the file. Guiffy is smarter at realizing that the changes are the same and that there is no conflict. However, after several of these promotes, even Guiffy will not be able to sort this out.
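The bookkeeping problem can be illustrated with a toy model in which each branch holds a set of change IDs and a promote sends every change that is newer than the recorded common ancestor. This is a sketch of the behavior described above, not Surround SCM's internal data model; the class `BranchModel`, its methods, and the change IDs are all hypothetical.

```python
from typing import Dict, FrozenSet, Optional, Set, Tuple


class BranchModel:
    """Toy model: each branch holds a set of change IDs."""

    def __init__(self, parent: Dict[str, Optional[str]]) -> None:
        self.parent = parent
        self.changes: Dict[str, Set[str]] = {name: set() for name in parent}
        # Changes that a (source, target) pair is known to have in common.
        self.ancestor: Dict[Tuple[str, str], FrozenSet[str]] = {}

    def check_in(self, branch: str, change: str) -> None:
        self.changes[branch].add(change)

    def promote(self, source: str, target: str) -> Set[str]:
        """Promote and return the changes that were sent again even though
        the target already had them (the ones a merge may flag as conflicts)."""
        known = self.ancestor.get((source, target), frozenset())
        incoming = self.changes[source] - known
        resent = incoming & self.changes[target]
        self.changes[target] |= incoming
        # Assumption taken from the text: only a promote to the immediate
        # parent updates the common ancestor.
        if self.parent[source] == target:
            self.ancestor[(source, target)] = frozenset(self.changes[source])
        return resent


tree = {"A": None, "B": "A", "C": "B"}

# Skipping a generation: C is promoted straight to A, twice.
skipping = BranchModel(tree)
skipping.check_in("C", "change-1")
first = skipping.promote("C", "A")
skipping.check_in("C", "change-2")
second = skipping.promote("C", "A")

# Promoting to the immediate parent: C is promoted to B, twice.
direct = BranchModel(tree)
direct.check_in("C", "change-1")
direct.promote("C", "B")
direct.check_in("C", "change-2")
direct_second = direct.promote("C", "B")

print(first, second, direct_second)
```

In the skipping case the second promote sends `change-1` again because the ancestor was never moved forward. In the direct case only the new change is sent.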


Reverse Waterfall Branching

Image:Reversewaterfall.jpg

Branch By Version

This branch model is also known as 'Branch by Release'. This is probably the easiest to understand and simplest of all the branching models. It basically states that as soon as development has a release for QA, a new branch is created for the developers to start the coding for the next release. The old branch is then used by QA, and eventually is left behind.

While this approach is straightforward, it does have two drawbacks:

  • Requires serial changes to the code such as sequential check ins and check outs, rather than parallel development.
  • Adds complexity and overhead to the support of released versions.

When the software is released, and bugs are reported in a supported version, bug fixes must be made. In a branch by release model, this means you have to go back to that baseline and make the fix. Then, the fix has to be propagated to each subsequent release branch.

Because this approach creates so many baseline branches, propagating the bug fix to all of the branches can become complicated and cause merge conflicts.

Image:Branchbyversion.jpg

Branch By Purpose

This is the approach we recommend for conventional applications managed with Surround SCM.

The main point of branching by purpose is to only branch when it is absolutely necessary. The goal is to not have to change the branch that you are working on. This is especially helpful when working with Visual Studio. Sometimes it can be tricky changing the source control bindings in Visual Studio and the less you have to do that, the better.

All feature development and bug fixes are made on the mainline branch. A new branch is only created when work on the next release must start and there is still a need to maintain old versions. The new feature development and work on the new release remains on the mainline, and the maintenance of the old release is done in a baseline branch. As maintenance releases come out, snapshot branches are created.

The bug fixes made in the baseline maintenance branch are promoted to the mainline branch to ensure that the new version includes all bug fixes from the previous version.

In the following screenshot, feature development is performed in the mainline branch (called "Branch By Purpose" for illustrative reasons). Maintenance of old releases is done in each baseline branch (1.0.x and 2.0.x). As each major and maintenance version is released, a snapshot branch is created (1.0.0, 1.0.1, 1.0.2, 2.0.0 and 2.0.1).

For example, if a bug is discovered in version 1.0.2 the fix is done in the 1.0.x branch. A maintenance version is released, and a corresponding snapshot branch is created (1.0.3). If the bug also affects the 2.0.x releases, the fix could be promoted to the mainline branch where it would be merged with the current codeline. It could then be rebased into the 2.0.x baseline branch. This bug fix could now easily be included in the next 2.0.x release.

Image:Branchbypurpose.jpg
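The bug fix flow in the example above can be traced with a small sketch. It assumes that only the selected fix is promoted and rebased, not everything on the branch; the branch contents, the change IDs, and the helper functions are hypothetical and only mirror the order of the steps.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical starting point: new feature work lives only on the mainline.
branches: Dict[str, Set[str]] = {
    "Mainline": {"feature-3.0"},
    "1.0.x": set(),
    "2.0.x": set(),
}
snapshots: Dict[str, FrozenSet[str]] = {}


def check_in(branch: str, change: str) -> None:
    branches[branch].add(change)


def snapshot(branch: str, name: str) -> None:
    """Freeze the current contents of a branch under a release name."""
    snapshots[name] = frozenset(branches[branch])


def promote(change: str, child: str, parent: str) -> None:
    """Move one change up from a child branch to its parent."""
    if change not in branches[child]:
        raise ValueError(f"{change!r} is not on {child}")
    branches[parent].add(change)


def rebase(change: str, parent: str, child: str) -> None:
    """Move one change down from a parent branch to a child."""
    if change not in branches[parent]:
        raise ValueError(f"{change!r} is not on {parent}")
    branches[child].add(change)


check_in("1.0.x", "fix-bug")                # fix the bug found in 1.0.2
snapshot("1.0.x", "1.0.3")                  # capture the maintenance release
promote("fix-bug", "1.0.x", "Mainline")     # merge the fix into current work
rebase("fix-bug", "Mainline", "2.0.x")      # make it available to 2.0.x

print(sorted(branches["2.0.x"]))
print(sorted(snapshots["1.0.3"]))
```

The fix reaches 2.0.x in two steps through the mainline, and the 2.0.x baseline does not pick up the unreleased feature work.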

Branch by Module

Some software shops ship releases that are composed of several separate modules. Each release may not necessarily contain the latest version of each module.

For example, a software shop (we'll call it "Modules R Us") develops three modules: module A, module B, and module C. Each software release they ship contains specific versions of each module, depending on the customer's needs. So even though each module may be on version 5, for example, they may need a release that contains version 2 of module A, version 4 of module B, and version 3 of module C.

So the question arises:

How do we branch in this situation?

The answer is to use an approach very similar to the Branch by Purpose method.

Like the Branch by Purpose model, all modules would be stored on the mainline branch. All new feature development would take place here.

Then, as feature freeze points for each module arrive, a baseline branch would be created (branch off the specific repository for the module). When that specific version of the module is finished, a snapshot branch is created.

The question may be asked "Why not create a snapshot directly off the mainline branch for each release? Why even create the baseline branches?"

If a software release is made containing version 2 of Module A, and the customer notes a defect, the fix must be made on version 2. What if development on the mainline for Module A is on version 4? Creating baseline branches for each version of each module allows for maintenance of each version separately.

Putting Together the Release

In order to put together each software release for each customer, another baseline branch is created. Unlike the baseline branches for the modules, this baseline branch is created off the root of the mainline branch (contains all three module repositories). For this example, we'll call this branch "Software Release Staging".

The next step is to set a working directory for this branch; this is where the software release will be put together.

After this is set, and we know which version of each module we need, we go to each specific snapshot branch and get the files to the corresponding working directory for the staging branch.

For example, customer "XYZ Company" needs a software release that contains version 2 of Module A, version 1 of Module B and version 3 of Module C.

We go to the snapshot branch containing the latest release of version 2 of Module A (2.0) and perform a get to the working directory associated with Module A on the staging branch. We then go to the snapshot branch containing the latest release of version 1 for Module B (1.1) and perform a get to the working directory associated with Module B on the staging branch. Finally, we go to the snapshot branch containing the latest release of version 3 for Module C (3.1) and perform a get to the working directory associated with Module C on the staging branch.

If you have Surround SCM set to allow check ins without check outs, then all you have to do is recursively check in the entire "Software Release Staging" branch.

If you do not, you must first check out the entire branch recursively, setting the "Overwrite" option to "Skip". Another option would be to check out the entire branch tree prior to doing each get.

Once all the files are checked in to the staging branch, you can create a snapshot branch for that release. Make sure to note in the comments which version of each module is contained in this release.
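Choosing which snapshot to get for each module, and recording that choice in the snapshot comment, is easy to get wrong by hand. The following sketch shows one way to compute it. The snapshot lists, the function names, and the comment format are hypothetical; only the XYZ Company selection (2.0, 1.1, 3.1) comes from the example above.

```python
from typing import Dict, List, Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '1.10' into (1, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_snapshot(snapshots: List[str], major: int) -> str:
    """Pick the newest snapshot branch for the requested major version."""
    candidates = [s for s in snapshots if version_key(s)[0] == major]
    if not candidates:
        raise LookupError(f"no snapshot for major version {major}")
    return max(candidates, key=version_key)


def release_manifest(available: Dict[str, List[str]], wanted: Dict[str, int]) -> Dict[str, str]:
    """Map each module to the snapshot branch to get into the staging branch."""
    return {module: latest_snapshot(available[module], major)
            for module, major in wanted.items()}


def snapshot_comment(customer: str, manifest: Dict[str, str]) -> str:
    """Build the comment noting which module versions the release contains."""
    parts = ", ".join(f"{module} {version}" for module, version in sorted(manifest.items()))
    return f"Release for {customer}: {parts}"


# Hypothetical snapshot branches that exist for each module.
available = {
    "Module A": ["1.0", "2.0", "3.0"],
    "Module B": ["1.0", "1.1", "2.0"],
    "Module C": ["3.0", "3.1"],
}
wanted = {"Module A": 2, "Module B": 1, "Module C": 3}

manifest = release_manifest(available, wanted)
comment = snapshot_comment("XYZ Company", manifest)
print(manifest)
print(comment)
```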

Branching for Web Projects

Because Web projects tend to be continuous, they have different branching requirements. Requirements are released as they are developed, rather than bundled into a packaged release as you would do with a C++ application.

Many Web developers do not use branching. Instead, all of the work is checked into a mainline or baseline branch. Snapshot branches are used to capture the Web site at different stages of development.

When branching is used, the branches often represent different approval stages for a specific change or changeset. One approach is to use a waterfall branching tree with the most recent changes in the mainline. Changes then trickle down through various 'stages' like QA, Staging, and Production. Snapshot branches can be used to capture the code at specific milestones.

Image:WebBranches.jpg

With this branching approach, users can rebase to move changes between each branch, where each branch represents a different 'stage' in the change lifecycle.

Another advantage is that you can rebase by label. A user can check in a code change and add a label like "Bug Fix 100". A code-admin can then perform a rebase by label "Bug Fix 100", which is a point-and-click process performed through the Surround GUI rebase dialog. That action is then repeated through each stage: as QA approves a change, it is rebased again to staging, where it can await approval for production.
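The stage-by-stage movement of a labeled change can be sketched as follows. The stage names follow the waterfall described above, but the data structure and the function `rebase_by_label` are hypothetical and only model the rule that a change moves one stage at a time.

```python
from typing import Dict, List, Set

# Waterfall order: each branch is the child of the one before it.
STAGES: List[str] = ["Mainline", "QA", "Staging", "Production"]


def rebase_by_label(branches: Dict[str, Set[str]], label: str, into: str) -> None:
    """Pull the changes carrying `label` from the parent stage into `into`."""
    position = STAGES.index(into)
    if position == 0:
        raise ValueError("the mainline has no parent to rebase from")
    parent = STAGES[position - 1]
    if label not in branches[parent]:
        raise ValueError(f"{label!r} has not reached {parent} yet")
    branches[into].add(label)


branches: Dict[str, Set[str]] = {stage: set() for stage in STAGES}
branches["Mainline"].update({"Bug Fix 100", "Bug Fix 101"})

# Each rebase happens after the previous stage has approved the change.
for stage in ["QA", "Staging", "Production"]:
    rebase_by_label(branches, "Bug Fix 100", stage)

print(sorted(branches["Production"]))
print(sorted(branches["QA"]))
```

Because a rebase only pulls from the immediate parent, "Bug Fix 101" cannot reach Staging until it has first been rebased into QA.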

The snapshots can then be created for milestones or after each rebase to production occurs. This makes it easy to roll back to an earlier Web site release if necessary.

Here is a great example of when the Surround SCM workflow feature can be a huge benefit to users. New changes can be marked for Review. After the reviewer approves or signs-off on the change, that file or group of changes can then be rebased to the next branch. The rebasing can even be automated using a simple trigger that runs after a change is set to the Approved state.

You can also use triggers or shadow folders to automatically update internal Web servers. As changes move through the different branch stages, you can use either feature to have those changes automatically update Web sites. As developers check in changes, they can jump out to the public dev server to see those changes integrated with other users' changes in real time. Or, as you rebase through stages, QA and project managers can see those updates and make approvals; they can even send changes to Production when the approved changes are rebased into the Production branch.

Feature Branches

A feature branch is used to do the bulk of the development work on a codeline. In the 'branch by purpose' method, the feature branch is often the mainline that provides the programmers with consistent work areas for the majority of the work performed.

A feature branch is used to do all the big feature development for major releases. Using branch by version, feature branches are created prior to starting any new major release. For example, versions 2.0, 2.1, 2.5 would all be feature branches while versions 2.0.1, 2.1.1 would not be.

How and when you use feature branches depends on the branching methodology and what is subjectively deemed a 'feature' release as opposed to a 'maintenance' release.

Task Branches

Task branches are areas where major feature work may be performed. A task branch is designed for a specific requirement or to make a major update. The task branch is usually temporary, and can be a private workspace or a public baseline.

As a workspace, a task branch allows a programmer to clone a public codeline, either mainline or baseline, and use that branch to check in changes for a lengthy task (such as adding a new feature or fixing a complex bug). The programmer can check in changes to the server, review code changes, and perform rebases to stay current with ongoing development. Changes are stored on the server instead of on the programmer's hard drive, ensuring they will be backed up in case of a power outage. After the task is complete and the code is reviewed, the changes can be promoted into the public codeline.

As a baseline, the task branch allows a group of programmers to clone a public codeline, either mainline or baseline, and use that branch to check in changes for a lengthy task, requirement, or feature.

If a task branch is used for a specific requirement, and the requirement is pulled from the release, you can freeze the branch and essentially 'put it away' for later.

A recent Seapine example illustrates a use case for a task branch:

The TestTrack and Surround GUI Clients both use Qt, a third-party cross-platform GUI library. When a major Qt update was released, from 3.3 to 4.0, the TestTrack and Surround code was branched into task branches. The new Qt code was checked into the branches and development work started on making code changes necessary to work with the new library updates.

If the update is overly complex, and the release date might slip as a result, the Qt task branch can be frozen and the feature development would continue in the mainline.

If the update goes as planned then the entire task branch can be promoted back to the mainline, with the Qt updates and any code updates, which would then be part of the next TestTrack and Surround releases.

Third-party Library Branches

Managing third-party code can be tricky if there are multiple projects that depend on a single library. Each project often requires a specific version, making it even more complex to control. Following are two ways to manage third-party libraries with Surround SCM.

  • Use separate branches for each library release: This method allows you to store third-party libraries in separate branches, but requires performing two separate gets when doing a build. The first get retrieves your project source code and the second retrieves the specific library version you need. This can be automated in a build script to make it easy.
  • Use file sharing into the common project areas: Create a separate folder, named Common, for the libraries and share the common source code into the project repositories (e.g., Project1/common). This allows you to branch a project and perform a single get to compile it because all the dependencies exist in the common sub-repository. You can also create root-level task branches when you need to make library updates that affect all projects (cloaking ones that do not share that project), making it easy to maintain the library even if it is shared across multiple projects.

To update third-party code, use the task branch approach described above.

Managing Builds

A common task with any change management tool is to capture the source code at a specific milestone. More often than not, these milestones are builds. Some tools have a feature that allows you to "tag" or "label" a specific version of a file. While Surround SCM does provide labels, it is recommended that snapshot branches be used for this instead. Snapshot branches capture the file content and the directory structure. Any directory structure change made on other branches is not propagated to the snapshot branch, thus guaranteeing a repeatable build.

When to Create a Snapshot?

The first item that needs to be identified is the check in policy. There are generally two contradictory, but commonly recommended best practices:

- Check in often.

- Check in only after unit test and review.

With the latter practice, you can feel a certain level of confidence that a build can be created at any point in time. If only "stable" check ins are made, then the chances of a compile error or build error are minimized.

If users check in often, before changes are complete, or you have a mix of both check in practices, then you need a way to determine when it is safe to do a build and where it is safe to build from, or at least a way to determine which revisions are stable.

One common approach with Surround SCM users is the workflow. With the use of states, files that are ready to be included in a build are designated by a specific state. The build user or script can then do a get of the source code files and use the "latest version to be in this state" flag.
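The selection that such a get performs can be sketched in Python. This is an illustration of the idea only: it assumes each file version carries exactly one workflow state, and the state names, file names, and functions are hypothetical rather than part of Surround SCM.

```python
from typing import Dict, List, Optional, Tuple

# One (version number, workflow state) entry per file version, oldest first.
History = List[Tuple[int, str]]


def latest_in_state(history: History, state: str) -> Optional[int]:
    """Return the highest version number that is in the given state, if any."""
    versions = [version for version, current in history if current == state]
    return max(versions) if versions else None


def build_inputs(files: Dict[str, History], state: str) -> Dict[str, int]:
    """Choose, for every file, the version a build should get."""
    selected: Dict[str, int] = {}
    for name, history in files.items():
        version = latest_in_state(history, state)
        if version is not None:   # files with no approved version are left out
            selected[name] = version
    return selected


files: Dict[str, History] = {
    "main.cpp": [(1, "Approved"), (2, "Approved"), (3, "In Progress")],
    "util.cpp": [(1, "Approved")],
    "new.cpp": [(1, "In Progress")],
}

selected = build_inputs(files, "Approved")
print(selected)
```

The build gets version 2 of `main.cpp` even though version 3 is the newest check in, which lets developers check in often without destabilizing the build.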

Another approach is that as changes are approved, they are merged into another branch, where the builds are created.

Failed Builds

If the snapshot is created before the build, the snapshot may capture a failed build. This may or may not be desirable.

One approach is to create the snapshot branch prior to the build. Depending on the need to capture failed builds, the snapshot could be deleted, or simply renamed to indicate a failed build.

If you start seeing several snapshot branches for failed builds, that could be an indicator that a process change is needed.

If one is only interested in successful builds, it may be a better approach to create the snapshot branch after the build.
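The choices above amount to a small decision table, sketched here in Python. The function name, its parameters, and the snapshot naming scheme are hypothetical; a real build script would create, rename, or delete the snapshot branch at the corresponding points.

```python
from typing import Callable, Optional


def snapshot_for_build(
    build_id: str,
    build: Callable[[], bool],
    snapshot_first: bool,
    keep_failed: bool = True,
) -> Optional[str]:
    """Return the name of the snapshot branch that remains after the build.

    snapshot_first=True models creating the snapshot before the build;
    a failed build then leaves a renamed snapshot (or none, if deleted).
    snapshot_first=False models creating the snapshot only after success.
    """
    name = f"Build {build_id}"
    succeeded = build()
    if succeeded:
        return name
    if snapshot_first and keep_failed:
        return f"{name} (FAILED)"   # renamed to mark the failed build
    return None                     # deleted, or never created


print(snapshot_for_build("101", lambda: True, snapshot_first=True))
print(snapshot_for_build("102", lambda: False, snapshot_first=True))
print(snapshot_for_build("103", lambda: False, snapshot_first=False))
```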

Important vs. non Important Builds

Depending on the process in place, the nature of the business, the company culture, and so on, there may be two types of builds:

  • Important Builds: These are major milestones, such as an initial release.
  • Non Important Builds: These are your nightly builds that may or may not make it to testing. This may be the case in a continuous integration environment.

If you create all of the snapshot branches together, the snapshot branch list may get cluttered and it also may become difficult to isolate the important builds versus the non important builds.

There are several approaches and these depend on the process in place and how fluid the process needs to be.

One-Tiered Development Branch

This approach is just an extension of the branch by purpose example above. Two baseline branches are added below the development branch, but these are more for organizational purposes than anything else. Each baseline branch would separate the important builds versus the non-important builds. Changes would be rebased from the development branch to the corresponding baseline, and then a snapshot branch is created.

One Tiered Development Approach

The figure above shows what this approach may look like. Development still takes place on the development branch WysiWrite 1.x, and as builds are needed they are rebased into its corresponding baseline branch, where the snapshots are created.

Two-Tiered Development Branch

With this approach, create a baseline branch under the main development branch. Daily development would actually take place on this new branch. Daily builds would be captured as child snapshot branches of this baseline branch. Any time an important build is needed, changes are promoted to the main development branch and then a snapshot branch is created. This would place the snapshots for important builds at the same level as the daily development branch.

The two development branches would contain different states of the source code. The daily development branch would contain the "latest and greatest". The main development branch would contain the "latest and greatest" as of the last important build.

Two-Tiered Development Approach

The figure above illustrates what this approach may look like. Daily development now takes place on the WysiCM 1.0.x Daily branch. Snapshots for daily builds are created here. Whenever an important build is to be created, changes are promoted to the WysiCM 1.0.x branch, where the snapshot is created.

More Information

There are other articles in this wiki that cover other information that may help you with your build process:

Using the Workflow - Includes a couple of ideas on how to implement a workflow to complement your build process.

E-mail File for Review - Uses a combination of the workflow, custom fields, triggers and scripts to give the ability to e-mail a file for review before it is checked in.

Automating a .NET Build With MSBuild and Surround SCM - An example of using triggers to automate a .NET build.

CruiseControl.NET Integration - How to integrate CruiseControl.NET with Surround SCM.

CruiseControl.NET Example - A Configuration example for continuous integration using CruiseControl.NET.

Apache Ant Integration - How to integrate Apache Ant with Surround SCM.

Nant Integration - How to integrate Nant with Surround SCM.

CruiseControl Integration - How to integrate CruiseControl and Surround SCM.

Branches and IDEs

Visual Studio 2005 Web Projects

Branching with Visual Studio 2005 projects can be tricky because the solution files reference HTTP addresses. According to Microsoft, version control only works if users have the exact same working directories on every machine. This is often not the case, since new Visual Studio Web projects are stored in each user's profile directory.

Through testing what we've found is the following:

  • If every user has a common working directory (e.g., C:\Web Projects) the version control works, as long as the Solution file and Project files are located in that directory tree. This may require editing the solution file so that it knows where the project and source code files are located. This is the best approach to take if you have multiple projects in version control and are planning on using branches.
  • If you are using the default user profiles directory, then set your working directory to the My Documents folder.
C:\Documents and Settings\UserA\My Documents

The Visual Studio SCC integration gets confused when looking for the Solution and Project files even if you set the working directories properly to each target folder, but setting the working directory to the My Documents folder makes Visual Studio happy. This is the best approach if you plan on having multiple projects in version control but will not use branching frequently.

Conclusions

Other Resources













