Software development methodology code branching

2022.01.17 01:50

To actually gain those benefits, however, you need a cohesive structure that everyone understands and follows. Agile teams, especially, need to align their branching strategy with their guiding principles. Branching should support a sustainable delivery pace—not create short-term gains in speed that create complications or technical debt further down the road.

While some teams choose to create more bespoke branching strategies, two branching strategies are widely recognized as industry standards. Their names may be similar, but the strategies themselves take nearly opposite approaches to branching. GitFlow offers teams a deep, detailed structure with a few clearly defined branches: master, development, features, release, and hotfixes.

Each of these Git branches has a specific purpose and is used to isolate a part of the development pipeline for parallel processing and enhanced quality assurance. Feature and release branches stem from the development branch. These branches make it easy to track versioning on a more granular basis and to have various teams or developers working on multiple features at the same time. The release branch lets you avoid freezing the development branch while you prepare for a release, so teams not working on that release can continue their work.

Only release tasks, such as QA, bug fixes, and documentation, belong in the release branch. Hotfix branches are short-lived branches used exclusively to patch production code.
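The branch roles described above can be written down as plain data. This is only a sketch: the `GITFLOW` table and the `base_branch` helper are hypothetical names, the `kind/name` naming scheme is an assumption, and the `merges_into` targets follow the commonly published GitFlow model rather than anything stated in this text.

```python
# Sketch of GitFlow branch roles as data. "develop" stands for the
# development branch. The merge targets are an assumption taken from
# the commonly published GitFlow model, not from this article.
GITFLOW = {
    "feature": {"starts_from": "develop", "merges_into": ["develop"]},
    "release": {"starts_from": "develop", "merges_into": ["master", "develop"]},
    "hotfix": {"starts_from": "master", "merges_into": ["master", "develop"]},
}


def base_branch(branch_name: str) -> str:
    """Return the branch that a name such as 'feature/sales-tax' starts from."""
    kind = branch_name.split("/", 1)[0]
    return GITFLOW[kind]["starts_from"]


feature_base = base_branch("feature/sales-tax")  # starts from develop
hotfix_base = base_branch("hotfix/1.0.1")        # starts from master
```

Keeping the rules in one table makes it easy to see why a hotfix is the only branch type that starts from production code.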

This is a lot of branches, but many teams appreciate the clear structure and the workflow guidance that GitFlow offers. However, like any branching strategy, it has both strengths and weaknesses. TL;DR: Teams who embrace the somewhat rigid structure of GitFlow can leverage branches to improve pipeline visibility, increase overall flexibility, and organize code more cleanly. Unfortunately, the pros of GitFlow lead to a few major cons as well, which all fall under this one complaint: GitFlow is not very Agile-friendly.

As with most software patterns, few of these are gold standards that all teams should follow. Software development workflow is very dependent on context, in particular the social structure of the team and the other practices that the team follows.

My task in this article is to discuss these patterns. I describe each pattern, but intersperse the pattern explanations with narrative sections that better explain context and the interrelationships between them. In thinking about these patterns, I find it useful to develop two main categories. One group looks at integration: how multiple developers combine their work into a coherent whole.

The other looks at the path to production, using branching to help manage the route from an integrated code base to a product running in production. Some patterns underpin both of these, and I'll tackle these now as the base patterns. That leaves a couple of patterns that are neither fundamental, nor fit into the two main groups - so I'll leave those till the end.

If several people work on the same code base, it quickly becomes impossible for them to work on the same files. If I want to run a compile, and my colleague is in the middle of typing an expression, then the compile will fail. We would have to holler at each other: "I'm compiling, don't change anything". Even with two this would be difficult to sustain; with a larger team it would be incomprehensible.

The simple answer to this is for each developer to take a copy of the code base. Now we can easily work on our own features, but a new problem arises: how do we merge our two copies back together again when we're done? A source code control system makes this process much easier. The key is that it records every change made to each branch as a commit. Not just does this ensure nobody forgets the little change they made to utils.

This leads me to the definition of branch that I'll use for this article. I define a branch as a particular sequence of commits to the code base. The head, or tip, of a branch is the latest commit in that sequence. That's the noun, but there's also the verb, "to branch".

By this I mean creating a new branch, which we can also think of as splitting the original branch into two. Branches merge when commits from one branch are applied to another. The definitions I'm using for "branch" correspond to how I observe most developers talking about them. But source code control systems tend to use "branch" in a more particular way.
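To make the definition concrete, here is a minimal sketch in Python. The `Commit` class and `history` function are hypothetical illustrations, not how any particular version control system stores its data. Each commit points at its parent, a branch is identified by its head, and "to branch" means two heads share earlier commits and then diverge.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Commit:
    """One recorded change; 'parent' is the commit it was made on top of."""
    message: str
    parent: Optional["Commit"] = None


def history(head: Optional[Commit]) -> List[str]:
    """Follow parents back from the head; return messages oldest first."""
    messages = []
    while head is not None:
        messages.append(head.message)
        head = head.parent
    return list(reversed(messages))


# A single sequence of commits: one branch whose head is 'shared'.
first = Commit("initial import")
shared = Commit("add utils", first)

# Branching: two new heads built on the same commit, then diverging.
scarlett_head = Commit("rename variable", shared)
violet_head = Commit("add report", shared)

scarlett_branch = history(scarlett_head)
violet_branch = history(violet_head)
```

Both sequences begin with the same two commits and differ only in the last one, which is exactly the split that a later merge has to reconcile.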

I can illustrate this with a common situation in a modern development team that's holding their source code in a shared git repository. One developer, Scarlett, needs to make a few changes so she clones that git repository and checks out the master branch. She makes a couple of changes, committing back into her master. Meanwhile, another developer, let's call her Violet, clones the repository onto her desktop and checks out the master branch. Are Scarlett and Violet working on the same branch or a different one?

They are both working on "master", but their commits are independent of each other and will need to be merged when they push their changes back to the shared repository. According to the definition of branch I gave earlier, Scarlett and Violet are working on separate branches, both separate from each other, and separate from the master branch on the shared repository.

When Scarlett puts aside her work with a tag, it's still a branch according to my definition (and she may well think of it as a branch), but in git's parlance it's a tagged line of code. With distributed version control systems like git, this means we also get additional branches whenever we further clone a repository.

If Scarlett clones her local repository to put on her laptop for her train home, she's created a third master branch. The same effect occurs with forking in GitHub - each forked repository has its own extra set of branches. This terminological confusion gets worse when we run into different version control systems as they all have their own definitions of what constitutes a branch.

A branch in Mercurial is quite different to a branch in git, which is closer to Mercurial's bookmark. Mercurial can also branch with unnamed heads and Mercurial folks often branch by cloning repositories. All of this terminological confusion leads some to avoid the term.

A more generic term that's useful here is codeline. I define a codeline as a particular sequence of versions of the code base. It can end in a tag, be a branch, or be lost in git's reflog.

You'll notice an intense similarity between my definitions of branch and codeline. Codeline is in many ways the more useful term, and I do use it, but it's not as widely used in practice. So for this article, unless I'm in the particular context of git or another tool's terminology, I'll use branch and codeline interchangeably. A consequence of this definition is that, whatever version control system you're using, every developer has at least one personal codeline on the working copy on their own machine as soon as they make local changes.

If I clone a project's git repo, checkout master, and update some files - that's a new codeline even before I commit anything. Similarly if I make my own working copy of the trunk of a subversion repository, that working copy is its own codeline, even if there's no subversion branch involved. An old joke says that if you fall off a tall building, the falling isn't going to hurt you, but the landing will. So with source code: branching is easy, merging is harder.

Source control systems that record every change on the commit do make the process of merging easier, but they don't make it trivial. If Scarlett and Violet both change the name of a variable, but to different names, then there's a conflict that the source management system cannot resolve without human intervention.

Awkward as this is, this kind of textual conflict is at least something the source code control system can spot, alerting the humans to take a look.
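The way a tool spots a textual conflict can be sketched as a three-way comparison against the common ancestor. This is a deliberately simplified illustration: `merge_value` is a hypothetical helper that merges a single value, whereas real tools compare regions of lines within files.

```python
def merge_value(base, ours, theirs):
    """Three-way merge of a single value.

    Takes whichever side changed relative to the common ancestor, and
    raises ValueError when both sides changed it to different things.
    """
    if ours == theirs:
        return ours       # identical on both sides, nothing to decide
    if ours == base:
        return theirs     # only the other side changed it
    if theirs == base:
        return ours       # only our side changed it
    raise ValueError(f"conflict: {ours!r} vs {theirs!r}")


# Only Violet renamed the variable, so her name wins automatically.
clean = merge_value(base="total", ours="total", theirs="grand_total")

# Scarlett and Violet both renamed it, to different names.
try:
    merge_value(base="total", ours="sum_due", theirs="grand_total")
    needs_human = False
except ValueError:
    needs_human = True
```

The last case is the one the tool cannot decide: it can only flag the two candidates and hand the choice to a person.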

But often conflicts appear where the text merges without a problem, but the system still doesn't work. Imagine Scarlett changes the name of a function, and Violet adds some code to her branch that calls this function under its old name. This is what I call a Semantic Conflict. When these kinds of conflicts happen the system may fail to build, or it may build but fail at run-time.
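A semantic conflict of this kind can be reproduced in a few lines. The function names below (`calculate_tax`, `compute_tax`, `invoice_total`) are made up for the illustration, and concatenating two strings stands in for a clean textual merge of edits to different parts of a file.

```python
# Scarlett's side: the function has been renamed to calculate_tax.
scarlett_side = (
    "def calculate_tax(amount):\n"
    "    return amount * 0.2\n"
)

# Violet's side: new code that still calls the old name, compute_tax.
violet_side = (
    "def invoice_total(amount):\n"
    "    return amount + compute_tax(amount)\n"
)

# The edits touch different lines, so the text combines without complaint.
merged_source = scarlett_side + violet_side

namespace = {}
exec(merged_source, namespace)  # the merged module still loads

try:
    namespace["invoice_total"](100)
    outcome = "ok"
except NameError:
    # compute_tax no longer exists after the rename
    outcome = "fails at run-time"
```

In a compiled language the same mistake would show up as a build failure instead. Either way nothing in the merge itself reports a problem, which is why a test that exercises `invoice_total` is the practical way to catch it.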

The problem is familiar to anyone who has worked with concurrent or distributed computing. We have some shared state (the code base) with developers making updates in parallel. We need to somehow combine these by serializing the changes into some consensus update. Our task is made more complicated by the fact that getting a system to execute and run correctly implies very complex validity criteria for that shared state.

There's no way of creating a deterministic algorithm to find consensus. Humans need to find the consensus, and that consensus may involve mixing choice parts of different updates.

Often consensus can only be reached with original updates to resolve the conflicts. I start with: "what if there was no branching". Everybody would be editing the live code, half-baked changes would bork the system, people would be stepping all over each other. And so we give individuals the illusion of frozen time, that they are the only ones changing the system and those changes can wait until they are fully baked before risking the system.

But this is an illusion and eventually the price for it comes due. Who pays? How much? That's what these patterns are discussing: alternatives for paying the piper. Hence the rest of this article, where I lay out various patterns that support the pleasant isolation and the rush of wind through your hair as you fall, but minimizing the consequences of the inevitable contact with the hard ground. The mainline is a special codeline that we consider to be the current state of the team's code.

Whenever I wish to start a new piece of work, I'll pull code from mainline into my local repository to begin working on. Whenever I want to share my work with the rest of the team, I'll update that mainline with my work, ideally using the Mainline Integration pattern that I'll discuss shortly. Different teams use different names for this special branch, often encouraged by the conventions of the version control systems used.

I must stress here that mainline is a single, shared codeline. Usually such teams have a central repository - a shared repository that acts as the single point of record for the project and is the origin for most clones. Starting a new piece of work from scratch means cloning that central repository.

If I already have a clone, I begin by pulling master from the central repository, so it's up to date with the mainline. In this case mainline is the master branch in the central repository. While I'm working on my feature, I have my own personal development branch which may be my local master, or I may create a separate local branch.

If I'm working on this for a while, I can keep up to date with changes in the mainline by pulling mainline's changes at intervals and merging them into my personal development branch. Similarly, if I want to create a new version of the product for release, I can start with the current mainline.

If I need to fix bugs to make the product stable enough for release, I can use a Release Branch. I remember going to talk to a client's build engineer in the early s.

His job was to assemble a build of the product the team was working on. He'd send an email to every member of the team, and they would reply by sending over various files from their code base that were ready for integration. He'd then copy those files into his integration tree and try to compile the code base. It would usually take him a couple of weeks to create a build that would compile, and be ready for some form of testing.

In contrast, with a mainline, anyone can quickly start an up-to-date build of the product from the tip of mainline. Furthermore, a mainline doesn't just make it easier to see what the state of the code base is, it's the foundation for many other patterns that I'll be exploring shortly.

On each commit, perform automated checks, usually building and running tests, to ensure there are no defects on the branch. Since Mainline has this shared, approved status, it's important that it be kept in a stable state. Again in the early s, I remember talking to a team from another organization that was famous for doing daily builds of each of their products.

This was considered quite an advanced practice at the time, and this organization was lauded for doing it. What wasn't mentioned in such write ups was that these daily builds didn't always succeed. Indeed it wasn't unusual to find teams whose daily builds hadn't compiled for several months.

To combat this, we can strive to keep a branch healthy - meaning it builds successfully and the software runs with few, if any, bugs. To ensure this, I've found it critical that we write Self Testing Code. This development practice means that as we write the production code, we also write a comprehensive suite of automated tests so that we can be confident that if these tests pass, then the code contains no bugs.

If we do this, then we can keep a branch healthy by running a build with every commit; this build includes running the test suite. Should the system fail to compile, or the tests fail, then our number one priority is to fix them before we do anything else on that branch.

Often this means we "freeze" the branch - no commits are allowed to it other than fixes to make it healthy again.

There is a tension around the degree of testing to provide sufficient confidence of health. Many more thorough tests require a lot of time to run, delaying feedback on whether the commit is healthy.

Teams handle this by separating tests into multiple stages on a Deployment Pipeline. The first stage of these tests should run quickly, usually no more than ten minutes, but still be reasonably comprehensive. I refer to such a suite as the commit suite although it's often referred to as "the unit tests" since the commit suite usually is mostly Unit Tests. Ideally the full range of tests should be run on every commit. However if the tests are slow, for example performance tests that need to soak a server for a couple of hours, that isn't practical.
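The staging idea can be sketched as an ordered list of checks where a failure stops everything after it. `run_pipeline`, the stage names, and the stand-in lambdas are all hypothetical, and no real build server API is implied.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a check that reports whether it passed.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure; return the log."""
    log = []
    for name, check in stages:
        passed = check()
        log.append(f"{name}: {'pass' if passed else 'FAIL'}")
        if not passed:
            break  # slower stages are skipped once an earlier one fails
    return log


# Stand-in checks: the quick commit suite passes, the slow soak test fails.
stages = [
    ("commit suite", lambda: True),
    ("performance soak", lambda: False),
    ("release checks", lambda: True),
]
result = run_pipeline(stages)
```

Putting the fast commit suite first means most broken commits are reported within minutes, and the hours-long stages only run on commits that have already cleared it.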

These days teams can usually build a commit suite that can run on every commit, and run later stages of the deployment pipeline as often as they can. That the code runs without bugs is not enough to say that the code is good. In order to maintain a steady pace of delivery, we need to keep the internal quality of the code high.

A popular way of doing that is to use Pre-Integration Review , although as we shall see, there are other alternatives. Each team should have clear standards for the health of each branch in their development workflow. There is an immense value in keeping the mainline healthy. If the mainline is healthy then a developer can start a new piece of work by just pulling the current mainline and not be tangled up in defects that get in the way of their work.

Too often we hear people spending days trying to fix, or work around, bugs in the code they pull before they can start with a new piece of work. A healthy mainline also smooths the path to production.

A new production candidate can be built at any time from the head of the mainline. The best teams find they need to do little work to stabilize such a code-base, often able to release directly from mainline to production. Critical to having a healthy mainline is Self Testing Code with a commit suite that runs in a few minutes. It can be a significant investment to build this capability, but once we can ensure within a few minutes that my commit hasn't broken anything, that completely changes our whole development process.

We can make changes much more quickly, confidently refactor our code to keep it easy to work with, and drastically reduce the cycle time from a desired capability to code running in production. For personal development branches, it's wise to keep them healthy since that way it enables Diff Debugging.

But that desire runs counter to making frequent commits to checkpoint your current state. I might make a checkpoint even with a failing compile if I'm about to try a different path. The way I resolve this tension is to squash out any unhealthy commits once I'm done with my immediate work.

That way only healthy commits remain on my branch beyond a few hours. If I keep my personal branch healthy, this also makes it much easier to commit to the mainline - I know that any errors that crop up with Mainline Integration are purely due to integration issues, not errors within my codebase alone. This will make it much quicker and easier to find and fix them. Branching is about managing the interplay of isolation and integration.

Having everyone work on a single shared codebase all the time doesn't work because I can't compile the program if you're in the middle of typing a variable name. So at least to some degree, we need a notion of a private workspace that I can work on for a while. Modern source code control tools make it easy to branch and monitor changes to those branches. At some point however we need to integrate. Thinking about branching strategies is really all about deciding how and when we integrate.

Developers integrate their work by pulling from mainline, merging, and - if healthy - pushing back into mainline. A mainline gives a clear definition of what the current state of the team's software looks like. One of the biggest benefits of using a mainline is that it simplifies integration. Without mainline, it's the complicated task of coordinating with everyone in the team that I described above.

With a mainline however, each developer can integrate on their own. I'll walk through an example of how this works. A developer, who I'll call Scarlett, starts some work by cloning the mainline into her own repository. With git, if she doesn't already have a clone of the central repository, she would clone it and checkout the master branch. If she already has the clone, she would pull from mainline into her local master.

She can then work locally, making commits into her local master. While she's working, her colleague Violet pushes some changes onto mainline. As she's working in her own codeline, Scarlett can be oblivious to those changes while she works on her own task. Eventually she reaches a point where she wants to integrate. The first part of this is to fetch the current state of mainline into her local master branch; this will pull in Violet's changes.

Now she needs to combine her changes with those of Violet. Some teams like to do this by merging, others by rebasing. In general people use the word "merge" whenever they talk about bringing branches together, whether they actually use a git merge or rebase operation. I'll follow that usage, so unless I'm actually discussing the differences between merging and rebasing consider "merge" to be the logical task that can be implemented with either.

There's a whole other discussion on whether to use vanilla merges, use or avoid fast-forward merges, or use rebasing. That's outside the scope of this article, although if people send me enough Tripel Karmeliet, I might write an article on that issue. After all, quid-pro-quos are all the rage these days. If Scarlett is fortunate, merging in Violet's code will be a clean merge, if not she'll have some conflicts to deal with.

These may be textual conflicts, most of which the source control system can handle automatically. But semantic conflicts are much harder to deal with, and this is where Self Testing Code is very handy.

Since conflicts can generate a considerable amount of work, and always introduce the risk of a lot of work, I mark them with an alarming lump of yellow. At this point, Scarlett needs to verify that the merged code satisfies the health standards of the mainline assuming mainline is a Healthy Branch.

This usually means building the code and running whatever tests form the commit suite for mainline. She needs to do this even if it's a clean merge, because even a clean merge can hide semantic conflicts. Any failures in the commit suite should be purely due to the merge, since both merge parents should be green. Knowing this should help her track down the problem as she can look at the diffs for clues. With this build and test, she has successfully pulled mainline into her codeline, but - and this is both important and often overlooked - she hasn't yet finished integrating with mainline.

To finish integrating she must push her changes into the mainline. Unless she does this, everyone else on the team will be isolated from her changes - essentially not integrating. Integration is both a pull and a push - only once Scarlett has pushed is her work integrated with the rest of the project.
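The pull, merge, verify, push sequence can be modelled with a toy sketch in which a codeline is just a set of commit identifiers. The `pull` and `integrate` functions and the commit names are hypothetical, and a set union stands in for the logical "merge", whether that is really a merge or a rebase.

```python
def pull(local: set, mainline: set) -> set:
    """Bring mainline's commits into the local codeline (the logical 'merge')."""
    return local | mainline


def integrate(local: set, mainline: set, is_healthy) -> tuple:
    """Attempt mainline integration; return (new_local, new_mainline)."""
    merged = pull(local, mainline)
    if not is_healthy(merged):
        # Pulled but not pushed: mainline never sees the local work.
        return merged, mainline
    # Push: mainline now contains everything in the merged codeline.
    return merged, mainline | merged


scarlett = {"m1", "scarlett-1"}
mainline = {"m1", "violet-1"}

# A healthy merge is pushed, so mainline gains Scarlett's commit.
pushed_local, pushed_mainline = integrate(scarlett, mainline, lambda commits: True)

# A failing commit suite leaves mainline exactly as it was.
held_local, held_mainline = integrate(scarlett, mainline, lambda commits: False)
```

In both cases Scarlett's own codeline ends up with Violet's commit, but only in the first does Violet get to see Scarlett's. That asymmetry is the difference between merely pulling and actually integrating.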

Many teams these days require a code review step before a commit is added to mainline - a pattern I call Pre-Integration Review and will discuss later. Occasionally someone else will integrate with mainline before Scarlett can do her push. In which case she has to pull and merge again. Usually this is only an occasional issue and can be sorted out without any further coordination. I have seen teams with long builds use an integration baton, so that only the developer holding the baton could integrate.

But I haven't heard so much of that in recent years as build times improve. As the name suggests, I can only use mainline integration if we're also using mainline on our product.

One alternative to using mainline integration is to just pull from mainline, merging those changes into the personal development branch. This can be useful - pulling can at least alert Scarlett to changes other people have integrated, and detect conflicts between her work and mainline.

But until Scarlett pushes, Violet won't be able to detect any conflicts between what she's working on and Scarlett's changes. It's common to hear someone say they are integrating the mainline into their branch when they are merely pulling.

I've learned to be wary of that, and probe further to check to see if they mean just a pull or a proper mainline integration. The consequences of the two are very different, so it's important not to confuse the terms.

Another alternative is when Scarlett is in the middle of doing some work that isn't ready for full integration with the rest of the team, but it overlaps with Violet and she wants to share it with her. In that case they can open a Collaboration Branch.

Put all work for a feature on its own branch, integrate into mainline when the feature is complete. With feature branching, developers open a branch when they begin work on a feature, continue working on that feature until they are done, and then integrate with mainline. For example, let's follow Scarlett.

She would pick up the feature to add collection of local sales taxes to their website. To begin with the current stable version of the product, she'll pull mainline into her local repository and then create a new branch starting at the tip of the current mainline.

She works on the feature for as long as it takes, making a series of commits to that local branch. While she's working, other commits are landing on mainline.

So from time to time she may pull from mainline so she can tell if any changes there are likely to impact her feature. Note this isn't integration as I described above, since she didn't push back to mainline. At this point only she is seeing her work, others don't. Some teams like to ensure all code, whether integrated or not, is kept in the central repository.

In this case Scarlett would push her feature branch into the central repository. This would also allow other team members to see what she's working on, even if it's not integrated into other people's work yet.

When she's done working on the feature, she'll then perform Mainline Integration to incorporate the feature into the product.

If Scarlett works on more than one feature at the same time, she'll open a separate branch for each one. Feature Branching is a popular pattern in the industry today. To talk about when to use it, I need to introduce its principal alternative - Continuous Integration.

But first I need to talk about the role of integration frequency. The key is choosing a system and working as a team to fine-tune and improve that approach so you can continue to reduce waste, maximize efficiency, and master collaboration.

Agile is the most common term used to describe development methods. Most software development methodologies are agile with a strong emphasis on iteration, collaboration, and efficiency, as opposed to traditional project management.

The agile process is more like jazz, which comes together through collaboration, experimentation, and iteration between band members. In traditional project management, by contrast, project managers gather all of the necessary information at the beginning of a project and use it to make an informed plan of action up front. That approach is plan-driven and rigid, leaving little room for adjustments. Feature driven development is also considered an older methodology. As the name says, this process focuses on frequently implementing client-valued features.

The process is adaptive, improving based on new data and results that are collected regularly to help software developers identify and react to errors.

This kind of focused agile methodology can work for some teams that want a highly structured approach and clear deliverables while still leaving some freedom for iteration. Lean software development comes from the principles of lean manufacturing. At its core, lean development strives to improve efficiency by eliminating waste. The five lean principles provide a workflow that teams use to identify waste and refine processes.

Lean is also a guiding mindset that can help people work more efficiently, productively, and effectively. The philosophies and principles of lean can be applied to agile and other software development methodologies.

Lean development provides a clear application for scaling agile practices across large or growing organizations.