One of the best parts about working at Bocoup is the freedom we have to explore ideas and open source projects. The diverse range of experience and interest we all bring to the table means there are always interesting open conversations taking place in the office, both about the implementation of specific ideas and about the broader concepts of architecting software. Listening to and participating in these conversations has been a great way for me to evaluate and reflect on my personal experience and methods for doing things.
Eric O'Connor and I recently had a conversation about version control usage patterns and how they can help or hinder a project's ability to successfully release code, roll back problems, and achieve continuous integration.
Eric knows a lot about executing successful continuous integration, and has contributed to some awesome open source software projects like NGINX and Mozilla. Much of my experience, however, has been focused on deploying full stack web applications of varying sizes, with backend infrastructure requirements spanning several different server environments. This has made me a believer in the "always be deploying" philosophy: a culture where the particulars of the deployment process are part of everyday conversations right from the beginning of the project. Our git workflows and standard deployment processes differ in some interesting ways, even though we both value a healthy and well-thought-out deployment process.
This post will attempt to outline two relatively common git workflows, explore the benefits they can bring to various projects, and hopefully provide enough insight to aid the reader in selecting one!
All Pull Requests Merge to Master
"All to master" is a very common workflow for software projects and front-end web applications. Just like it sounds, this git workflow has all committers submitting pull requests from their checkouts to the upstream master.
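Gitflow, Feature and Release Branches
Gitflow is the more structured of the two workflows. It is built around long-lived "develop" and "master" branches plus short-lived, prefixed working branches.
- All features are developed on "feature/" branches (or otherwise prefixed; a common pattern when working with an issue tracker is to prefix with the issue number).
- All pull requests are submitted to a common "develop" branch.
- Generally, a group of approved features is merged into develop and, when they are all complete, a release branch is made off of develop, prefixed with "release/".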
- The release branch is tested, and any bug/regression fixes are merged directly into it.
- At deployment time, this release branch is merged back down into develop and master, and a tag is cut. This tag is named with semantic versioning and becomes the latest release.
- Any bugs/regressions found after the release is cut are dealt with using "hotfix/" prefixed branches, which are tested and merged in the same way as standard "release/" branches.
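To make the merge rules in the list above a bit more concrete, here is a minimal Python sketch. The function name `merge_targets` and the example branch names are made up for illustration; only the prefixes and the target branches come from the workflow described above.

```python
from typing import List


def merge_targets(branch: str) -> List[str]:
    """Return the branches a gitflow working branch is merged into.

    Hypothetical helper: it only encodes the prefix rules from the list above.
    """
    if branch.startswith("feature/"):
        # Features reach develop through a pull request.
        return ["develop"]
    if branch.startswith(("release/", "hotfix/")):
        # Releases and hotfixes land in both long-lived branches,
        # after which a tag is cut.
        return ["develop", "master"]
    raise ValueError("not a gitflow working branch: " + branch)


print(merge_targets("feature/142-login-form"))  # ['develop']
print(merge_targets("release/1.4.0"))           # ['develop', 'master']
```

A check like this could sit in a pull request bot to flag a feature branch aimed at the wrong target, though that is only one possible use.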
Achieving Continuous Deployment
For the purpose of our exploration, continuous deployment means a project's ability to both integrate and deploy its latest code automatically to any or all of the following: the production environment, canonical project download links, and package management repositories.
Submitting all pull requests to master fits continuous integration and deployment like a glove. Webhooks make it easy to detect changes to master and trigger update events everywhere the code happens to be deployed. Webhooks aren't the only way to cue this process, but they tend to be one of the simplest to execute and understand, so they are what we'll be talking about here.
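As a rough illustration of the webhook idea, the sketch below decides whether an incoming push event should trigger a deployment. It assumes a GitHub-style push payload with `ref` and `deleted` fields; the function name `should_deploy` and the `DEPLOY_REF` constant are hypothetical, and a real receiver would also need to verify the request signature and actually run the deployment.

```python
import json

# Assumption: master is the branch that gets deployed.
DEPLOY_REF = "refs/heads/master"


def should_deploy(raw_body: str) -> bool:
    """Return True when a push event updated the deployed branch."""
    event = json.loads(raw_body)
    # Deleting a branch also produces a push event; never deploy on that.
    if event.get("deleted", False):
        return False
    return event.get("ref") == DEPLOY_REF


merged = json.dumps({"ref": "refs/heads/master", "deleted": False})
in_progress = json.dumps({"ref": "refs/heads/feature/login", "deleted": False})

print(should_deploy(merged))       # True
print(should_deploy(in_progress))  # False
```

Because every approved pull request lands on master, this single check is all the routing an "all to master" project needs.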
While it is certainly possible to use webhooks and continuous deployment with the gitflow branching model, it might require a few extra steps. Specifically, webhooks that detect changes to master would have to cause a build server to cut a tag, which would then trigger additional webhooks responsible for deploying that tag everywhere the code is deployed. The trouble is that the projects that benefit most from the structured gitflow process generally have other roadblocks to continuous deployment.
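The two-step chain described above could be sketched roughly as follows. Everything here is an assumption made for illustration: the `v`-prefixed tag format, the `next_tag` and `plan` helpers, and the idea that someone tells the build server which part of the version to bump.

```python
from typing import Tuple


def next_tag(latest: str, bump: str) -> str:
    """Compute the next semantic version tag from the latest one."""
    core = latest[1:] if latest.startswith("v") else latest
    major, minor, patch = (int(part) for part in core.split("."))
    if bump == "major":
        return "v{}.0.0".format(major + 1)
    if bump == "minor":
        return "v{}.{}.0".format(major, minor + 1)
    if bump == "patch":
        return "v{}.{}.{}".format(major, minor, patch + 1)
    raise ValueError("unknown bump level: " + bump)


def plan(ref: str, latest_tag: str, bump: str = "patch") -> Tuple[str, str]:
    """Decide what the build server does for a pushed ref."""
    if ref == "refs/heads/master":
        # Step one: a change to master causes a tag to be cut.
        return ("cut_tag", next_tag(latest_tag, bump))
    if ref.startswith("refs/tags/v"):
        # Step two: the new tag's own webhook triggers the deployment.
        return ("deploy", ref[len("refs/tags/"):])
    return ("ignore", "")


print(plan("refs/heads/master", "v1.4.0"))  # ('cut_tag', 'v1.4.1')
print(plan("refs/tags/v1.4.1", "v1.4.0"))   # ('deploy', 'v1.4.1')
```

Compared with the single check an "all to master" project needs, the extra hop through a tag is the "few extra steps" in question.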
Gitflow works very, very well when many interconnected code and infrastructure dependencies need to be managed, accounted for, and released in tandem. In projects with many moving parts, one does run the risk of ending up in a bit of dependency hell. Support matrices based off the semantic version numbers can help deal with this internally, but in general the main roadblock to continuous deployment is determining what needs to be released together without manual intervention.
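One way to picture a support matrix keyed on semantic versions is the small Python sketch below. The component names, version ranges, and the `unsupported` helper are all invented for the example; a real matrix would be maintained alongside the project's release documentation.

```python
from typing import Dict, List, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn '1.4.0' into (1, 4, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical matrix: for each application release, the supported range of
# each dependency as (minimum inclusive, maximum exclusive).
SUPPORT: Dict[str, Dict[str, Tuple[str, str]]] = {
    "2.3.0": {"api": ("1.4.0", "2.0.0"), "schema": ("7.0.0", "8.0.0")},
}


def unsupported(app_version: str, deployed: Dict[str, str]) -> List[str]:
    """List the dependencies whose deployed version falls outside the matrix."""
    problems = []
    for name, (low, high) in SUPPORT[app_version].items():
        current = deployed.get(name)
        if current is None or not parse(low) <= parse(current) < parse(high):
            problems.append(name)
    return problems


print(unsupported("2.3.0", {"api": "1.6.2", "schema": "7.1.0"}))  # []
print(unsupported("2.3.0", {"api": "2.0.0", "schema": "7.1.0"}))  # ['api']
```

A check like this tells you which pieces are out of step, but it does not decide what must ship together, which is the part that still tends to need a human.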
In such projects, it can be difficult or impossible to release only one piece of the puzzle, especially when changes involve modifying the underlying infrastructure, which for many projects is still a manually executed and generally "codeless" process. Tackling this deployment challenge is well outside the scope of this particular post, but have no fear: there is an ever-growing body of tools to help! Since dependency and configuration management is the main challenge, it's worth looking into tools like Docker, Chef, Puppet, and Ansible, all of which strive to make your infrastructure more portable and transparent. That is a worthy goal for any complex project with lots of moving parts under the hood.
Pros, Cons, and Project Types
It's important that the git workflow used on your project is formalized, and that it helps set the project up for future success. The "All to master" workflow brings a lot to the table in this regard, especially for projects like a web application or piece of software where all dependencies either live in the repo, can be pulled in automatically at build time (think package.json and npm install), or are already in place on a generally fixed infrastructure with a minimum of "moving parts", or where no infrastructure is required at all.
Visualizing Changes Over Time
"All to master" also makes it easy to see how the project is changing over time, which is valuable when it comes time to track down a bug or, worse yet, revert code. Gitflow takes advantage of the "low cost" of branching, but the side effect is that merges into master from the "develop" branch can become massive very quickly, making it difficult to see the individual feature sets contained within. Further, if a project uses git merges instead of rebases, even the history on the develop branch becomes a less than transparent visualization. This makes it very important that each release has its entire feature set documented clearly somewhere, preferably in the project's README file.
Reverting an individual feature once it has been released tends to be simpler in an "all to master" workflow, as the completion of each individual feature has its own specific commit hash within master and a release associated with it. As a result, going back and reverting an individual feature is still pretty straightforward, even on projects that have achieved full continuous deployment and are releasing several times per day.
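As a sketch of what that revert might look like when scripted, the helper below builds the `git revert` command for a single feature's commit on master. The function name `revert_command` and the commit hash are placeholders, and the sketch assumes that pull requests are merged with ordinary merge commits whose first parent is master.

```python
from typing import List


def revert_command(commit: str, is_merge: bool) -> List[str]:
    """Build the git command that reverts one feature's commit on master."""
    cmd = ["git", "revert", "--no-edit"]
    if is_merge:
        # A merge commit has two parents; "-m 1" tells git to treat the
        # first parent (master, under the assumption above) as the mainline.
        cmd += ["-m", "1"]
    cmd.append(commit)
    return cmd


# "3f2c1ab" is a made-up hash standing in for the feature's merge commit.
print(" ".join(revert_command("3f2c1ab", is_merge=True)))
# git revert --no-edit -m 1 3f2c1ab
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` inside a checkout of master. The revert is itself a new commit, so in a continuous deployment setup it goes out through the same pipeline as any other change.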
Within gitflow, post-release reverts are less common. While the set of commits for an individual feature can be tracked down and master cleaned up with git rebase, generally the gitflow process favors developing and deploying "hotfix/" branches to correct issues.
It's worth mentioning that neither workflow really prevents a team from releasing bugs or code that otherwise might need to be reverted. While it's true that the gitflow model keeps several features in a "holding pattern" by using the develop branch, this doesn't make bugs any easier to detect or squash even though it should in theory allow for more time to catch them before they go live.
The best of both worlds?
Some projects find themselves in a situation where they want tight control over what gets released and when for one reason or another. Sometimes in this situation it can seem that there's no point to setting up a workflow that plays nice with continuous integration. Nothing could be further from the truth.
Even if a project is using gitflow, continuous integration can be leveraged to provide a consistently up-to-date environment for the current "develop" branch, encompassing everything that is queued up to be released. This doesn't really address the "visualizing changes over time" problem discussed above, but depending on the project this may not be a priority, especially if the team excels at staying on top of documentation and release notes.
Gitflow, Feature and Release Branches
Pros
- High degree of control over what's released.
- No risk of unintended releases due to a renegade commit or merge.
- Best on projects with many layers of complexity and interdependent moving parts all needing to be released in tandem.
Cons
- Complex to roll back individual features once they have been released; the team is forced to quickly create and deploy a "hotfix".
- Large merges into master make it difficult to visualize how a project is changing over time using only the git history, so it becomes necessary to maintain additional documentation.
- Can feed into a perception that long manual testing cycles and/or release processes always result in fewer bugs, which is a fallacy.
All To Master
Pros
- Encourages extensive vetting of individual pull requests, especially when used in tandem with continuous deployment.
- Simplicity of individual feature rollback.
- When used with continuous integration and deployment, allows for the immediate release of newly developed and approved features by any member of the team without the need to touch any of the underlying infrastructure.
Cons
- Master is a living entity on a software project and can be changed at any time; release tags are immutable.
- Difficult to coordinate simultaneous release of dependent code or infrastructure changes.
- On larger teams, it's likely a kind of "repo owner bureaucracy" will be needed to ensure code quality standards and adherence to some representation of an overarching feature roadmap.
Picking what's right for you
Neither one is right. There is no right. There is only what works for a project at any given time. It's possible that the best workflow will change as your project evolves.
Deployment and development workflows are living things, which deserve to be constantly evaluated and improved upon. Using version control is an excellent first step in that regard. Any process is better than no process, so don't let the pain of choosing, learning, and executing a formalized, established workflow be an excuse for inaction.