One of the best parts about working at Bocoup is the freedom we have to explore ideas and open source projects. The diverse range of experience and interest we all bring to the table means there are always interesting open conversations taking place in the office, both about the implementation of specific ideas and about the broader concepts of architecting software. Listening to and participating in these conversations has been a great way for me to evaluate and reflect on my personal experience and methods for doing things.
Eric O’Connor and I recently had a conversation about version control usage patterns and how they can help or hinder a project’s ability to successfully release code, roll back problems, and achieve continuous integration.
Eric knows a lot about executing successful continuous integration, and has contributed to some awesome open source software projects like NGINX and Mozilla. Much of my experience, however, has been more focused on deploying full stack web applications of varying size, with backend infrastructure requirements spanning several different server environments. This has made me a believer in the “always be deploying” philosophy, which describes a culture where the particulars of the deployment process are part of everyday conversations right from the beginning of the project. Our git workflows and standard deployment processes differ in some interesting ways, even though we both value a healthy and well-thought-out deployment process.
This post will attempt to outline two relatively common git workflows, explore the benefits they can bring to various projects, and hopefully provide enough insight to aid the reader in selecting one!
“All to master” is a very common workflow for software projects and front-end web applications. Just like it sounds, this git workflow has all committers submitting pull requests from their checkouts to the upstream master.
For the purpose of our exploration, continuous deployment means a project’s ability to both integrate and deploy its latest code automatically to any or all of the following: production environment, canonical project download links, package management repositories.
Submitting all pull requests to master fits continuous integration and deployment like a glove. Webhooks make it easy to detect changes to master and trigger update events everywhere the code happens to be deployed. Webhooks aren’t the only way to cue this process, but they tend to be one of the simplest to execute and understand, so they’re what we’ll be talking about here.
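As a rough sketch of the webhook idea, the snippet below assumes a GitHub-style push payload in which the `ref` field names the updated branch and `deleted` flags a branch deletion. The `should_deploy` helper and the sample payload are hypothetical and only illustrate the decision a webhook receiver would make before kicking off a deployment.

```python
import json

# Assumption: deployments track the upstream master branch.
DEPLOY_REF = "refs/heads/master"


def should_deploy(payload: dict) -> bool:
    """Return True when a push payload describes new code landing on master."""
    # Ignore pushes that delete the branch rather than update it.
    if payload.get("deleted", False):
        return False
    return payload.get("ref") == DEPLOY_REF


# Hypothetical payload, trimmed to the fields used here.
sample = json.loads('{"ref": "refs/heads/master", "deleted": false}')
print(should_deploy(sample))
```

A real receiver would also verify the request’s signature before trusting the payload; that step is left out here to keep the example short.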
While it is certainly possible to use webhooks and continuous deployment with the gitflow branching model, it might require a few extra steps. Specifically, webhooks that detect changes to master would have to cause a build server to cut a tag, which would then trigger additional webhooks responsible for deploying that tag everywhere the code is deployed. The trouble is that the projects that benefit most from the structured gitflow process generally have other roadblocks to continuous deployment.
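The two-step chain described above could be sketched as a small routing function. This assumes the webhook payload exposes the pushed ref as a string such as `refs/heads/master` or `refs/tags/v1.2.0`; the `next_step` function and its action names are hypothetical placeholders, not part of gitflow or of any particular build server.

```python
def next_step(ref: str) -> str:
    """Map a pushed git ref to the action a build server might take."""
    if ref == "refs/heads/master":
        # First hop: a merge into master asks the build server to cut a tag.
        return "cut-tag"
    if ref.startswith("refs/tags/"):
        # Second hop: the new tag triggers the actual deployment.
        return "deploy-tag"
    # Pushes to develop, feature, and other branches do not release anything.
    return "ignore"


for ref in ("refs/heads/master", "refs/tags/v1.2.0", "refs/heads/develop"):
    print(ref, "->", next_step(ref))
```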
Gitflow works very, very well when many interconnected code and infrastructure dependencies need to be managed, accounted for, and released in tandem. In projects with many moving parts, one does run the risk of ending up in a bit of dependency hell. Support matrices based off the semantic version numbers can help deal with this internally, but in general the main roadblock to continuous deployment is determining what needs to be released together without manual intervention.
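To make the support matrix idea concrete, here is a minimal sketch that checks whether a set of component versions can be released together. The component names (`web`, `api`), the matrix contents, and the rule that only the major version matters are all assumptions made for illustration.

```python
from typing import Dict, Tuple


def major(version: str) -> int:
    """Extract the major number from a semantic version such as '2.1.0'."""
    return int(version.split(".")[0])


# Hypothetical matrix: (component, major version) -> required major versions
# of the components it depends on.
SUPPORT_MATRIX: Dict[Tuple[str, int], Dict[str, int]] = {
    ("web", 1): {"api": 2},
    ("web", 2): {"api": 3},
}


def compatible(release: Dict[str, str]) -> bool:
    """Return True if every component's requirements are met by the release."""
    for name, version in release.items():
        required = SUPPORT_MATRIX.get((name, major(version)), {})
        for dependency, dependency_major in required.items():
            if dependency not in release:
                return False
            if major(release[dependency]) != dependency_major:
                return False
    return True


print(compatible({"web": "2.1.0", "api": "3.4.2"}))
print(compatible({"web": "2.1.0", "api": "2.9.0"}))
```

A check like this can tell you that a proposed combination is unsafe, but someone still has to decide which pieces belong in a release in the first place.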
In such projects, it can be difficult or impossible to release only one piece of the puzzle, especially when changes involve modifying the underlying infrastructure, which for many projects is still a manually executed and generally “codeless” process. Tackling this deployment challenge is well outside the scope of this particular post, but have no fear: there is an ever-growing body of tools to help! Since dependency and configuration management is the main challenge, it’s worth looking into tools like Docker, Chef, Puppet, and Ansible, all of which strive to make your infrastructure more portable and transparent. That is a worthy goal for any complex project with lots of moving parts under the hood.
It’s important that the git workflow used on your project is formalized, and that it’s helping set the project up for future success. The “All to master” workflow brings a lot to the table in this regard, especially for projects like a web application or piece of software where all dependencies either live in the repo, can be pulled in automatically at build time (think package.json and npm install), or are otherwise already in place on a generally fixed infrastructure with a minimum of “moving parts”, or with no required infrastructure at all.
“All to master” also makes it easy to see how the project is changing over time, which is valuable when it comes time to track down a bug or, worse yet, revert code. Gitflow takes advantage of the “low cost” of branching, but the side effect is that merges into master from the “develop” branch can become massive very quickly, making it difficult to see the individual feature sets contained within. Further, if a project is using git merges instead of rebases, even the history on the develop branch results in a less than transparent visualization. This makes it very important that each release has the entirety of its feature set documented clearly somewhere, preferably in the project’s README file.
Reverting an individual feature once it has been released tends to be simpler in an “all to master” workflow, as the completion of each individual feature has its own commit hash within master and its own release associated with it. As a result, going back and reverting an individual feature is still pretty straightforward, even on projects that have achieved full continuous deployment and are releasing several times per day.
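As an illustration of how small that revert can be, the sketch below builds the `git revert` command for a single feature commit. The helper names and the commit hash are hypothetical; the `-m 1` option is only needed when the feature landed as a merge commit, and it assumes the first parent is the mainline.

```python
import subprocess
from typing import List


def revert_command(commit: str, is_merge: bool = False) -> List[str]:
    """Build the git command that reverts one feature commit on master."""
    command = ["git", "revert", "--no-edit"]
    if is_merge:
        # For a merge commit, tell git which parent is the mainline.
        command += ["-m", "1"]
    command.append(commit)
    return command


def revert(commit: str, is_merge: bool = False) -> None:
    """Run the revert in the current repository, raising if git fails."""
    subprocess.run(revert_command(commit, is_merge), check=True)


# Hypothetical commit hash, shown without touching a real repository.
print(" ".join(revert_command("3f2a9c1", is_merge=True)))
```

Because the revert is itself a new commit on master, the same webhooks that deploy features would deploy the rollback.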
Within gitflow, post-release reverts are less common. While the set of commits for an individual feature can be tracked down and master cleaned up with git rebase, generally the gitflow process favors developing and deploying “hotfix/” branches to correct issues.
It’s worth mentioning that neither workflow really prevents a team from releasing bugs or code that otherwise might need to be reverted. While it’s true that the gitflow model keeps several features in a “holding pattern” by using the develop branch, this doesn’t make bugs any easier to detect or squash even though it should in theory allow for more time to catch them before they go live.
Some projects find themselves in a situation where, for one reason or another, they want tight control over what gets released and when. In this situation it can sometimes seem that there’s no point in setting up a workflow that plays nicely with continuous integration. Nothing could be further from the truth.
Even if a project is using gitflow, continuous integration can be leveraged to provide a consistently up-to-date environment for the current “develop” branch, encompassing everything that is queued up to be released. This doesn’t really address the “visualizing changes over time” problem discussed above, but depending on the project this may not be a priority, especially if the team excels at staying on top of documentation and release notes.
Neither one is right. There is no right. There is only what works for a project at any given time. It’s possible that the best workflow will change as your project evolves.
Deployment and development workflows are living things that deserve to be constantly evaluated and improved upon. Using version control is an excellent first step in that regard, and while it’s certainly true that any process is better than no process, don’t let the pain of choosing, learning, and executing a formalized, established workflow be an excuse for inaction.