Real-world Git workflows
Illustrations from multiple sources
Git and teamwork
Git is simple to use for personal work.
The big difference when moving to real-world team-based development is the need to establish policies for who can maintain the code in the shared repository.
Principal approaches (although with many variations) are:
Collocated Contributor Repositories: (Fork and Pull)
Where team members have their changes accepted and pulled by project maintainers. Repos are on a centralized host.
COMP3297: Software Engineering 2
Single Repository, Shared Maintenance: (Shared Repo)
Where every team member is a maintainer and can make changes directly. Relies on policies to restrict access to the main project branch.
Shared Repository
We’ll describe the workflows mostly for a central shared repository model: the most common code sharing model for team-based development.
o Developers pull the latest code from a central shared repository to their own private repository
o They make their changes locally
o They share their changes by pushing them back to the shared repository
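The pull, change, push cycle above can be tried out locally. This is a minimal sketch, assuming a bare repository in a temporary directory stands in for the central host; the developer names, file names and identity settings are placeholders.

```shell
set -e
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)

# Stand-in for the central shared repository: a bare repo seeded with one commit.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/main
echo "v1" > "$work/seed/app.txt"
git -C "$work/seed" add app.txt
git -C "$work/seed" commit -q -m "Initial commit"
git clone -q --bare "$work/seed" "$work/central.git"

# Two developers each take a private copy of the shared repository.
git clone -q "$work/central.git" "$work/alice"
git clone -q "$work/central.git" "$work/bob"

# Alice makes a change locally and shares it by pushing.
cd "$work/alice"
echo "alice's change" >> app.txt
git commit -q -am "Alice: update app"
git push -q origin main

# Bob pulls the latest code before starting his own work.
cd "$work/bob"
git pull -q --ff-only origin main
git log --oneline
```

On a real project the clone URL would point at the team's hosting platform rather than a local path.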
Note: The focus on a shared repository model is not a limitation. The basic strategies can be used with different models.
For example, branching workflows work just as well with fully distributed models where each developer maintains both a private and a public repository.
Here developers share their changes by making their branches available in their public repo and pull changes from other developers’ public repos to keep current.
Releasing and Workflow
Some workflows are better for certain product release strategies than others. We need to dig into those strategies a little to select an appropriate workflow. There are 2 main steps:
Staging: This is getting the application running in a target environment.
Release: This is pushing the application into production where it is available to users.
A popular approach in both traditional and agile development is to schedule releases. Application development is broken into stages – releases. Each represents a period of time during which the team works on some limited scope which is then released and delivered to users.
Most agile teams integrate continuously while working towards a release.
Many take this further and deploy continuously rather than scheduling releases.
Continuous Integration (CI)
Developers don’t work on their features in isolation and then submit them for integration at the end of the sprint. Rather, everyone merges their code changes into the central repository many times during a sprint. Often even many times a day.
Integration testing tries to find conflicts in the interfaces and interactions between new, unit tested code and existing code. The idea is if developers do this often, “merge hell” is avoided and any conflicts that do arise should be easier to resolve.
The CI server detects the commit, builds, and then runs tests automatically. This happens several times a day.
If the build breaks, the developer fixes it immediately. This is the highest priority task.
CI helps keep the code in a good state. But there are several more steps needed to get code into production. Those can be automated such that code changes are not just automatically built, but are automatically deployed to a test environment. This extends Continuous Integration to Continuous Delivery. The latest build at this point is always ready to release.
The final decision to deploy – to actually release – to the live production environment is made by some authority.
If we remove this final step and deploy builds as soon as they are past the QA tests, we have Continuous Deployment.
Source Control: Commit unit-tested changes (pre-submit)
Build: Integration test
Staging: Deploy to test environment. System-level tests: functional and non-functional tests (soak and load tests, acceptance tests)
Production: Deploy to production environment
Note: Terminology for types of testing varies.
Types of Workflow
Two main approaches, with many hybrids:
o Trunk-based Development
Here there is only one long-lived branch – the trunk (or master, or main in Git). Developers commit directly to main for small-scale projects, or work in short-lived feature branches for larger-scale projects.
The goal is simplicity and to avoid the difficulties of merging code that has diverged in separate long-lived branches. Since the team is committing continuously to main, it is working at HEAD and the approach is a good match for CI, and Continuous Deployment and Delivery.
o Branching Workflows
All work is done in branches and changes to main are strictly controlled.
Simple Trunk-based Development
This is the simplest model that can be used for collaborative work. There is only a single main branch in the central repo. No branches.
All developers have open access to it and push their changes to it.
A typical developer’s workflow would be to clone the central repo, then:
o Make and commit changes locally and test
o Push to the central repo
Can such a simple strategy work? Yes, under certain conditions.
Advantages:
Fast with minimal bureaucracy
Good for small teams of experienced developers who can collaborate well to avoid merge/rebase conflicts
Can be good for early stage development
Disadvantages:
No opportunity to peer review individual changes before they are pushed to the trunk
Resulting conflicts can compromise scaling
Can’t trust that trunk is always ready to deploy. It may contain broken code
Dangerous if there are inexperienced developers in the team
Bad where strict control over modifications is required, such as for a valuable, well established product
Code Reviews
Code review – having other developers read and comment on your code changes – is a very effective way to keep code quality high. Ideally it is done in combination with testing.
In all workflows we want the main branch to stay “pure” and contain only finished work that has been approved.
To add peer review, when you want to merge a change into the project, you need to tell other developers about it so that it can be reviewed, discussed, modified, and approved (if your process will require approval).
This requires some way of isolating the change so that it can be reviewed. In Git, this is done by developing in branches off the trunk.
Then the code review and approval process is initiated by making a pull request.
Pull Requests
A pull request tells other developers you’ve completed a feature and have pushed to a branch.
You are asking for your changes to be reviewed and approved – usually by a maintainer.
Ultimately, it is a request by you to have your changes merged into the trunk.
It is such an effective mechanism for quality control that the major Git hosting platforms provide special support for managing pull requests.
(Pull requests on GitHub and Bitbucket; Merge requests on GitLab)
After a pull request is created, it acts as a forum to discuss the change and track the discussion.
Other developers review the change, discuss it, propose changes, and can make and commit modifications.
It is common to set process rules to ensure a change has been sufficiently reviewed and approved before it is added to the trunk.
Often, only a particular developer, such as a project maintainer, has permissions to merge the change and close the pull request.
Branching workflows
Here development of any feature or bug-fix must take place in a dedicated branch rather than in the trunk. This solves many of the disadvantages of simple trunk-based development:
Individual changes can easily be reviewed via pull requests
Developers can collaborate to work on a feature without impacting the trunk
The trunk is much less likely to receive broken code.
There are many branching workflows. They differ in what branches are required to manage a project’s lifecycle, and how long those branches live.
Prefer workflows with short-lived branches to avoid merge problems.
Modifying the simple trunk-based strategy such that all work is carried out in short-lived branches can provide many of its benefits without significant disadvantages. But some new issues arise:
• the workflow becomes more complex
• there is a potential bottleneck introduced by the pull request process
Simple branching workflow
To develop a new feature:
o Get the latest state of main into your local repo
o Create a new local branch for the feature
o Develop and commit to the feature branch. Keep it short-lived.
o When finished and tested, push the feature branch to the central repo
o Create a pull request
o Make and commit changes that come out of the pull request discussions
o Feature is merged into main
This still has some potential for bad feature code in main, although it is much less likely. But if deploying directly and continuously from main, it’s good to have the ability to turn off a new feature with Feature Flags – these act like toggles.
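The feature-development steps of the simple branching workflow can be sketched with plain Git commands. This is an illustration under assumptions: a local bare repository stands in for the central repo, the branch name feature/greeting is hypothetical, and a local merge plus push stands in for the pull request, which in practice is opened, reviewed and merged on the hosting platform.

```shell
set -e
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)

# Stand-in central repo (bare), seeded with one commit on main.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/main
echo "v1" > "$work/seed/app.txt"
git -C "$work/seed" add app.txt
git -C "$work/seed" commit -q -m "Initial commit"
git clone -q --bare "$work/seed" "$work/central.git"
git clone -q "$work/central.git" "$work/dev"
cd "$work/dev"

# Get the latest state of main into the local repo.
git checkout -q main
git pull -q --ff-only origin main

# Create a new local branch for the feature.
git checkout -q -b feature/greeting

# Develop and commit to the feature branch.
echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

# Push the feature branch to the central repo, ready for a pull request.
git push -q -u origin feature/greeting

# Stand-in for the reviewed pull request being merged into main.
git checkout -q main
git merge -q --no-ff -m "Merge feature/greeting" feature/greeting
git push -q origin main

# Clean up the short-lived branch, locally and on the central repo.
git branch -q -d feature/greeting
git push -q origin --delete feature/greeting
```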
Extending simple Trunk-based Development
Branch directly off the trunk and come back as a pull request.
Short-lived branches (sometimes called topic branches) really must be short-lived or else we are heading back to merge hell.
Branches are not shared in a team for communal development. Preferably no more than one person works on a branch. Only shared for review.
Trunk-based Development: Release Branches
It’s common to tag commits that represent releases to identify the release by number (say v2.01).
In Git, a tag is like a branch pointer that doesn’t change. It marks a commit as an important point in the repo’s history.
That can be all that is needed to mark a release point if the team is releasing continuously.
If the time between releases is longer, then it is likely that bug fixes will be needed before the next release. In that case a release branch is created for each release.
Development continues on the trunk and the release branch stays stable apart from bug fixes until it is deleted after it is no longer in production.
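As a rough sketch of tagging a release and branching from it: the tag name v2.01 comes from the example above, while the branch name release/v2.01, the file name and the identity settings are assumptions made for illustration.

```shell
set -e
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main

echo "release code" > app.txt
git add app.txt
git commit -q -m "Finish work for release"

# Mark the release point with an annotated tag.
git tag -a v2.01 -m "Release v2.01"

# If bug fixes will be needed before the next release, branch from the tag.
git branch release/v2.01 v2.01

# Development continues on the trunk; the tag still marks the release commit.
echo "next feature" >> app.txt
git commit -q -am "Start next feature"
git log --oneline --decorate --all
```

The tag never moves, whereas the release branch can advance as fixes are applied to it.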
Fixing bugs in Releases
The rule is to fix it in the trunk first. Reproduce the problem on the trunk, fix it and then cherry-pick that commit into the release branch.
This is to avoid the possibility of fixing something in a release branch and forgetting to apply the fix to the trunk as well.
Cherry-picking:
“Given one or more existing commits, apply the change each one introduces, recording a new commit for each.”
Usually used to take the changes made in a single commit on some branch, and apply them on another branch.
demo6$ git checkout main
demo6$ git cherry-pick feature^
Or use the commit’s SHA-1 hash.
But in our case we want to cherry-pick from main into release.
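For the release-branch case, a sketch of "fix on the trunk first, then cherry-pick" might look like this. The branch name release/1.0, the file contents and the identity settings are hypothetical.

```shell
set -e
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main

printf 'line one\nline two\n' > app.txt
git add app.txt
git commit -q -m "Release 1.0"
git branch release/1.0            # hypothetical release branch name

# Development continues on the trunk.
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Add new feature"

# Fix the bug on the trunk first.
printf 'line one (fixed)\nline two\n' > app.txt
git commit -q -am "Fix bug in line one"
fix=$(git rev-parse HEAD)         # the fix commit's hash

# Then cherry-pick that single commit into the release branch.
git checkout -q release/1.0
git cherry-pick "$fix"
git log --oneline
```

Only the fix reaches the release branch. The unrelated feature commit stays on main, and the cherry-picked commit gets a new hash of its own.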
[Figure: commit graph before and after the cherry-pick. Before: feature holds com2, com3 and com4; main holds com0, com1 and com5, with HEAD at main. After: main also holds com3*, a copy of com3 from feature.]
More complex Branching Workflows
For companies that wanted much stricter control over development, and clean branches named in a systematic way, GitFlow became very popular.
It has two principal, never-ending branches – one for production code and the other for development.
Features are developed off the development branch and pull requests are raised to merge back into it when finished. This is strictly controlled.
Releases are prepared in dedicated branches off the development branch and are eventually merged into production and also back into development.
Works well for teams with a lot of junior developers, but the tight management can frustrate senior developers.
Not so agile as trunk-based development and too many bottlenecks for startups. One strategy is to relax the rules in early stages of development.
GitFlow
Good for products with scheduled releases. Not good for continuous deployment.
Master: Production code and official release history. Kept pure.
Develop: Pre-production code where features are integrated.
Feature: Where new features are developed. Branches from Develop.
Release: Release preparation. No new features can be added. Is deleted after merging into Master and Develop.
Hotfix: Fixes for production code. Merges into Master and Develop.
Tags: mark releases (label in the original figure).
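One pass through the GitFlow branch roles can be sketched in plain Git. The branch names feature/login and release/1.1, the tag v1.1 and the file names are hypothetical, and real teams usually drive the merges through pull requests rather than local merges.

```shell
set -e
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master

echo "1.0" > version.txt
git add version.txt
git commit -q -m "Initial production code"

# The two never-ending branches: master (production) and develop.
git branch develop

# Feature: branches from develop and merges back into it.
git checkout -q -b feature/login develop
echo "login" > login.txt
git add login.txt
git commit -q -m "Add login"
git checkout -q develop
git merge -q --no-ff -m "Merge feature/login into develop" feature/login
git branch -q -d feature/login

# Release: branches from develop; only release preparation happens here.
git checkout -q -b release/1.1 develop
echo "1.1" > version.txt
git commit -q -am "Bump version to 1.1"

# Finish the release: merge into master, tag, merge back into develop, delete.
git checkout -q master
git merge -q --no-ff -m "Release 1.1" release/1.1
git tag -a v1.1 -m "Release 1.1"
git checkout -q develop
git merge -q --no-ff -m "Merge release/1.1 back into develop" release/1.1
git branch -q -d release/1.1
git log --oneline --graph --all
```

A hotfix would follow the same finishing steps as the release branch, but would start from master instead of develop.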
History: Linear or Non-linear.
What sort of history do you want?
Engineers/companies have strong preferences about whether to keep a full history of every commit, making all branch merges easily visible in the history, or whether to change the history to simplify it.
We need to understand these issues to understand arguments about the risks/benefits of various workflow implementations.
Example: Say we perform our new development work in a feature branch off main. We commit often, without each commit necessarily being a logical unit of work (could be a “go to lunch commit”). Likely some commits even break the code since they are works in progress.
When we are ready to merge the finished work back into main, it turns out that Git can do a fast-forward merge. We have a clean linear history, but it makes it harder to see the series of commits that together represent the history of the feature. We also end up with a series of non-meaningful commits in main.
We can force Git to create a merge commit (--no-ff) whenever we merge into main, but a downside is that if there are many feature branches, history can become very complex.
History: Linear or Non-linear. Techniques
From the fast-forward example:
We can force Git to create a merge commit by using the --no-ff option:
demo4$ git merge --no-ff feature
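A small reproduction of the --no-ff case, using the commit names from the figure (com0 to com3). The file name and identity settings are placeholders.

```shell
set -e
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main

# Helper: append a line to notes.txt and commit it under the given message.
commit() { echo "$1" >> notes.txt; git add notes.txt; git commit -q -m "$1"; }
commit com0
commit com1

# Work on a feature branch while main stays at com1.
git checkout -q -b feature
commit com2
commit com3

# A plain `git merge feature` could fast-forward here, because main has not moved.
# --no-ff records a merge commit instead, so the feature stays visible as a unit.
git checkout -q main
git merge -q --no-ff -m "Merge branch 'feature'" feature
git log --oneline --graph
```

The graph output shows the feature commits on their own line of history, joined to main by the merge commit.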
[Figure: history before and after the merge. Before: com0 and com1 on main (HEAD), com2 and com3 on feature. After: a new merge commit, mcom, on main (HEAD) joins main and feature.]
A more readable history? But it can become unreadable.
History: Linear or Non-linear. Techniques
We can do the opposite of the previous slide: prevent three-way merges by rebasing. Rebasing takes the changes committed on a branch and replays them on another to give a linear history.
(Take care: Dangerous/annoying if the branch has already been pushed to a remote repo.)
demo5$ git rebase main
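A sketch of rebase followed by a fast-forward merge, again using the commit names from the figure. Each commit writes its own file so that the replay cannot conflict; file names and identity settings are placeholders.

```shell
set -e
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main

# Helper: create a file named after the commit and commit it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }
commit com0
commit com1
git checkout -q -b feature
commit com2
commit com3
git checkout -q main
commit com4    # main has moved on, so a plain merge would be three-way

# Replay the feature commits on top of main (new commits: com2*, com3*).
git checkout -q feature
git rebase -q main

# Finish by fast-forward merging: the history is now a single line.
git checkout -q main
git merge -q --ff-only feature
git log --oneline --graph
```

The replayed commits have new hashes, which is why rebasing a branch that others have already pulled causes trouble.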
[Figure: history before and after the rebase. Before: main holds com0, com1 and com4; feature (HEAD) holds com2 and com3. After: com2* and com3* are replayed on top of com4 on feature (HEAD).]
Finish by fast-forward merging feature into main.
History: Linear or Non-linear. Techniques
If we have a series of commits that together represent a single coherent change, we can combine them into a single commit.
Safe to rewrite history like this if we haven’t yet shared the commits.
Use an interactive rebase to squash the commits into a single commit.
demo6$ git rebase -i HEAD~2
This will open an editor where we can specify which commit should be squashed into the other.
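To show the effect without an interactive editor, the sketch below points Git's GIT_SEQUENCE_EDITOR variable at a tiny stand-in script that edits the todo list automatically. The script name, commit messages and identity settings are hypothetical, and `fixup` is used in place of `squash` so that no commit-message editor opens (it squashes the commit but discards its message).

```shell
set -e
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main

# Helper: append a line to notes.txt and commit it under the given message.
commit() { echo "$1" >> notes.txt; git add notes.txt; git commit -q -m "$1"; }
commit "Add parser"
commit "Parser: handle empty input"
commit "wip: go to lunch"

# `git rebase -i HEAD~2` normally opens an editor on a todo list like:
#   pick <hash> Parser: handle empty input
#   pick <hash> wip: go to lunch
# The stand-in "editor" below changes the second line from pick to fixup.
cat > "$work/todo-editor.sh" <<'EOF'
sed '2s/^pick/fixup/' "$1" > "$1.new" && mv "$1.new" "$1"
EOF
GIT_SEQUENCE_EDITOR="sh '$work/todo-editor.sh'" git rebase -q -i HEAD~2
git log --oneline
```

In day-to-day use you would simply run `git rebase -i HEAD~2` and change `pick` to `squash` or `fixup` by hand.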
OneFlow: a simplified GitFlow
There is only one “eternal” branch in the repo: master.
It contains the project history. Unlike GitFlow, it is readable!
Like GitFlow, every production release is based on the previous release. Also not good for continuous delivery or deployment.
Feature branches mostly exist only in a developer’s local repo. They are pushed to the central repo only if multiple developers need to work on them, or as backups. After they are integrated into master, they are deleted from the central repo. master is the one branch that persists there.
Integrating a feature
rebase -i: clean history
merge --no-ff: easy to revert
rebase -i + merge --no-ff: clean history, easy to revert
(In the figure, two commits are squashed into one.)
OneFlow
Release branches branch from some commit in master. The tip is tagged with the version number. The branch is merged into master for permanent versioning and then deleted.
Hotfix branches are handled similarly to release branches.
GitHub Flow
[Figure: GitHub Flow. A branch leaves main and returns to it through the steps: branch, commit changes, open pull request, review, deploy, merge.]
More suitable for continuous deployment than GitFlow and OneFlow since it is not designed around releases.
Like many of the branching flows, main must be stable and should never have anything pushed to it that has not been tested or has potential to break the build.
Thus anything in main will be deployable. At GitHub, every branch that is pushed has tests run on it automatically.
The workflow is similar to the simple branching workflow from earlier, except that there is an option to deploy to production to verify changes before merging into main. (This can be a bottleneck – waiting in the deployment queue.)
If it breaks, just roll back by deploying from current main. The deployment is then abandoned, and the bad code never reaches main.
[Figure: the GitHub Flow diagram again (branch, commit changes, open pull request, review, deploy, merge), with the deploy step marked as a potential deployment queue bottleneck.]
In everyday use at github.com, all changes are deployed to production first.
To do this in a disciplined way, main must be locked until the change can be merged.
Only one pull request can be deployed at a time.
o Request to deploy branch is added to deployment queue.
o main is merged into the branch for deployment
o Build and run tests on branch
o Deploy, maybe to canary servers first – staged deployment
o Deploy to production
o If all OK, merge to main
o Unlock main
Bottleneck means you can’t deploy every change continuously at large scales.
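The Git side of that deploy-before-merge sequence can be sketched as follows. The functions `run_tests` and `deploy` are hypothetical stand-ins for a team's real test runner and deployment tooling, and locking main is a matter for the hosting platform or deployment queue, so it appears only as a comment.

```shell
set -e
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main

echo "app" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"
git checkout -q main
echo "other" > other.txt
git add other.txt
git commit -q -m "Other work merged meanwhile"

# Hypothetical stand-ins for real test and deployment tooling.
run_tests() { test -f app.txt && test -f feature.txt; }
deploy()    { echo "deploying $(git rev-parse --short HEAD) to $1"; }

# (main is locked by the deployment queue at this point; not shown.)
# Merge main into the branch so the deployed code includes everything in main.
git checkout -q feature
git merge -q -m "Merge main into feature" main

# Build and run tests on the branch.
run_tests

# Staged deployment: canary servers first, then production.
deploy canary
deploy production

# If all is OK, merge to main. Had it failed, current main would be redeployed.
git checkout -q main
git merge -q --ff-only feature
```

Because main was merged into the branch first and then locked, the final merge into main is a fast-forward: main ends up at exactly the commit that was deployed.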
Release Flow (Microsoft, 2018)
Release by sprint to avoid the scaling issue in GitHub Flow continuous deployment.
Controlled by pull request.
[Figure: main with a release branch for Sprint 129 and another for Sprint 130.]
Why not just deploy from main? As before, defects may need to be fixed, but development has continued on main.
Hotfixes are merged into main, then cherry-picked into the release branch and deployed.
Developers are discouraged from making long-lived branches for features. Instead, feature flags keep features away from some or all users until the features are ready for them to use.
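A feature flag can be as simple as a runtime toggle around the new code path. This is a minimal sketch, assuming an environment variable acts as the flag; the flag name FEATURE_NEW_CHECKOUT and the function are hypothetical, and real systems usually read flags from a configuration service so they can be changed per user or per ring without redeploying.

```shell
# FEATURE_NEW_CHECKOUT is a hypothetical flag name; it is off unless set to "on".
checkout() {
  if [ "${FEATURE_NEW_CHECKOUT:-off}" = "on" ]; then
    echo "new checkout flow"    # merged into main, but hidden by default
  else
    echo "old checkout flow"
  fi
}

checkout                  # flag unset: existing behaviour
FEATURE_NEW_CHECKOUT=on
checkout                  # flag on: new code path
```

The unfinished feature can live on main behind the flag, so there is no need to keep it on a long-lived branch.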
Ring-based deployment to limit risk can mean Release M129 needs to be kept, while M130 is deployed to the “fast ring” servers.
But eventually, it is replaced and M129 can be deleted.