- Git:Mastering Version Control
- Ferdinando Santacroce, Aske Olsson, Rasmus Voss, Jakub Narębski
- 2021-07-08 10:46:58
Chapter 5. Obtaining the Most – Good Commits and Workflows
Now that we are familiar with Git and versioning systems, it's time to look at the whole thing from a much higher perspective to become aware of common patterns and procedures.
In this chapter, we will walk through some of the most common ways to organize and build meaningful commits and repositories. We will obtain not only a well-organized code stack, but also a meaningful source of information.
The art of committing
While working with Git, committing seems the easiest part of the job: you add files, write a short comment, and then, you're done. However, it is because of its simplicity that often, especially at the very beginning of your experience, you acquire the bad habit of doing terrible commits: too late, too big, too short, or simply equipped with bad messages.
Now, we will take some time to identify possible issues, drawing attention to tips and hints to get rid of these bad habits.
Building the right commit
One of the harder skills to acquire in programming is splitting the work into small and meaningful tasks.
Too often, I have experienced this scenario. You start to fix a small issue in a file. Then, you see another piece of code that can be easily improved, even if it is not related to what you are working on now. You can't resist it, and you fix it. In the end, after only a short time, you find yourself with tons of files and unrelated changes to commit.
At this point, things get worse, because usually, programmers are lazy people. So, they don't write all the important things to describe changes in the commit message. In commit messages, you start to write sentences such as "Some fixes to this and that", "Removed old stuff", "Tweaks", and so on, without anything that helps other programmers understand what you have done.

Courtesy of http://xkcd.com/1296/
In the end, you realize that your repository is only a dump where you empty your index now and then. I have seen some people committing only at the end of the day (and not every day) to keep a backup of the data or because someone else needed the changes reflected on their computer.
Another side effect is that the resulting repository history becomes useless for anything other than retrieving content at a given point in time.
The following tips can help you turn your Version Control System (VCS) from a backup system into a valuable tool for communication and documentation.
Make only one change per commit
After the routine morning coffee, you open your editor and start to work on a bug, BUG42. While fixing the bug in the code, you realize that fixing BUG79 will require tweaking just a single line of code. So, you fix it. However, you not only change that awful class name, but also add a good-looking label to the form and make a few more changes. The damage is done.
How can you wrap up all that work in a meaningful commit now? Maybe, in the meantime, you went home for lunch, talked to your boss about another project, and you can't even remember all the little things you did.
In this scenario, there is only one way to limit the damage: split the changed files among more than one commit. Sometimes, this helps to reduce the pain, but it is only a palliative. Very often, you modify the same file for different reasons, so doing this is quite difficult, if not impossible. The last hope is to use the git add -p command, which lets you stage only some of the modifications in a file, so that you can group them into different commits by topic.
The only way to definitely solve this problem is to only make one change per commit. It seems easy, I know, but it is quite difficult to acquire this ability. There are no tools for this. No one, but you, can help. It only needs discipline, the most lacking virtue in creative people such as programmers.
There are some tips to pursue this aim; let's have a look at them together.
Split up features and tasks
As said earlier, breaking up the things to do is a fine art. If you know and adopt some Agile movement techniques, you will have probably faced these problems. So, you have an advantage; otherwise, you will need some more effort, but it is not something that you can't achieve.
Consider that you have been assigned to add the Remember me check in the login page of a web application, like the one shown here:

This feature is quite small, but implies changes at different levels. To accomplish this, you'll have to:
- Modify the UI to add the check control
- Pass the "is checked" information through different layers
- Store this information somewhere
- Retrieve this information when needed
- Invalidate (set it to false) following some kind of policy (after 15 days, after 10 logins, and so on)
Do you think you can do all these things in one shot? Yes? You are wrong! Even if you estimate a couple of hours for an ordinary task, remember that Murphy's law is in ambush. You will receive four calls, your boss will look for you for three different meetings, and your computer will go up in flames.
This is one of the first things to learn: break up every piece of work into small tasks. It does not matter whether you use timeboxing techniques such as the Pomodoro Technique; small things are always easier to handle. I'm not talking about splitting hairs, but try to organize your tasks into things you can do in a defined amount of time, hopefully a few half hours, not days.
So, take a pen and paper and write down all the tasks, as we did earlier with the login page example. Do you think you can do all those things in a small amount of time? Maybe yes, maybe not: some tasks are bigger than others. That's OK; this is not a scientific method. It's a matter of experience. Can you split a task and create two other meaningful tasks? Do it.
Are you unable to do it? No problem; don't try to split tasks if they lose meaning.

Write commit messages before starting to code
Now, you have a list of tasks to do; pick the first and… start to code? No! Take another piece of paper and describe every task's step with a sentence. Magically, you will realize that every sentence can be the message of a single commit, where you describe the features you deleted, added, or changed in your software.
This kind of prior preparation helps you define the modifications to implement (letting a better software design emerge). It also keeps you focused on what is important and lowers the stress of thinking about the versioning part of the work during the coding session. While you are facing a programming problem, your brain floods with little implementation details related to the code you are working on. So, the fewer the distractions, the better.
This is one of the best versioning-related hints I ever received. If you have just a quarter of an hour to spare, I recommend that you read the Preemptive commit comments blog post (https://arialdomartini.wordpress.com/2012/09/03/pre-emptive-commit-comments/) by Arialdo Martini. This is where I learnt this trick.
Include the whole change in one commit
Making more than one change per commit is a bad thing. However, splitting a single change into more than one commit is also considered harmful. As you may know, in some well-trained teams, you do not simply push your code to production. Before that, you have to pass some code quality reviews, where someone else tries to understand what you did to decide whether your code is good or not (that is why pull requests exist, after all). You could be the best developer in the world. However, if the person at the other end can't make sense of your commits, your work will probably be rejected.
To avoid these unpleasant situations, you have to follow a simple rule: don't do partial commits. If time's up, if you have to go to that damn meeting (programmers hate meetings) or whatever, remember that you can save your work at any moment without committing, using the git stash command. If you want to close the commit because you want to push it to the remote branch for backup purposes, remember that Git is not a backup tool. Back up your stash on another disk, put it in the cloud, or simply end your work before leaving, but don't do commits like they are episodes of a TV series.
One more time, Git is a software tool like any other and it can fail. Don't think that just because you are using Git or other versioning systems, you don't need backup strategies. Back up local and remote repositories just like you back up all the other important things.
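As a reminder of how git stash parks unfinished work, here is a small sketch in a throwaway repository. The file name and messages are invented for the example.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
work=$(mktemp -d) && cd "$work"
git init -q demo && cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "login form" > login.txt
git add login.txt
git commit -q -m "Add the login form"

# Half-finished work that is not ready to become a commit.
echo "remember-me checkbox (unfinished)" >> login.txt

# Shelve it: the working tree goes back to the last commit.
git stash push -q -m "WIP: remember-me checkbox"
git status --short

# ...the meeting happens...

# Bring the work back and carry on where you left off.
git stash pop -q
git status --short
```

The first git status prints nothing because the tree is clean; after the pop, login.txt shows up as modified again and no partial commit has entered the history.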
Describe the change, not what you have done
Too often, I read (and often I wrote) commit messages such as "Removed this", "Changed that", "Added that one", and so on.
Imagine that you are going to work on the common "lost password" feature on your website. Probably, you will find a message like this adequate: "Added the lost password retrieval link to the login page". This kind of commit message does not describe what modifications the feature brings to you, but what you did (and not everything). Try to answer sincerely. If you are reading a repository history, do you want to read what every developer did? Or is it better to read the feature implemented in every single commit?
Try to make the effort, and start writing sentences where the change itself is the subject, not what you did to implement it. Use the imperative present tense (for example, fix, add, implement), describing the change in a small subject sentence, and then, add some details (when needed) in other lines of text. "Implement the password-retrieval mechanism" is a good commit message subject. If you find it useful, then you can add some other information to get a well-formed message like this:
Implement the password retrieval mechanism

- Add the "Lost password?" link into the login page
- Send an email to the user with a link to renew the password
Have you ever written a changelog for a piece of software by hand? I did; it's one of the most boring things to do. If you don't like writing changelogs, like me, think of the repository history as your changelog. If you take care of your commit messages, you will get a beautiful changelog for free!
In the next section, I will group some other useful hints about good commit messages.
Don't be afraid to commit
Fear is one of the most powerful emotions. It can drive a person to do the craziest things on Earth. One of the most common reactions to fear is breakdown. You don't know what to do, so you end up doing nothing.
This is a common reaction even when you begin to use a new tool such as Git, where gaining confidence can be difficult. For fear of making a mistake, you don't commit until you are obliged to. This is the real mistake: being scared. In Git, you don't have to be scared. Maybe the solution is not obvious; maybe you have to dig around the Internet to find the right way. However, you can almost always get away with small or no consequences (well, unless you are a heavy user of the --hard option).
On the contrary, you have to make the effort to commit often, as soon as possible. The more frequently you commit, the smaller your commits are; the smaller your commits are, the easier it is to read and understand the changelog. It is also easier to cherry-pick commits and do code reviews. To help myself get used to committing this way, I followed this simple trick: write the commit message in Visual Studio before starting to write any code.

Try to do the same in your IDE or directly in the Bash shell; it helps a lot.
Isolate meaningless commits
The golden rule is to avoid meaningless commits. However, sometimes, you need to commit something that is not a real implementation, but only a cleanup, such as deleting old comments, formatting rearrangement, and so on.
In these cases, it is better to isolate this kind of code change in separate commits. By doing this, you prevent another team member from running towards you with a knife in his hand, frothing at the mouth. Don't mix meaningless changes up with real ones in the same commit. Otherwise, other developers (and you, after a couple of weeks) will not understand them while diffing.
The perfect commit message
Let me be honest; the perfect message does not exist. If you work alone, you will probably find the best way for you. However, when in a team, there are different minds and different sensibilities, so what is good for me may not be as good for another.
In this case, you have to sit around a table and discuss. You should try to end up with a shared standard that probably would not be the one you prefer, but at least is a way to start a common path.
Rules for a good commit message really depend on the way you and your team work every day, but some common hints can be applied by everyone. They are described in the following sections.
Writing a meaningful subject
The subject of a commit is the most important part; its role is to make clear what the commit contains. Avoid technical details or other things that an ordinary developer can understand by opening the code. Focus on the big picture. Remember that every commit is a sentence in the repository history. So, wear the hat of the changelog reader and try to write the most convenient sentence for them, not for you. Use the present tense, and write a sentence with a maximum of 50 characters.
A good subject is one like this, "Add the newsletter signup in homepage".
As you can see, I used the imperative present tense. More importantly, I didn't say what I have done, but what the feature does: it adds a newsletter signup box to my website.
The 50 char rule is due to the way you use Git from the shell or GUI tools. If you start to write long sentences, reviewing logs and so on can become a nightmare. So, don't try to be the Stephen King of commit messages. Avoid adjectives and go straight to the point. You can then write additional details lines.
Another thing to remember is to start with capital letters. Do not end sentences with periods; they are useless and even dangerous.
Adding bulleted details lines, when needed
Often, you can't say all that you want in 50 chars. In this case, use details lines. In this situation, the common rule is to leave a blank line after the subject, start each detail line with a dash, and go no longer than 72 chars per line:
Add the newsletter signup in homepage

- Add textbox and button on homepage
- Implement email address validation
- Save email in database
In these lines, add a few details, but not too many. Try to describe the original problem (if you fixed it) or the original need, why these functionalities have been implemented (what problem they solve), and any possible limitations or issues.
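If you want to check a message against the 50 and 72 character hints, a rough sketch like the following can help. It assumes a throwaway repository and a made-up file name; the %s and %b placeholders of git log give the subject and the body of a commit.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
work=$(mktemp -d) && cd "$work"
git init -q demo && cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "newsletter signup" > home.txt
git add home.txt

# Subject, blank line, then dash-prefixed details lines.
git commit -q -F - <<'EOF'
Add the newsletter signup in homepage

- Add textbox and button on homepage
- Implement email address validation
- Save email in database
EOF

# %s is the subject line, %b is the body.
subject=$(git log -1 --format=%s)
longest_body_line=$(git log -1 --format=%b |
  awk '{ if (length($0) > m) m = length($0) } END { print m + 0 }')

echo "subject: ${#subject} chars (hint: 50 at most)"
echo "longest details line: $longest_body_line chars (hint: 72 at most)"
```

This is only a measuring stick, not an enforcement mechanism; whether to turn it into a hook is a decision for you and your team.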
Tie other useful information
If you use an issue or project-tracking system, write down the issue number, bug ID, or anything else that helps:
Add the newsletter signup in homepage

- Add textbox and button on homepage
- Implement email address validation
- Save email in database

#FEAT-123: closed
Special messages for releases
Another useful thing is to write special format commit messages for releases so that it will be easier to find them. I usually decorate subjects with some special characters, but nothing more. To highlight a particular commit, such as a release one, there is the git tag command, remember?
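As a quick refresher, this is roughly how a release commit can be marked with an annotated tag. The repository, file, and version number are invented for the example.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
work=$(mktemp -d) && cd "$work"
git init -q demo && cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "app" > app.txt
git add app.txt
git commit -q -m "Implement the password retrieval mechanism"

# An annotated tag carries its own message, author, and date.
git tag -a v1.0.0 -m "Release 1.0.0"

# Releases are then easy to list and to look up.
git tag --list 'v*'
git describe --tags
```

With tags in place, the commit subject can stay plain, and finding a release is a matter of listing tags rather than scanning the log for decorated messages.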
Conclusions
At the end, my suggestion is to try to compose your personal commit message standard by following previous hints, looking at message strategies adopted by great projects and teams around the Web, but especially by doing it. Your standard will change for sure as you evolve as a software developer and Git user. So, start as soon as possible, and let time help you find the perfect way to write a commit message.
At the very least, don't imitate these: http://www.commitlogsfromlastnight.com.
Adopting a workflow – a wise act
Now that you learned how to perform good commits, it's time to fly higher and think about workflows. Git is a tool for versioning, but as with other powerful tools, like knives, you can cut tasty sashimi or relieve yourself of some fingers.
The things that separate a great repository from a junkyard are the way you manage releases, the way you react when there is a bug to fix in a particular version of your software, and the way you act when you have to make users beta-test the incoming features.
These kinds of actions belong to ordinary administration for a modern software project. However, very often, I still see teams get out of breath because of the poor versioning workflows.
In this second part of the chapter, we will take a quick look at some of the common workflows alongside the Git versioning system.
Centralized workflows
As we used to do in other VCSes, such as Subversion and so on, even in Git, it is common to adopt a centralized way of work. If you work in a team, it is often necessary to share repositories with others, so a common point of contact becomes indispensable.
We can assume that if you are not alone in your office, you will adopt one of the variations of this workflow. As we know, we can add all the computers of our co-workers as remotes, in a sort of peer-to-peer configuration. However, you usually don't do this, because it becomes too difficult to keep every branch in every remote in sync.

How they work
In this scenario, you usually follow these simple steps:
- Someone initializes the remote repository (in a local Git server, on GitHub, or on Bitbucket).
- Other team members clone the original repository on their computer and start working.
- When the work is done, you push it to the remote to make it available to other colleagues.
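The three steps above can be tried out on a single machine. In this sketch, a local bare repository stands in for the Git server, and two clones named alice and bob play the team members; all of these names are assumptions made for the example.

```shell
set -e
work=$(mktemp -d) && cd "$work"

# Step 1: someone initializes the shared repository.
git init -q --bare central.git
git --git-dir=central.git symbolic-ref HEAD refs/heads/master

# Step 2: Alice clones it (still empty), commits, and pushes.
git clone -q central.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
echo "hello" > README.txt
git add README.txt
git commit -q -m "Add the README"
git push -q origin master
cd ..

# Bob clones, receives Alice's commit, and pushes his own work.
git clone -q central.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"
echo "usage notes" >> README.txt
git commit -q -am "Add usage notes to the README"
git push -q origin master
cd ..

# Step 3 seen from the other side: Alice pulls Bob's change.
cd alice
git pull -q --ff-only origin master
git log --oneline
```

Nothing here is specific to a local path: replacing central.git with the URL of a hosted repository gives the same flow.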
At this point, it is only a matter of internal rules and patterns. It is improbable that you and your colleagues will work together simultaneously in the master branch, unless you are indomitable masochists.
Feature branch workflow
At this point, you will probably choose a feature branch approach, where every developer works on their own branch. When the work is done, the feature branch is ready to be merged with the master branch. You will probably have to merge back from the master branch first, because one of your colleagues has merged a feature branch after you started yours, but after that you are basically finished.

GitFlow
The GitFlow workflow comes from the mind of Vincent Driessen, a passionate software developer from the Netherlands. You can find his original blog post at http://nvie.com/posts/a-successful-git-branching-model.
His workflow has gained success over the years, to the point that many other developers (including me), teams, and companies have started to use it. Atlassian, a well-known company that offers Git-related services such as Stash and Bitbucket, integrates GitFlow directly into its GUI tool, SourceTree.
The GitFlow workflow is also a centralized one, and it is well described by this figure:

This workflow is based on the use of some main branches. What makes these branches special is nothing other than the significance we attribute to them. These are not special branches with special characteristics in Git, but we can certainly use them for different purposes.
The master branch
In GitFlow, the master branch represents the final stage. Merging your work into it is equivalent to making a new release of your software. You usually don't start new branches from the master branch. You do it only if there is a severe bug you have to fix instantly, even if that bug has been found and fixed in another evolving branch. This way of operating makes you superfast when you have to react to a painful situation. Other than this, the master branch is where you tag your releases.
Hotfix branches
Hotfix branches are derived only from the master branch, as we said earlier. Once you have fixed a bug, you merge the hotfix branch onto master so that you get a new release to ship. If the bug has not been resolved anywhere else in your repository, the strategy is to merge the hotfix branch into the develop branch as well. After that, you can delete the hotfix branch, as it has served its purpose.
In Git, there is a trick to group similar branches: you name them using a common prefix followed by a slash (/). For the hotfix branches, I recommend the hotfix/<branchName> form (for example, hotfix/LoginBug, or hotfix/#123 for those who are using bug-tracking systems, where #123 is the bug ID).
These branches are usually not pushed to remote. You push them only if you need the help of other team members.
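The life of a hotfix branch can be sketched as follows, in a throwaway repository with invented files and version numbers. The --no-ff option is my own addition here: it forces a merge commit so that the hotfix stays visible in the history, as Vincent's post also does, but the flow works without it.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
work=$(mktemp -d) && cd "$work"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "app 1.0" > app.txt
git add app.txt
git commit -q -m "Release 1.0"

# develop has already moved on with new work.
git checkout -q -b develop
echo "next feature" > feature.txt
git add feature.txt
git commit -q -m "Start the next feature"

# The hotfix branch starts from master, not from develop.
git checkout -q -b hotfix/LoginBug master
echo "login fixed" > login.txt
git add login.txt
git commit -q -m "Fix the login failure"

# Merge onto master and tag the new release.
git checkout -q master
git merge -q --no-ff -m "Merge hotfix/LoginBug onto master" hotfix/LoginBug
git tag -a 1.0.1 -m "Release 1.0.1"

# Merge into develop as well, so that the fix is not lost.
git checkout -q develop
git merge -q --no-ff -m "Merge hotfix/LoginBug into develop" hotfix/LoginBug

# The common prefix lets you list every hotfix branch at once.
git branch --list 'hotfix/*'

git branch -d hotfix/LoginBug
```

If you name a branch after a bug ID such as hotfix/#123, remember to quote it in the shell, because an unquoted # at the start of a word begins a comment.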
The develop branch
The develop branch is a sort of beta software branch. When you start to implement a new feature, you have to create a new branch starting from the develop branch. You will continue to work in that branch until you complete your task.
After the task is completed, you can merge back to the develop branch and delete your feature branch. Just like hotfix branches, these are only temporary branches.
Like the master branch, the develop branch is a never-ending branch. You will never close or delete it.
This branch is pushed and shared to a remote Git repository.
The release branch
At some point, you need to wrap up the next release, including some of the features you implemented in the last few weeks. To prepare an incoming release, you have to branch from develop, giving the branch a name composed of the release prefix followed by the version number in the format of your choice (for example, release/1.0).
Pay attention. At this stage, no more new features are allowed! You cannot merge develop onto the release branch. You can only create new branches from that branch for bug fixing. The purpose of this intermediate branch is to give the software to beta testers, allowing them to try it and send you feedback and bug tickets.
If you have fixed some bugs on the release branch, the only thing to remember is to merge them into the develop branch as well, just to avoid losing the bug fixes. The release branch itself will not be merged back to develop.
You can keep this branch alive until you decide that the software is both mature and tested sufficiently to go into production. At this point, you merge the release branch onto the master branch, making, in fact, a new release.
After the merge to master, you can make a choice. You could keep the release branch open if you need to keep different releases alive; otherwise, you can delete it. Personally, I always delete the release branch (as Vincent suggests), because I generally do frequent, small, and incremental releases (so, I rarely need to fix an already shipped release). As you certainly remember, you can open a brand new branch from a commit (a tagged one in this case) whenever you want. So, at most, I will open it from that point only when necessary.
This branch is pushed and shared to a common remote repository.
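A release cycle along these lines could look like the following sketch. The repository, the bugfix/banner-typo branch name, and the version number are assumptions for the example, and --no-ff is again my own choice to keep each merge visible.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
work=$(mktemp -d) && cd "$work"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "app" > app.txt
git add app.txt
git commit -q -m "Add the application skeleton"

git checkout -q -b develop
echo "signup" > signup.txt
git add signup.txt
git commit -q -m "Add the newsletter signup in homepage"

# Freeze the features: the release branch starts from develop.
git checkout -q -b release/1.0 develop

# A bug reported by beta testers is fixed on a branch created
# from the release branch.
git checkout -q -b bugfix/banner-typo release/1.0
echo "banner fixed" > banner.txt
git add banner.txt
git commit -q -m "Fix the typo in the welcome banner"

# The fix goes into the release branch and into develop.
git checkout -q release/1.0
git merge -q --no-ff -m "Merge bugfix/banner-typo into release/1.0" bugfix/banner-typo
git checkout -q develop
git merge -q --no-ff -m "Merge bugfix/banner-typo into develop" bugfix/banner-typo
git branch -d bugfix/banner-typo

# Once the release is mature, merge it onto master and tag it.
git checkout -q master
git merge -q --no-ff -m "Release 1.0" release/1.0
git tag -a 1.0 -m "Release 1.0"
git branch -d release/1.0
```

Should an already shipped release need a fix later, a new branch can be opened from the 1.0 tag.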
The feature branches
When you have to start the implementation of a new feature, you have to create a new branch from the develop branch. Feature branches start with the feature/ prefix (for example, feature/NewAuthentication, or feature/#987 if you use some feature-tracking software, where #987 is the feature ID).
You will work on the feature branch until you finish your work. I suggest that you frequently merge back from develop. In the case of concurrent modifications to the same files, you will resolve conflicts faster if you resolve them earlier. It is easier to resolve one or two conflicts at a time than dozens at the end of the feature work.
Once your work is done, you merge the feature onto develop and you are done. You can now delete the feature branch.
Feature branches are mainly private branches. However, you could push them to the remote repository if you have to collaborate on them with some other teammates.
Conclusion
I recommend that you take a look at this workflow, as I can assure you that there was never a situation that I failed to solve using this workflow.
You can find a deeper explanation with ready-to-use Git commands on Vincent Driessen's blog. You can even use the GitFlow commands Vincent made to customize his Git experience. Check them out on his GitHub account at https://github.com/nvie/gitflow.
The GitHub flow
The previously described GitFlow has tons of followers, but it is always a matter of taste. Some have found it too complex and rigid for their situation. In fact, there are other ways to manage software repositories that have gained consensus during the last few years.
One of these is the workflow used at GitHub for internal projects and repositories. This workflow takes the name of GitHub flow. It was first described by the well-known Scott Chacon, former GitHubber and ProGit book author, on his blog at http://scottchacon.com/2011/08/31/github-flow.html.
This workflow, compared to GitFlow, is better tailored for frequent releases. When I say frequent, I mean very frequent, even twice a day. Obviously, this kind of flow works better on web projects, because to deploy, you only have to put the new release on the production server. If you develop desktop solutions, you need a perfectly oiled update mechanism to do the same.
GitHub software basically doesn't have releases, because they deploy to production regularly, even more than once a day. This is possible due to a robust Continuous Delivery structure, which is not so easy to obtain. It deserves some effort.
The GitHub flow is based on these simple rules.
Anything in the master branch is deployable
Just like GitFlow, even in GitHub flow, deployment is done from the master branch. This is the only main branch in this flow. In GitHub flow, there are no hotfix, develop, or other special branches. Bug fixes, new implementations, and so on are constantly merged onto the master branch.
Other than this, code in the master branch is always in a deployable state. When you fix or add something new in a branch and then merge it onto the master branch, you don't deploy automatically, but you can assume your changes will be up and running in a matter of hours.
Branching and merging constantly to the master branch, which is the production-ready branch, can be dangerous. You can easily introduce regressions or bugs, as no one other than you can guarantee that you have done a good job. This problem is avoided by a social contract commonly adopted by GitHub developers. In this contract, you promise to test your code before merging it to the master branch, making sure that all automated tests have been successfully completed.
Creating descriptive branches off of the master
In GitHub flow, you always branch from the master branch. So, it's easy to end up with a forest of branches to look through when you have to pull one. To better identify them, in GitHub flow, you have to use descriptive names to get meaningful topic branches. Even here, it is a matter of good manners. If you start to create branches named stuff-to-do, you will probably fail in adopting this flow. Some good examples are new-user-creation, most-starred-repositories, and so on (note the use of dashes). By using a common way to define topics, you will easily find the branches you are interested in by looking for the topics' keywords.
Pushing to named branches constantly
Another great difference between GitHub flow and GitFlow is that in GitHub flow, you push feature branches to the remote regularly, even if you are the only developer involved and interested in them. This is done even for backup purposes. Although I have already given my opinion on this matter, I can't say this is a bad thing.
A thing I appreciate about GitHub flow is that this habit of pushing every branch to the remote gives you the ability to see, with a simple git fetch command, all the branches currently active. Due to this, you can see all the work in progress, even that of your teammates.
Opening a pull request at any time
In Chapter 3, Git Fundamentals – Working Remotely, we talked about GitHub and made a quick try with pull requests. We have seen that basically they are for contributing. You fork someone else's repository, create a new branch, make some modifications, and then send a pull request to the original author.
In GitHub flow, you use pull requests massively. You can even ask another developer on your team to have a look at your work and help you, give you a hint, or review the work done. At this point, you can start a discussion, using the GitHub pull request as a chat and involving other people by putting their usernames in CC. In addition, the pull request feature lets you comment on even a single line of code in a diff view, letting the users involved discuss the work under review effectively.
Merging only after a pull request review
You can now understand that the pull request stage we saw earlier becomes a sort of review stage. Here, other users can take a look at the code and even simply leave a positive comment, just a +1 to let other users know that they are confident about the job and that they approve its merge into master.
After this step, when the CI server says that the branch still passes all the automated tests, you are ready to merge the branch into master.
Deploying immediately after review
At this stage, you merge your branch into master, and the work is done. The deployment is not fired instantly, but at GitHub, they have a very straightforward and robust deploy procedure. They deploy big branches with 50 commits, but also branches with a single commit and a single line of code changed, because deployment is very quick and cheap for them.
This is the reason why they can afford such a simple branching strategy, where you merge onto the master branch and then deploy, without the need to pass through a develop or release branch stage, as in GitFlow.
Conclusions
I consider this flow very responsive and effective for web-based projects, where basically you deploy to production without focusing too much on versions of your software. Using only the master branch to derive and integrate branches is faster than light. However, this strategy can be applied only if you have these prerequisites:
- A centralized remote ready to manage pull requests (as GitHub does)
- A good shared agreement about branch names and pull requests usage
- A very robust deploy system
This is a big picture of this flow. For more details, I recommend that you visit the GitHub related page at https://guides.github.com/introduction/flow/index.html.

Other workflows
Obviously, there are many other workflows. I will spend just a few words on the one that (fortunately) convinced Linus Torvalds to create the Git VCS.
The Linux kernel workflow
The Linux kernel uses a workflow that refers to the traditional way in which Linus Torvalds has driven its evolution during these years. It is based on a military-like hierarchy.
Ordinary kernel developers work on their personal branches, rebasing onto the master branch of the reference repository. Then they push their branches to the lieutenant developer's master branch. Lieutenants are developers whom Linus has assigned to particular topics and areas of the kernel because of their experience. When lieutenants have done their work, they push it to the benevolent dictator's master branch (Linus's branch). Then, if things are OK (it is not simple to cheat him), Linus pushes his master branch to the blessed repository, the one that developers rebase from before starting their work.
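The rebasing step at the bottom of this hierarchy can be sketched in a heavily simplified form. Here a single throwaway repository is used, its local master stands in for the blessed repository, and the branch and file names are invented; the real kernel process involves several repositories and patch review.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
work=$(mktemp -d) && cd "$work"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "kernel" > core.txt
git add core.txt
git commit -q -m "Add the core file"

# A developer works on a personal branch.
git checkout -q -b scheduler-fix
echo "tick fix" > sched.txt
git add sched.txt
git commit -q -m "Fix the scheduler tick rounding"

# Meanwhile, the reference master moves ahead.
git checkout -q master
echo "build" > build.txt
git add build.txt
git commit -q -m "Update the build scripts"

# Replay the personal work on top of the updated master.
git checkout -q scheduler-fix
git rebase -q master
git log --oneline --graph
```

After the rebase, the personal branch sits directly on top of master, so whoever integrates it next gets a linear history.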

Summary
In this chapter, we became aware of the effective ways to use Git. I personally consider this chapter the most important for a new Git user, because it applies some rules and discipline so that you will obtain the most from this tool. So, pick up a good workflow (make your own, if necessary), and pay attention to your commits. This is the only way to become a good versioning system user, not only in Git.
In the next chapter, we will see some tips and tricks to use Git even if you have to deal with Subversion servers. Then, we will take a quick look at migrating from Subversion to Git.