
  • Git:Mastering Version Control
  • Ferdinando Santacroce, Aske Olsson, Rasmus Voss, Jakub Narębski
  • 2021-07-08 10:46:58

Chapter 5. Obtaining the Most – Good Commits and Workflows

Now that we are familiar with Git and versioning systems, it's time to look at the whole thing from a much higher perspective to become aware of common patterns and procedures.

In this chapter, we will walk through some of the most common ways to organize and build meaningful commits and repositories. We will obtain not only a well-organized code stack, but also a meaningful source of information.

The art of committing

While working with Git, committing seems the easiest part of the job: you add files, write a short comment, and you're done. However, precisely because it is so simple, you often acquire bad habits, especially at the very beginning of your experience: commits made too late, too big, too short, or simply equipped with bad messages.

Now, we will take some time to identify possible issues, drawing attention to tips and hints to get rid of these bad habits.

Building the right commit

One of the harder skills to acquire while programming in general is to split the work into small and meaningful tasks.

Too often, I have experienced this scenario. You start to fix a small issue in a file. Then, you see another piece of code that can be easily improved, even if it is not related to what you are working on now. You can't resist it, and you fix it. In the end, after a short time, you find yourself with tons of modified files and unrelated changes to commit.

At this point, things get worse, because usually, programmers are lazy people. So, they don't write all the important things to describe changes in the commit message. In commit messages, you start to write sentences such as "Some fixes to this and that", "Removed old stuff", "Tweaks", and so on, without anything that helps other programmers understand what you have done.

Courtesy of http://xkcd.com/1296/

In the end, you realize that your repository is only a dump where you empty your index now and then. I have seen some people committing only at the end of the day (and not every day) to keep a backup of the data or because someone else needed the changes reflected on their computer.

Another side effect is that the resulting repository history becomes useless for anything other than retrieving content at a given point in time.

The following tips can help you turn your Version Control System (VCS) from a backup system into a valuable tool for communication and documentation.

Make only one change per commit

After the routine morning coffee, you open your editor and start to work on a bug, BUG42. While working around fixing the bug in the code, you realize that fixing BUG79 will require tweaking just a single line of code. So, you fix it. However, you not only change that awful class name, but also add a good-looking label to the form and make a few more changes. The damage is done.

How can you wrap up all that work in a meaningful commit now? Maybe, in the meantime, you went home for lunch, talked to your boss about another project, and you can't even remember all the little things you did.

In this scenario, there is only one way to limit the damage: split the modified files among more than one commit. Sometimes, this helps to reduce the pain, but it is only a palliative. Very often, you modify the same file for different reasons, so doing this is quite difficult, if not impossible. The last hope is the git add -p command, which lets you stage only some of the modifications in a file, so that you can group them into different commits, one per topic.

The only way to definitely solve this problem is to only make one change per commit. It seems easy, I know, but it is quite difficult to acquire this ability. There are no tools for this. No one, but you, can help. It only needs discipline, the most lacking virtue in creative people such as programmers.

There are some tips to pursue this aim; let's have a look at them together.

Split up features and tasks

As said earlier, breaking up the things to do is a fine art. If you know and adopt some Agile movement techniques, you will have probably faced these problems. So, you have an advantage; otherwise, you will need some more effort, but it is not something that you can't achieve.

Consider that you have been assigned to add the Remember me check in the login page of a web application, like the one shown here:

This feature is quite small, but implies changes at different levels. To accomplish this, you'll have to:

  • Modify the UI to add the check control
  • Pass the "is checked" information through different layers
  • Store this information somewhere
  • Retrieve this information when needed
  • Invalidate (set it to false) following some kind of policy (after 15 days, after 10 logins, and so on)

Do you think you can do all these things in one shot? Yes? You are wrong! Even if you estimate a couple of hours for an ordinary task, remember that Murphy's law is in ambush. You will receive four calls, your boss will look for you for three different meetings, and your computer will go up in flames.

This is one of the first things to learn: break up every piece of work into small tasks. It does not matter whether you use timeboxing techniques such as the Pomodoro Technique; small things are always easier to handle. I'm not talking about splitting hairs, but try to organize your tasks into things you can do in a defined amount of time, hopefully a handful of half hours, not days.

So, take a pen and paper and write down all the tasks, as we did earlier with the login page example. Do you think you can do all those things in a small amount of time? Maybe yes, maybe not: some tasks are bigger than others. That's OK; this is not a scientific method. It's a matter of experience. Can you split a task and create two other meaningful tasks? Do it.

Are you unable to do it? No problem; don't try to split tasks if they lose meaning.

Write commit messages before starting to code

Now, you have a list of tasks to do; pick the first and… start to code? No! Take another piece of paper and describe every task's step with a sentence. Magically, you will realize that every sentence can be the message of a single commit, where you describe the features you deleted, added, or changed in your software.

This kind of prior preparation helps you define the modifications to implement (letting a better software design emerge). It also keeps you focused on what is important and lowers the stress of thinking about the versioning part of the work during the coding session. While you are facing a programming problem, your brain floods with little implementation details related to the code you are working on. So, the fewer the distractions, the better.

This is one of the best versioning-related hints I ever received. If you have just a quarter of an hour to spare, I recommend that you read the Preemptive commit comments blog post (https://arialdomartini.wordpress.com/2012/09/03/pre-emptive-commit-comments/) by Arialdo Martini. This is where I learnt this trick.

Include the whole change in one commit

Making more than one change per commit is a bad thing. However, splitting a single change into more than one commit is also considered harmful. As you may know, in some trained teams, you do not simply push your code to production. Before that, you have to pass some code quality reviews, where someone else tries to understand what you did to decide if your code is good or not (that is why pull requests exist, after all). You could be the best developer in the world. However, if the person at the other end can't make sense of your commits, your work will probably be refused.

To avoid these unpleasant situations, you have to follow a simple rule: don't do partial commits. If time's up, if you have to go to that damn meeting (programmers hate meetings) or whatever, remember that you can save your work at any moment without committing, using the git stash command. If you want to close the commit, because you want to push it to the remote branch for backup purposes, remember that Git is not a backup tool. Back up your stash on another disk, put it in the cloud, or simply end your work before leaving, but don't do commits like they are episodes of a TV series.

One more time, Git is a software tool like any other and it can fail. Don't think that just because you are using Git or other versioning systems, you don't need backup strategies. Back up local and remote repositories just like you back up all the other important things.
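Parking unfinished work with git stash, as suggested above, might look like the following sketch. The repository, file name, and stash message are made up for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "stable code" > app.txt
git add app.txt
git commit -q -m "Add application skeleton"

# Half-finished work that does not deserve a commit yet.
echo "work in progress" >> app.txt

# Park it: the working tree goes back to the last commit.
git stash push -q -m "WIP: password retrieval"
git status --short          # prints nothing: the tree is clean
git stash list

# Back from the meeting: restore the work and carry on.
git stash pop -q
git status --short          # app.txt shows up as modified again
```

The stash lives only in your local repository, so it is a way to pause, not a backup.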

Describe the change, not what you have done

Too often, I read (and often I wrote) commit messages such as "Removed this", "Changed that", "Added that one", and so on.

Imagine that you are going to work on the common "lost password" feature on your website. Probably, you will find a message like this adequate: "Added the lost password retrieval link to the login page". This kind of commit message does not describe what modifications the feature brings to you, but what you did (and not everything). Try to answer sincerely. If you are reading a repository history, do you want to read what every developer did? Or is it better to read the feature implemented in every single commit?

Try to make the effort, and start writing sentences where the change itself is the subject, not what you did to implement it. Use the imperative present tense (for example, fix, add, implement), describing the change in a small subject sentence, and then, add some details (when needed) in other lines of text. "Implement the password-retrieval mechanism" is a good commit message subject. If you find it useful, then you can add some other information to get a well formed message like this:

"Implement the password retrieval mechanism

- Add the "Lost password?" link into the login page
- Send an email to the user with a link to renew the password"

Have you ever written a changelog for a piece of software by hand? I did; it's one of the most boring things to do. If you don't like writing changelogs, like me, think of the repository history as your changelog. If you take care of your commit messages, you will get a beautiful changelog for free!
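To make the "changelog for free" idea concrete, here is a minimal sketch that turns commit subjects into a bulleted list. The repository content and the CHANGELOG.txt file name are assumptions for the example; only the subjects come from this chapter.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "login" > login.txt
git add login.txt
git commit -q -m "Add the login page"

echo "lost password link" >> login.txt
git commit -q -a -m "Implement the password retrieval mechanism" \
    -m "- Add the \"Lost password?\" link into the login page
- Send an email to the user with a link to renew the password"

# One bullet per commit subject, newest first.
git log --format='- %s' > CHANGELOG.txt
cat CHANGELOG.txt
```

The detail lines stay in the commit body, so the generated list remains short while the full story is still one git show away.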

In the next section, I will group some other useful hints about good commit messages.

Don't be afraid to commit

Fear is one of the most powerful emotions. It can drive a person to do the craziest things on Earth. One of the most common reactions to fear is breakdown. You don't know what to do, so you end up doing nothing.

This is a common reaction even when you begin to use a new tool such as Git, where gaining confidence can be difficult. For fear of making a mistake, you don't commit until you are forced to. This is the real mistake: being scared. In Git, you don't have to be scared. Maybe the solution is not obvious; maybe you have to dig on the Internet to find the right way. However, you can almost always get away with small or no consequences (well, unless you are a heavy user of the --hard option).

On the contrary, you have to make the effort to commit often, as soon as possible. The more frequently you commit, the smaller your commits are; the smaller your commits are, the easier it is to read and understand the changelog. It is also easier to cherry-pick commits and do code reviews. To help myself get used to committing this way, I followed this simple trick: write the commit message in Visual Studio before starting to write any code.

Try to do the same in your IDE or directly in the Bash shell; it helps a lot.

Isolate meaningless commits

The golden rule is to avoid meaningless commits. However, sometimes, you need to commit something that is not a real implementation, but only a cleanup, such as deleting old comments, formatting rearrangement, and so on.

In these cases, it is better to isolate this kind of code change in separate commits. By doing this, you prevent another team member from running towards you with a knife in his hand, frothing at the mouth. Don't mix up meaningless changes with real ones in the same commit. Otherwise, other developers (and you, after a couple of weeks) will not understand them while diffing.

The perfect commit message

Let me be honest; the perfect message does not exist. If you work alone, you will probably find the best way for you. However, when in a team, there are different minds and different sensibilities, so what is good for me may not be as good for another.

In this case, you have to sit around a table and discuss. You should try to end up with a shared standard that probably would not be the one you prefer, but at least is a way to start a common path.

Rules for a good commit message really depend on the way you and your team work every day, but some common hints can be applied by everyone. They are described in the following sections.

Writing a meaningful subject

The subject of a commit is the most important part; its role is to make clear what the commit contains. Avoid technical details or other things that any developer can understand on opening the code. Focus on the big picture. Remember that every commit is a sentence in the repository history. So, wear the hat of the changelog reader and try to write the most convenient sentence for them, not for you. Use the imperative present tense, and write a sentence with a maximum of 50 characters.

A good subject is one like this, "Add the newsletter signup in homepage".

As you can see, I used the imperative present tense. More importantly, I didn't say what I have done, but what the feature does: it adds a newsletter signup box to my website.

The 50 char rule is due to the way you use Git from the shell or GUI tools. If you start to write long sentences, reviewing logs and so on can become a nightmare. So, don't try to be the Stephen King of commit messages. Avoid adjectives and go straight to the point. You can then write additional detail lines.

Another thing to remember is to start with capital letters. Do not end sentences with periods; they are useless and even dangerous.

Adding bulleted detail lines, when needed

Often, you can't say all that you want in 50 chars. In this case, use detail lines. The common rule is to leave a blank line after the subject, start each line with a dash, and keep each line no longer than 72 chars:

"Add the newsletter signup in homepage

- Add textbox and button on homepage
- Implement email address validation
- Save email in database"

In these lines, add a few details, but not too many. Try to describe the original problem (if you fixed one) or the original need, why these functionalities have been implemented (what problem they solve), and any known limitations or issues.
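The 50 and 72 character limits are easy to check mechanically. The following is a small sketch of such a check; check_message is a hypothetical helper written for this example, not a Git command, and the limits are simply the ones suggested in this chapter.

```shell
# check_message is a hypothetical helper, not part of Git.
# It reads a commit message on stdin and reports rule violations.
check_message() {
    awk '
        NR == 1 && length($0) > 50 { print "subject longer than 50 chars"; bad = 1 }
        NR == 2 && length($0) > 0  { print "missing blank line after subject"; bad = 1 }
        NR > 2  && length($0) > 72 { print "line " NR " longer than 72 chars"; bad = 1 }
        END { exit bad + 0 }
    '
}

good='Add the newsletter signup in homepage

- Add textbox and button on homepage
- Implement email address validation
- Save email in database'

if printf '%s\n' "$good" | check_message; then
    echo "message OK"
fi
```

A team that agrees on these limits could call a check like this from a commit-msg hook, but whether to enforce the rule or just recommend it is a matter for the table discussion mentioned earlier.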

Tie other useful information

If you use an issue or project-tracking system, write down the issue number, bug ID, or anything else that helps:

"Add the newsletter signup in homepage

- Add textbox and button on homepage
- Implement email address validation
- Save email in database 

#FEAT-123: closed"
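Once IDs are in the messages, the history can be searched by them. The sketch below commits the message shown above and then looks it up with git log --grep; the repository content and the second commit are invented for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "signup" > home.txt
git add home.txt
git commit -q -m "Add the newsletter signup in homepage" \
    -m "- Add textbox and button on homepage
- Implement email address validation
- Save email in database" \
    -m "#FEAT-123: closed"

echo "footer" >> home.txt
git commit -q -a -m "Add the footer to homepage"

# Find every commit that mentions the ticket.
git log --oneline --grep='FEAT-123'
```

One caveat: when a message is typed in the editor rather than passed with -m, Git's default cleanup treats lines starting with # as comments and drops them, so a team using this notation may prefer another prefix or a different core.commentChar setting.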

Special messages for releases

Another useful thing is to write special format commit messages for releases so that it will be easier to find them. I usually decorate subjects with some special characters, but nothing more. To highlight a particular commit, such as a release one, there is the git tag command, remember?
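As a reminder of how tagging a release works, here is a minimal sketch. The version number and tag name are illustrative; use whatever scheme your team has agreed on.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Add the newsletter signup in homepage"

# An annotated tag marks this commit as a release.
git tag -a v1.0.0 -m "Release 1.0.0"

echo "v2" >> app.txt
git commit -q -a -m "Add the footer to homepage"

# Releases are now easy to list and to spot in the log.
git tag --list 'v*'
git log --oneline --decorate
```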

Conclusions

At the end, my suggestion is to try to compose your personal commit message standard by following previous hints, looking at message strategies adopted by great projects and teams around the Web, but especially by doing it. Your standard will change for sure as you evolve as a software developer and Git user. So, start as soon as possible, and let time help you find the perfect way to write a commit message.

At the very least, don't imitate these: http://www.commitlogsfromlastnight.com.

Adopting a workflow – a wise act

Now that you have learned how to perform good commits, it's time to fly higher and think about workflows. Git is a tool for versioning, but as with other powerful tools, like knives, you can cut tasty sashimi or relieve yourself of some fingers.

The things that separate a great repository from a junkyard are the way you manage releases, the way you react when there is a bug to fix in a particular version of your software, and the way you act when you have to make users beta-test the incoming features.

These kinds of actions belong to ordinary administration for a modern software project. However, very often, I still see teams run out of breath because of poor versioning workflows.

In this second part of the chapter, we will take a quick look at some of the common workflows alongside the Git versioning system.

Centralized workflows

As we used to do in other VCSes, such as Subversion and so on, even in Git, it is common to adopt a centralized way of work. If you work in a team, it is often necessary to share repositories with others, so a common point of contact becomes indispensable.

We can assume that if you are not alone in your office, you will adopt one of the variations of this workflow. As we know, we can add the computers of all our co-workers as remotes, in a sort of peer-to-peer configuration. However, you usually don't do this, because it becomes too difficult to keep every branch in every remote in sync.

How they work

In this scenario, you usually follow these simple steps:

  1. Someone initializes the remote repository (in a local Git server, on GitHub, or on Bitbucket).
  2. Other team members clone the original repository on their computer and start working.
  3. When the work is done, you push it to the remote to make it available to other colleagues.
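The three steps above can be sketched on a single machine, with a local bare repository standing in for the Git server, GitHub, or Bitbucket. The directory names (central.git, alice, bob) and the file are assumptions made for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# 1. Someone initializes the shared repository.
git init -q --bare central.git
git --git-dir=central.git symbolic-ref HEAD refs/heads/master

# 2. A team member clones it and starts working.
git clone -q central.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
echo "hello" > README.txt
git add README.txt
git commit -q -m "Add the README"

# 3. When the work is done, she pushes it to the remote.
git push -q origin master

# A colleague who clones afterwards receives Alice's commit.
cd "$work"
git clone -q central.git bob
git -C bob log --oneline
```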

At this point, it is only a matter of internal rules and patterns. It is improbable that you and your colleague will work together simultaneously in the master branch, unless you are indomitable masochists.

Feature branch workflow

At this point, you will probably choose a feature branch approach, where every single developer works on their own branch. When the work is done, the feature branch is ready to be merged into the master branch. You will probably have to merge back from the master branch first, because one of your colleagues has merged a feature branch after you started yours, but after that you are basically done.
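A minimal sketch of that sequence follows. The branch name remember-me and the files are hypothetical, and the colleague's work is simulated by a commit made directly on master.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Add application skeleton"

# Start the feature on its own branch.
git checkout -q -b remember-me
echo "remember me" > remember.txt
git add remember.txt
git commit -q -m "Add the Remember me check to the login page"

# Meanwhile, a colleague's feature lands on master.
git checkout -q master
echo "newsletter" > newsletter.txt
git add newsletter.txt
git commit -q -m "Add the newsletter signup in homepage"

# Merge back from master first, so conflicts are solved on the feature branch.
git checkout -q remember-me
git merge -q --no-edit master

# Then merge the finished feature into master.
git checkout -q master
git merge -q --no-edit remember-me
git log --oneline --graph
```

Because the feature branch already contains master, the final merge is a plain fast-forward.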

GitFlow

The GitFlow workflow comes from the mind of Vincent Driessen, a passionate software developer from the Netherlands. You can find his original blog post at http://nvie.com/posts/a-successful-git-branching-model.

His workflow has gained success over the years, to the point that many other developers (including me), teams, and companies have started to use it. Atlassian, a well-known company that offers Git-related services such as Stash and Bitbucket, integrates GitFlow directly into its GUI tool, SourceTree.

The GitFlow workflow is also a centralized one, and it is well described by this figure:

This workflow is based on the use of some main branches. What makes these branches special is nothing other than the significance we attribute to them. These are not special branches with special characteristics in Git, but we can certainly use them for different purposes.

The master branch

In GitFlow, the master branch represents the final stage. Merging your work in it is equal to making a new release of your software. You usually don't start new branches from the master branch. You do it only if there is a severe bug you have to fix instantly, even if that bug has been found and fixed in another evolving branch. This way to operate makes you superfast when you have to react to a painful situation. Other than this, the master branch is where you tag your release.

Hotfix branches

Hotfix branches are branches derived only from the master branch, as we said earlier. Once you have fixed a bug, you merge the hotfix branch onto master so that you get a new release to ship. If the bug has not been resolved anywhere else in your repository, the strategy is to also merge the hotfix branch into the develop branch. After that, you can delete the hotfix branch, as it has hit the mark.

In Git, there is a trick to group similar branches: you name them using a common prefix followed by a slash /. For the hotfix branches, I recommend the hotfix/<branchName> pattern (for example, hotfix/LoginBug, or hotfix/#123 for those who are using bug-tracking systems, where #123 is the bug ID).

These branches are usually not pushed to remote. You push them only if you need the help of other team members.
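The life of a hotfix branch, as described above, might look like this sketch. The version numbers, tag names, and file content are illustrative assumptions.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "release 1.0" > app.txt
git add app.txt
git commit -q -m "Add application skeleton"
git tag -a 1.0 -m "Release 1.0"
git branch develop

# A severe bug shows up in production: branch from master.
git checkout -q -b hotfix/LoginBug master
echo "login fix" >> app.txt
git commit -q -a -m "Fix the login failure on empty password"

# Merging the hotfix onto master makes a new release.
git checkout -q master
git merge -q --no-edit hotfix/LoginBug
git tag -a 1.0.1 -m "Release 1.0.1"

# develop gets the fix too, so it is not lost.
git checkout -q develop
git merge -q --no-edit hotfix/LoginBug

# The common prefix groups similar branches; then the hotfix can go.
git branch --list 'hotfix/*'
git branch -q -d hotfix/LoginBug
```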

The develop branch

The develop branch is a sort of beta software branch. When you start to implement a new feature, you have to create a new branch starting from the develop branch. You will continue to work in that branch until you complete your task.

After the task is completed, you can merge back to the develop branch and delete your feature branch. Just like hotfix branches, these are only temporary branches.

Like the master branch, the develop branch is a never-ending branch. You will never close nor delete it.

This branch is pushed and shared to a remote Git repository.

The release branch

At some point, you need to wrap up the next release, including some of the features you implemented in the last few weeks. To prepare an incoming release, you have to branch from develop, giving the branch a name composed of the release/ prefix followed by the version number, in the numeric form of your choice (for example, release/1.0).

Pay attention. At this stage, no more new features are allowed! You cannot merge develop onto the release branch. You can only create new branches from that branch for bug fixing. The purpose of this intermediate branch is to give the software to beta testers, allowing them to try it and send you feedback and bug tickets.

If you have fixed some bugs on the release branch, the only thing to remember is to merge them into the develop branch as well, just to avoid losing the bug fix. The release branch will not be merged back to develop.

You can keep this branch alive for as long as you need, until you decide that the software is both mature and tested sufficiently to go in production. At this point, you merge the release branch onto the master branch, making, in fact, a new release.

After the merge to master, you can make a choice. You could keep the release branch open, if you need to keep alive different releases; otherwise, you can delete it. Personally, I always delete the release branch (as Vincent suggests), because I generally do frequent, small, and incremental releases (so, I rarely need to fix an already shipped release). As you certainly remember, you can open a brand new branch from a commit (a tagged one in this case) whenever you want. So, at most, I will open it from that point only when necessary.

This branch is pushed and shared to a common remote repository.
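The release cycle described in this section can be sketched as follows. Branch names such as bugfix/typo, the files, and the version number are assumptions for the example; the bug fix is merged into both the release branch and develop, as suggested above.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Add application skeleton"

git checkout -q -b develop
echo "feature A" > a.txt
git add a.txt
git commit -q -m "Add feature A"

# Freeze the next release: branch from develop.
git checkout -q -b release/1.0 develop

# develop keeps evolving, but none of this goes into the release.
git checkout -q develop
echo "feature B" > b.txt
git add b.txt
git commit -q -m "Add feature B"

# A bug reported by beta testers is fixed on a branch off the release.
git checkout -q -b bugfix/typo release/1.0
echo "feature A, fixed" > a.txt
git commit -q -a -m "Fix the typo in feature A"

# The fix goes both into the release and into develop.
git checkout -q release/1.0
git merge -q --no-edit bugfix/typo
git checkout -q develop
git merge -q --no-edit bugfix/typo
git branch -q -d bugfix/typo

# When the release is mature, merge it onto master and tag it.
git checkout -q master
git merge -q --no-edit release/1.0
git tag -a 1.0 -m "Release 1.0"
git branch -q -d release/1.0

# Should 1.0 ever need a fix, the branch can be reopened from the tag.
git checkout -q -b release/1.0 1.0
```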

The feature branches

When you have to start the implementation of a new feature, you have to create a new branch from the develop branch. Feature branches start with the feature/ prefix (for example, feature/NewAuthentication, or feature/#987 if you use some feature-tracking software, where #987 is the feature ID).

You will work on the feature branch until you finish your work. I suggest that you frequently merge back from develop. In the case of concurrent modifications to the same files, you will resolve conflicts faster if you resolve them earlier. It is easier to resolve one or two conflicts at a time than dozens at the end of the feature work.

Once your work is done, you merge the feature onto develop and you are done. You can now delete the feature branch.

Feature branches are mainly private branches. However, you could push them to the remote repository if you have to collaborate on them with other teammates.

Conclusion

I recommend that you take a look at this workflow, as I can assure you that there was never a situation that I failed to solve using this workflow.

You can find a deeper explanation, with ready-to-use Git commands, on Vincent Driessen's blog. You can even use the GitFlow commands that Vincent made to customize his Git experience. Check them out on his GitHub account at https://github.com/nvie/gitflow.

The GitHub flow

The previously described GitFlow has tons of followers, but it is always a matter of taste. Some people find it too complex and rigid for their situation. In fact, there are other ways to manage software repositories that have gained consensus during the last few years.

One of these is the workflow used at GitHub for internal projects and repositories. This workflow takes the name of GitHub flow. It was first described by the well-known Scott Chacon, former GitHubber and ProGit book author, on his blog at http://scottchacon.com/2011/08/31/github-flow.html.

This workflow, compared to GitFlow, is better tailored for frequent releases. When I say frequent, I mean very frequent, even twice a day. Obviously, this kind of flow works better on web projects, because to deploy, you only have to put the new release on the production server. If you develop desktop solutions, you need a perfectly oiled update mechanism to do the same.

GitHub software basically doesn't have releases, because they deploy to production regularly, even more than once a day. This is possible due to a robust Continuous Delivery structure, which is not so easy to obtain. It deserves some effort.

The GitHub flow is based on these simple rules.

Anything in the master branch is deployable

Just like in GitFlow, in GitHub flow deployment is done from the master branch. This is the only main branch in this flow. In GitHub flow, there are no hotfix, develop, or other particular branches. Bug fixes, new implementations, and so on are constantly merged onto the master branch.

Other than this, code in the master branch is always in a deployable state. When you fix or add something new in a branch and then merge it onto the master branch, you don't deploy automatically, but you can assume your changes will be up and running in a matter of hours.

Branching and merging constantly to the master branch, which is the production-ready branch, can be dangerous. You can easily introduce regressions or bugs, as no one other than you can assure you have done a good job. This problem is avoided by a social contract commonly adopted by GitHub developers. In this contract, you promise to test your code before merging it to the master branch, assuring that all automated tests have been successfully completed.

Creating descriptive branches off of the master

In GitHub flow, you always branch from the master branch. So, it's easy to get a forest of branches to look at when you have to pull one. To better identify them, in GitHub flow, you have to use descriptive names to get meaningful topic branches. Even here, it is a matter of good manners. If you start to create branches named stuff-to-do, you will probably fail in adopting this flow. Some good examples are new-user-creation, most-starred-repositories, and so on (note the use of dashes). Using a common way to define topics, you will easily find the branches you are interested in by looking for the topics' keywords.

Pushing to named branches constantly

Another great difference between GitHub flow and GitFlow is that in GitHub flow, you push feature branches to the remote regularly, even if you are the only developer involved and interested in them. This is done even for backup purposes. Although I have already given my opinion on this matter, I can't say this is a bad thing.

A thing I appreciate about GitHub flow is that this habit of pushing every branch to the remote gives you the ability to see, with a simple git fetch command, all the branches currently active. Due to this, you can see all the work in progress, even that of your teammates.
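The effect of that habit can be sketched locally, with a bare repository standing in for GitHub. The directory names (central.git, alice, bob) and the file are made up; the branch name is one of the examples mentioned earlier.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare central.git
git --git-dir=central.git symbolic-ref HEAD refs/heads/master

git clone -q central.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
echo "base" > app.txt
git add app.txt
git commit -q -m "Add application skeleton"
git push -q origin master

# Bob clones while only master exists.
cd "$work"
git clone -q central.git bob

# Alice pushes her topic branch regularly, even though it is unfinished.
cd "$work/alice"
git checkout -q -b new-user-creation
echo "user form" > users.txt
git add users.txt
git commit -q -m "Add the user creation form"
git push -q origin new-user-creation

# A simple fetch shows Bob the work in progress.
cd "$work/bob"
git fetch -q
git branch -r
```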

Opening a pull request at any time

In Chapter 3, Git Fundamentals – Working Remotely, we talked about GitHub and made a quick try with pull requests. We have seen that basically they are for contributing. You fork someone else's repository, create a new branch, make some modifications, and then send a pull request to the original author.

In GitHub flow, you use pull requests massively. You can even ask another developer of your team to have a look at your work and help you, give you a hint, or review the work done. At this point, you can start a discussion, using the GitHub pull request to chat and involve other people by mentioning their usernames. In addition, the pull request feature lets you comment on even a single line of code in a diff view, letting the users involved discuss the work under review effectively.

Merging only after a pull request review

You can now understand that the pull requested branch stage we saw earlier becomes a sort of review stage. Here, other users can take a look at the code and even simply leave a positive comment, just a +1 to make other users know that they are confident about the job, and they approve its merge into master.

After this step, when the CI server says that the branch still passes all the automated tests, you are ready to merge the branch in master.

Deploying immediately after review

At this stage, you merge your branch into master, and the work is done. The deployment is not instantly fired, but at GitHub, they have a very straightforward and robust deploy procedure. They deploy big branches with 50 commits, but also branches with a single commit and a single line of code change, because deployment is very quick and cheap for them.

This is the reason why they can afford such a simple branching strategy, where you merge onto the master branch and then deploy, without the need to pass through a develop or release branch stage, like in GitFlow.

Conclusions

I consider this flow very responsive and effective for web-based projects, where basically you deploy to production without focusing too much on versions of your software. Using only the master branch to derive and integrate branches is faster than light. However, this strategy could be applied only if you have these prerequisites:

  • A centralized remote ready to manage pull requests (as GitHub does)
  • A good shared agreement about branch names and pull requests usage
  • A very robust deploy system

This is a big picture of this flow. For more details, I recommend that you visit the GitHub related page at https://guides.github.com/introduction/flow/index.html.

Other workflows

Obviously, there are many other workflows. I will spend just a few words on the one that (fortunately) convinced Linus Torvalds to create the Git VCS.

The Linux kernel workflow

The Linux kernel uses a workflow that refers to the traditional way in which Linus Torvalds has driven its evolution during these years. It is based on a military-like hierarchy.

Simple kernel developers work on their personal branches, rebasing them on the master branch of the reference repository. Then they push their branches to the lieutenant developer's master branch. Lieutenants are developers whom Linus assigned to particular topics and areas of the kernel because of their experience. When lieutenants have done their work, they push it to the benevolent dictator's master branch (Linus's branch). Then, if things are OK (it is not simple to cheat him), Linus pushes his master branch to the blessed repository, the one that developers rebase from before starting their work.
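Only the first step of this hierarchy, a developer rebasing a personal branch on the blessed master, is sketched below. The lieutenant and dictator stages are left out, a local bare repository stands in for the blessed repository, and all names (blessed.git, my-driver, the files) are invented for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# The blessed repository, seeded by the dictator.
git init -q --bare blessed.git
git --git-dir=blessed.git symbolic-ref HEAD refs/heads/master
git clone -q blessed.git dictator 2>/dev/null
cd dictator
git symbolic-ref HEAD refs/heads/master
git config user.name "Dictator"
git config user.email "dictator@example.com"
echo "core" > core.txt
git add core.txt
git commit -q -m "Add the core"
git push -q origin master

# A developer starts a personal branch from the blessed master.
cd "$work"
git clone -q blessed.git developer
cd developer
git config user.name "Developer"
git config user.email "developer@example.com"
git checkout -q -b my-driver
echo "driver" > driver.txt
git add driver.txt
git commit -q -m "Add my driver"

# Meanwhile, the blessed master moves on.
cd "$work/dictator"
echo "scheduler" > sched.txt
git add sched.txt
git commit -q -m "Add the scheduler"
git push -q origin master

# The developer rebases the personal branch on the updated blessed master.
cd "$work/developer"
git fetch -q origin
git rebase -q origin/master
git log --oneline
```

After the rebase, the personal branch sits on top of the latest blessed history, which keeps the work easy to hand over to the next level of the hierarchy.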

Summary

In this chapter, we became aware of the effective ways to use Git. I personally consider this chapter the most important for a new Git user, because it applies some rules and discipline so that you will obtain the most from this tool. So, pick up a good workflow (make your own, if necessary), and pay attention to your commits. This is the only way to become a good versioning system user, not only in Git.

In the next chapter, we will see some tips and tricks to use Git even if you have to deal with Subversion servers. Then, we will take a quick look at migrating from Subversion to Git.
