- Mastering Git
- Jakub Narębski
- 2021-07-09 19:37:29
Creating a new commit
Before starting to develop with Git, you should introduce yourself with a name and an e-mail, as shown in Chapter 1, Git Basics in Practice. This information will be used to identify your work, either as an author or as a committer. The setup can be global for all your repositories (with git config --global, or by editing the ~/.gitconfig file directly), or local to a repository (with git config, or by editing .git/config). The per-repository configuration overrides the per-user one (you will learn more about it in Chapter 10, Customizing and Extending Git). You might want to use your company e-mail for work repositories, but your own non-work e-mail for public repositories you work on.
A relevant fragment of the appropriate config file could look similar to this:
[user]
    name = Joe R. Hacker
    email = joe@company.com
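The per-repository setup described above can be sketched as follows. This is a minimal example, assuming a throwaway repository created with mktemp; the name and e-mail are the sample values from the fragment above.

```shell
set -e
# Work in a throwaway repository so the real configuration is untouched.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Without --global, the values are written to .git/config of this repository.
git config user.name "Joe R. Hacker"
git config user.email "joe@company.com"

# Read the effective values back.
name=$(git config user.name)
email=$(git config user.email)
echo "$name <$email>"

# Show which file the effective value comes from.
git config --show-origin user.email
```

Running git config --global with the same keys would write to ~/.gitconfig instead; the per-repository value wins when both are set.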

Fig 1. The graph of revisions (the DAG) for a starting point of an example project, before creating a new commit. The current branch is master, and its tip is at revision c7cd3; this is also the currently checked-out revision, which can be referred to as HEAD.
The DAG view of creating a new commit
Chapter 2, Exploring Project History, introduced the concept of the Directed Acyclic Graph (DAG) of revisions. Contributing to the development of a project usually consists of creating new revisions of that project, and adding them as commit nodes to the graph of revisions.
Let's assume that we are on the master branch, as shown in Fig 1 of the preceding section, and that we want to create a new version (this operation will be described in more detail later). The git commit command will create a new commit object, that is, a new revision node. This commit will have as its parent the checked-out revision (c7cd3 in the example). That revision is found by following refs starting from HEAD; here, it is the chain from HEAD to master to c7cd3.
Then Git will move the master pointer to the new node, creating a situation as in Fig 2. In it, the new commit is marked with a thick red outline, and the old position of the master branch is shown semi-transparent. Note that the HEAD pointer doesn't change; all the time it points to master:

Fig 2. The graph of revisions (the DAG) for an example project just after creating a new commit, starting from the state given by Fig 1
The new commit, a3b79, is marked with the thick red outline. The tip of the master branch changes from pointing to commit c7cd3 to pointing to commit a3b79, as shown with the dotted line.
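The pointer movement described above can be checked from the command line. The following is a minimal sketch, assuming a throwaway repository; the file name and commit messages are hypothetical, and the branch name is read from the repository instead of being assumed to be master.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Joe R. Hacker"
git config user.email "joe@company.com"

# Starting point: one commit on the current branch.
echo one > file.txt
git add file.txt
git commit -q -m "First revision"

branch=$(git symbolic-ref --short HEAD)   # branch that HEAD points to
head_before=$(git symbolic-ref HEAD)      # for example refs/heads/master
old_tip=$(git rev-parse HEAD)             # tip of the branch before the commit

# Create a new commit on top of the checked-out revision.
echo two >> file.txt
git commit -q -a -m "Second revision"

head_after=$(git symbolic-ref HEAD)       # HEAD still points to the same branch
new_tip=$(git rev-parse "$branch")        # the branch tip has moved
parent=$(git rev-parse HEAD^)             # parent of the new commit

echo "HEAD -> $head_after"
echo "old tip: $old_tip"
echo "new tip: $new_tip (parent: $parent)"
```

The parent of the new commit is the old branch tip, and only the branch ref changes; HEAD keeps naming the branch.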
The index – a staging area for commits
Each of your files inside the working area of the Git repository can be either known to Git (be a tracked file) or unknown to it. The files unknown to Git can be either untracked or ignored (you can find more information about ignoring files in Chapter 4, Managing Your Worktree).
Files tracked by Git are usually in one of two states: committed (or unchanged) or modified. The committed state means that the file contents in the working directory are the same as in the last commit, which is safely stored in the repository. The file is modified if it has changed compared to the last committed version.
But, in Git, there is another state. Let's consider what happens when we use the git add command to add a file, but have not yet created a new commit adding it. A version control system needs to store such information somewhere. Git uses something called the index for this; it is the staging area that stores information that will go into the next commit. The git add <file> command stages the current contents (current version) of the file, adding it to the index.
Note
If you want to only mark a file for addition, you can use git add -N <file>; this stages empty contents for a file.
The index is a third section storing a copy of the project, after the working directory (which contains your own copy of the project files, used as a private isolated workspace to make changes), and the local repository (which stores your own copy of the project history, and is used to synchronize changes with other developers):

Fig 3. Working directory, staging area, and the local Git repository; creating a new commit
The arrows show how the Git commands copy contents, for example, git add takes the content of the file from the working directory and puts it into the staging area.
Creating a new commit requires the following steps:
- You make changes to files in your working directory, usually modifying them using your favorite editor.
- You stage the files, adding snapshots of them (their current contents) to your staging area, usually with the git add command.
- You create a new revision with the git commit command, which takes the files as they are in the staging area and stores that snapshot permanently to your local repository.
At the beginning (and just after the commit), the tracked files in the working directory, in the staging area, and in the last commit (the committed version) are identical.
Usually, however, one would use a special shortcut, the git commit -a command (which is git commit --all), which will take all the changed tracked files, add them to the staging area (as if with git add -u, at least in modern Git), and create a new commit (see Fig 3 of this section). Note that new files still need to be explicitly added with git add to be tracked, and to be included in the new commit.
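The difference between committing the staged snapshot and using the git commit -a shortcut can be sketched as follows. This is an illustrative example in a throwaway repository; notes.txt and extra.txt are hypothetical file names.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Joe R. Hacker"
git config user.email "joe@company.com"

echo v1 > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Step 1: change the file. Step 2: stage the snapshot. Then change it again.
echo v2 > notes.txt
git add notes.txt          # the index now holds v2
echo v3 > notes.txt        # the working directory holds v3

# Step 3: a plain commit records what is in the staging area (v2).
git commit -q -m "Update notes"
committed=$(git show HEAD:notes.txt)

# The -a shortcut stages all changed tracked files first (v3),
# but does not pick up the new, untracked file.
echo new > extra.txt
git commit -q -a -m "Commit all tracked changes"
after_all=$(git show HEAD:notes.txt)
untracked=$(git status --short -- extra.txt)

echo "plain commit stored: $committed"
echo "commit -a stored:    $after_all"
echo "still untracked:     $untracked"
```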
Examining the changes to be committed
Before committing the changes and creating a new revision (a new commit), you would want to see what you have done.
Git shows information about the pending changes to be committed in the commit message template, which is passed to the editor, unless you specify the commit message on the command line, for example, with git commit -m "Short description". This template is configurable (refer to Chapter 10, Customizing and Extending Git for more information).
Note
You can always abort creating a commit by exiting the editor without any changes or with an empty commit message (comment lines, that is, lines beginning with #, do not count).
In most cases, you would want to examine changes for correctness before creating a commit.
The status of the working directory
The main tool for examining which files are in which state (which files have changes, whether there are any new files, and so on) is the git status command.
The default output is explanatory and quite verbose. If there are no changes, for example, directly after a clone, you could see something like this:
$ git status
On branch master
nothing to commit, working directory clean
If the branch (you are on the master branch in this example) is a local branch intended to create changes that are to be published and to appear in the public repository, and is configured to track its upstream branch, origin/master, you would also see the information about the tracked branch:
Your branch is up-to-date with 'origin/master'.
In further examples, we will ignore it and not include this information.
Let's say you add two new files to your project, a COPYING file with the copyright and license, and a NEWS file, which is currently empty. In order to begin tracking the new COPYING file, you use git add COPYING. Accidentally, you remove the README file from the working directory with rm README. You modify Makefile and rename src/rand.c to src/random.c with git mv (without modifying it).
The default, long format, is designed to be human-readable, verbose, and descriptive:
$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        new file:   COPYING
        renamed:    src/rand.c -> src/random.c

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   Makefile
        deleted:    README

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        NEWS
As you can see, Git does not only describe which files have changed, but also explains how to change their status: either include them in the commit, or remove them from the set of pending changes (more information about the commands mentioned in the git status output can be found in Chapter 4, Managing Your Worktree). There are up to three sections present in the output:
- Changes to be committed: This is about the staged changes that would be committed with git commit (without the -a option). It lists files whose snapshot in the staging area is different from the version from the last commit (HEAD).
- Changes not staged for commit: This lists the files whose working area contents are different from their snapshot in the staging area. Those changes would not be committed with git commit, but would be committed with git commit -a as changes in the tracked files.
- Untracked files: This lists the files, unknown to Git, which are not ignored (refer to Chapter 4, Managing Your Worktree for how to use gitignore files to make Git ignore files). These files would be added with the bulk add command, git add ., in the top directory. You can skip this section with --untracked-files=no (-uno for short).
One does not need to make use of the flexibility that the explicit staging area gives; one can simply use git add just to add new files, and git commit -a to create the commit from changes to all tracked files. In this case, you would create a commit from both the Changes to be committed and Changes not staged for commit sections.
There is also a terse --short output format. Its --porcelain version is suitable for scripting because it is promised to remain stable, while --short is intended for user output and could change. For the same set of changes, this output format would look something like this:
$ git status --short
A  COPYING
 M Makefile
 D README
R  src/rand.c -> src/random.c
?? NEWS
In this format, the status of each path is shown using a two-letter status code. The first letter shows the status of the index (the difference between the staging area and the last commit), and the second letter shows the status of the worktree (the difference between the working area and the staging area):

Not all the combinations are possible. Status letters A, R, and C are possible only in the first column, for the status of the index.
A special case, ??, is used for the unknown (untracked) files and !! for ignored files (when using git status --short --ignored). Note that not all the possible outputs are described here; the case where we have just done a merge that resulted in merge conflicts is not shown in this table, but is left to be described in Chapter 7, Merging Changes Together.
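The scenario used in this section can be reproduced in a throwaway repository to see the two-column codes for yourself. This is a sketch under assumptions: the file contents are made up, and rename detection in git status is assumed to be enabled, which is the default in current Git.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Joe R. Hacker"
git config user.email "joe@company.com"

# Starting point: README, Makefile, and src/rand.c are committed.
mkdir src
echo "Example project" > README
echo "all:" > Makefile
echo "int random_int(int max);" > src/rand.c
git add README Makefile src/rand.c
git commit -q -m "Initial import"

# The changes described in the text.
echo "Copyright and license" > COPYING
: > NEWS                            # new, empty, and never added
git add COPYING                     # staged new file
rm README                           # deleted in the working directory only
echo "	cc -o random src/random.c" >> Makefile
git mv src/rand.c src/random.c      # staged rename, contents unchanged

# First column: index versus HEAD. Second column: worktree versus index.
status=$(git status --short)
printf '%s\n' "$status"
```

For scripts, git status --porcelain gives the same codes in a format that is promised to stay stable.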
Examining differences from the last revision
If you want to know not only which files were changed (which you get with git status), but also what exactly you have changed, use the git diff command:

Fig 4. Examining the differences between the working directory, staging area, and local Git repository
In the last section, we learned that in Git there are three stages: the working directory, the staging area, and the repository (usually the last commit). Therefore, we have not one set of differences but three, as shown in Fig 4. You can ask Git the following questions:
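- What have you changed but not yet staged, that is, what are the differences between the staging area and working directory?
- What have you staged that you are about to commit, that is, what are the differences between the last commit (HEAD) and staging area?
- What have you changed since the last commit, staged or not, that is, what are the differences between the last commit (HEAD) and working directory?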
To see what you've changed but not yet staged, type git diff with no other arguments. This command compares what is in your working directory with what is in your staging area. These are the changes that could be added, but wouldn't be present if we created a commit with git commit (without -a): Changes not staged for commit in the git status output.
To see what you've staged that will go into your next commit, use git diff --staged (or git diff --cached). This command compares what is in your staging area to the content of your last commit. These are the changes that would be added with git commit (without -a): Changes to be committed in the git status output. You can compare your staging area to any commit with git diff --staged <commit>; HEAD (the last commit) is just the default.
You can use git diff HEAD to compare what is in your working directory with the last commit (or with an arbitrary commit, using git diff <commit>). These are the changes that would be added with the git commit -a shortcut.
If you are using git commit -a, and not making use of the staging area, usually it is enough to use git diff to check the changes which will be in the next commit. The only issue is new files that were added with a bare git add; they won't show in the git diff output unless you use git add --intent-to-add (or its equivalent git add -N) to add new files.
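The three comparisons can be sketched side by side. The example below is illustrative and assumes a throwaway repository; file.txt and new.txt are hypothetical names, and grep is used only to keep the added lines of each diff.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Joe R. Hacker"
git config user.email "joe@company.com"

echo one > file.txt
git add file.txt
git commit -q -m "Initial version"

echo two >> file.txt
git add file.txt            # "two" is staged
echo three >> file.txt      # "three" is only in the working directory

# Keep only the added lines of each diff (skipping the +++ header).
unstaged=$(git diff | grep '^+[^+]')            # working directory vs index
staged=$(git diff --staged | grep '^+[^+]')     # index vs HEAD
both=$(git diff HEAD | grep '^+[^+]')           # working directory vs HEAD

echo "git diff:          $unstaged"
echo "git diff --staged: $staged"
echo "git diff HEAD:"
printf '%s\n' "$both"

# A new file shows up in plain git diff only after git add -N.
echo draft > new.txt
names_before=$(git diff --name-only)
git add -N new.txt
names_after=$(git diff --name-only)
```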
Unified Git diff format
Git, by default and in most cases, will show the changes in unified diff output format. Understanding this output is very important, not only when examining changes to be committed, but also when reviewing and examining changes (for example, in code review, or in finding bugs after git bisect has found the suspected commit).
Note
You can request only statistics of changes with the --stat or --dirstat option, or just names of the changed files with --name-only, or file names with type of changes with --name-status, or tree-level view of changes with --raw, or a condensed summary of extended header information with --summary (see later for an explanation of what extended header means and what information it contains). You can also request word diff, rather than line diff, with --word-diff; though this changes only the formatting of chunks of changes, headers and chunk headers remain similar.
Diff generation can also be configured for specific files or types of files with appropriate gitattributes. You can specify an external diff helper, that is, the command that describes the changes, or you can specify a text conversion filter for binary files (you will learn more about this in Chapter 4, Managing Your Worktree).
If you prefer to examine the changes in a graphical tool (which usually provides side-by-side diff), you can do it by using git difftool in place of git diff. This may require some configuration, and will be explained in Chapter 10, Customizing and Extending Git.
Let's take a look at an example of an advanced diff from Git project history. Let's use the diff from the commit 1088261f from the git.git repository. You can view these changes in a web browser, for example, on GitHub; this is the third patch in this commit:
diff --git a/builtin-http-fetch.c b/http-fetch.c
similarity index 95%
rename from builtin-http-fetch.c
rename to http-fetch.c
index f3e63d7..e8f44ba 100644
--- a/builtin-http-fetch.c
+++ b/http-fetch.c
@@ -1,8 +1,9 @@
 #include "cache.h"
 #include "walker.h"
 
-int cmd_http_fetch(int argc, const char **argv, const char *prefix)
+int main(int argc, const char **argv)
 {
+    const char *prefix;
     struct walker *walker;
     int commits_on_stdin = 0;
     int commits;
@@ -18,6 +19,8 @@ int cmd_http_fetch(int argc, const char **argv,
     int get_verbosely = 0;
     int get_recover = 0;
 
+    prefix = setup_git_directory();
+
     git_config(git_default_config, NULL);
 
     while (arg < argc && argv[arg][0] == '-') {
Let's analyze this patch line by line:
- The first line, diff --git a/builtin-http-fetch.c b/http-fetch.c, is a git diff header in the form diff --git a/file1 b/file2. The a/ and b/ filenames are the same unless rename or copy is involved (such as in our case), even if the file is added or deleted. The --git option means that the diff is in the git diff output format.
- The next lines are one or more extended header lines. The first three lines in this example tell us that the file was renamed from builtin-http-fetch.c to http-fetch.c and that these two files are 95% identical (this information was used to detect the rename):
similarity index 95%
rename from builtin-http-fetch.c
rename to http-fetch.c
Note
Extended header lines describe information that cannot be represented in an ordinary unified diff (except for the information that the file was renamed). Besides the similarity (or dissimilarity) score, as in this example, they can describe changes in file type (for example, from non-executable to executable).
- The last line in the extended diff header, which in this example is index f3e63d7..e8f44ba 100644, tells us about the mode of the given file (100644 means that it is an ordinary file and not a symbolic link, and that it doesn't have the executable permission bit; these three are the only file permissions tracked by Git), and about the shortened hashes of the pre-image (the version of the file before the given change) and post-image (the version of the file after the change). This line is used by git am --3way to try to do a three-way merge if the patch cannot be applied by itself. For new files, the pre-image hash is 0000000; the same goes for the post-image hash of deleted files.
- Next is the unified diff header, which consists of two lines:
--- a/builtin-http-fetch.c
+++ b/http-fetch.c
Compared to the diff -U result, it doesn't have from-file-modification-time or to-file-modification-time after the source (pre-image) and destination or target (post-image) filenames. If the file was created, the source would be /dev/null; if the file was deleted, the target would be /dev/null.
Note
If you set the diff.mnemonicPrefix configuration variable to true, in place of the a/ prefix for pre-image and b/ for post-image in this two-line header, you can instead have the c/ prefix for commit, i/ for index, w/ for worktree, and o/ for object, respectively, to show what you compare.
- Next come one or more hunks of differences; each hunk shows one area where the files differ. Unified format hunks start with the line describing where the changes were in the file:
@@ -1,8 +1,9 @@
This line is in the format @@ from-file-range to-file-range @@. The from-file-range is in the form -<start line>,<number of lines>, and to-file-range is +<start line>,<number of lines>. Both start-line and number-of-lines refer to the position and length of the hunk in the pre-image and post-image, respectively. If number-of-lines is not shown, it means that it is 1. In this example, the changes, both in the pre-image (file before the changes) and post-image (file after the changes), begin at the first line of the file, and the fragment of code corresponding to this hunk of diff has 8 lines in the pre-image, and 9 lines in the post-image (one line is added). By default, Git will also show three unchanged lines surrounding changes (three context lines). Git will also show the function where each change occurs (or equivalent, if any, for other types of files; this can be configured with .gitattributes); it is like the -p option in GNU diff:
@@ -18,6 +19,8 @@ int cmd_http_fetch(int argc, const char **argv,
- Next is the description of where and how the files differ. The lines common to both files begin with a space (" ") indicator character. The lines that actually differ between the two files have one of the following indicator characters in the left print column:
  - "+": A line was added here to the second file
  - "-": A line was removed here from the first file
Note
A changed line is denoted as removing the old version and adding the new version of the line.
In the plain word-diff format, instead of comparing file contents line by line, added words are surrounded by {+ and +}, while removed words are surrounded by [- and -].
. - If the last hunk includes, among its lines, the very last line of either version of the file, and that last line is incomplete, (which means that the file does not end with the end-of-line character at the end of hunk) you would find:
\ No newline at end of file
This situation is not present in the presented example.
So, for the example used here, the first hunk means that cmd_http_fetch was replaced by main and the const char *prefix; line was added:
 #include "cache.h"
 #include "walker.h"
 
-int cmd_http_fetch(int argc, const char **argv, const char *prefix)
+int main(int argc, const char **argv)
 {
+    const char *prefix;
     struct walker *walker;
     int commits_on_stdin = 0;
     int commits;
See how for the replaced line, the old version of the line appears as removed (-) and the new version as added (+).
In other words, before the change, the appropriate fragment of the file, which was then named builtin-http-fetch.c, looked similar to the following:
#include "cache.h"
#include "walker.h"

int cmd_http_fetch(int argc, const char **argv, const char *prefix)
{
    struct walker *walker;
    int commits_on_stdin = 0;
    int commits;
After the change, this fragment of the file, which is now named http-fetch.c, looks similar to this instead:
#include "cache.h"
#include "walker.h"

int main(int argc, const char **argv)
{
    const char *prefix;
    struct walker *walker;
    int commits_on_stdin = 0;
    int commits;
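To see the same header structure on a small scale, you can produce a rename with a one-line change yourself. This is a minimal sketch in a throwaway repository; old.txt and new.txt are hypothetical names, and the exact similarity percentage Git reports is not assumed here.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Joe R. Hacker"
git config user.email "joe@company.com"

# A ten-line file, committed under its old name.
for i in 1 2 3 4 5 6 7 8 9 10; do echo "line $i"; done > old.txt
git add old.txt
git commit -q -m "Add old.txt"

# Rename the file and change a single line in it.
git mv old.txt new.txt
sed 's/^line 5$/line five/' new.txt > new.txt.tmp
mv new.txt.tmp new.txt
git add new.txt

# -M asks for rename detection explicitly.
patch=$(git diff --staged -M)
printf '%s\n' "$patch"
```

The output starts with the diff --git line, followed by the similarity index, rename from, rename to, and index extended header lines, then the two-line unified diff header and a single hunk. The hunk header reads @@ -2,7 +2,7 @@ because the changed line 5 is shown with three context lines on each side.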
Selective commit
Sometimes, after examining the pending changes as explained, you realize that you have two (or more) unrelated changes in your working directory that should belong to two different logical changes; it is the tangled working copy problem. You need to put those unrelated changes into separate commits, as separate changesets. This is the type of situation that can occur even when trying to follow the best practices.
One solution is to create the commit as-is, and fix it later (split it in two). You can read how to do this in Chapter 8, Keeping History Clean.
Sometimes, however, some of the changes are needed now and must be shipped immediately (for example, a bug fix to a live website), while the rest of the changes are work in progress and not ready. You need to tease those changes apart into two separate commits.
Selecting files to commit
The simplest situation is when these unrelated changes touch different files. For example, if the bug was in the view/entry.tmpl file and only in this file, and there were no other changes to this file, you can create a bug fix commit with the following command:
$ git commit view/entry.tmpl
This command will ignore changes staged in the index (what was in the staging area), and instead record the current contents of a given file or files (what is in the working directory).
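The effect of naming a file on the git commit command line can be sketched as follows. The example assumes a throwaway repository; view/entry.tmpl is the file from the text, while lib/helper.c and all file contents are hypothetical.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Joe R. Hacker"
git config user.email "joe@company.com"

mkdir view lib
echo "entry v1" > view/entry.tmpl
echo "helper v1" > lib/helper.c
git add view/entry.tmpl lib/helper.c
git commit -q -m "Initial import"

# An unrelated work-in-progress change is already staged...
echo "helper wip" > lib/helper.c
git add lib/helper.c
# ...and the bug fix is only in the working directory.
echo "entry fixed" > view/entry.tmpl

# Commit just the named file, taking its contents from the working directory.
git commit -q -m "Fix entry template" view/entry.tmpl

committed=$(git diff-tree --no-commit-id --name-only -r HEAD)
still_staged=$(git diff --staged --name-only)

echo "in the new commit: $committed"
echo "still staged:      $still_staged"
```

The work-in-progress change stays in the staging area and can be committed later.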
Interactively selecting changes
Sometimes, however, the changes cannot be separated in this way. The changes to the file are tangled together. You can try to tease them apart by giving the --interactive option to git commit:
$ git commit --interactive
           staged     unstaged path
  1:    unchanged        +3/-2 Makefile
  2:    unchanged       +64/-1 src/rand.c

*** Commands ***
  1: status       2: update       3: revert       4: add untracked
  5: patch        6: diff         7: quit         8: help
What now>
Here, Git shows us the status and the summary of changes to the working area (unstaged) and to the staging area / the index (staged), which is the output of the status subcommand. The changes are described by the number of added and deleted lines (similar to what the git diff --numstat command shows):
What now> h
status        - show paths with changes
update        - add working tree state to the staged set of changes
revert        - revert staged set of changes back to the HEAD version
patch         - pick hunks and update selectively
diff          - view diff between HEAD and index
add untracked - add contents of untracked files to the staged set of changes
*** Commands ***
  1: status       2: update       3: revert       4: add untracked
  5: patch        6: diff         7: quit         8: help
To tease apart changes, you need to choose the patch subcommand (for example, with 5 or p). Git will then ask for the files with the Patch update>> prompt; you then need to select the files to selectively update with their numeric identifiers, as shown in the status, and press Return. You can say * to select all the files possible. After making the selection, end it by answering with an empty line. (You can skip directly to patching files with the --patch option.)
Git will then display all the changes to the specified files on a hunk-by-hunk basis, and let you choose, among others, one of the following options for each hunk:
y - stage this hunk
n - do not stage this hunk
q - quit; do not stage this hunk or any of the remaining ones
s - split the current hunk into smaller hunks
e - manually edit the current hunk
? - print help
The hunk output and the prompt look similar to this:
@@ -16,7 +15,6 @@ int main(int argc, char *argv[])
     int max = atoi(argv[1]);
+    srand(time(NULL));
     int result = random_int(max);
     printf("%d\n", result);
Stage this hunk [y,n,q,a,d,/,j,J,g,e,?]? y
In many cases, it is enough to simply select which of those hunks of changes you want to have in the commit. In extreme cases, you can split a hunk into smaller pieces, or even manually edit the diff.
Creating a commit step by step
Interactively selecting changes to commit with git commit --interactive unfortunately does not let you test the changes to be committed. You can always check that everything works after creating a commit (compile and/or run tests), and then amend it if there are any errors. There is, however, an alternative solution.
You can prepare the commit by putting the pending changes into the staging area with git add --interactive, or an equivalent solution (such as a graphical commit tool for Git, for example, git gui). The interactive commit is just a shortcut for interactive add followed by commit, anyway. Then you should examine these changes with git diff --cached, modifying them as appropriate with git add <file>, git checkout <file>, and git reset <file>.
In theory, you should also test whether these changes are correct, checking that at least they do not break the build. To do this, first use git stash save --keep-index (spelled git stash push --keep-index in newer versions of Git, where save is deprecated) to save the current state and bring the working directory to the state prepared in the staging area (the index). After this command, you can run tests (or at least check whether the program compiles and doesn't crash). If tests pass, you can then run git commit to create a new revision. If tests fail, you should restore the working directory while keeping the staging area state with the git stash pop --index command; it might be required to precede it with git reset --hard. The latter might be needed because Git is overly conservative when preserving your work, and does not know that you have just stashed. First, the uncommitted changes in the index prevent Git from applying the stash, and second, the changes to the working directory are the same as the stashed ones, so of course they would conflict.
You can find more information about stashes, including how they work, in Chapter 4, Managing Your Worktree.
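The successful path of this workflow can be sketched as follows. This is an illustrative example in a throwaway repository: it uses git stash push, the newer spelling of git stash save, the file names are hypothetical, and a simple grep stands in for a real test suite.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Joe R. Hacker"
git config user.email "joe@company.com"

echo "fix: pending" > fix.txt
echo "feature: none" > wip.txt
git add fix.txt wip.txt
git commit -q -m "Initial import"

# Two changes are pending; only the fix is staged for the next commit.
echo "fix: ready" > fix.txt
echo "feature: half done" > wip.txt
git add fix.txt

# Stash everything, but leave the working directory in the staged state.
git stash push -q --keep-index
wip_during_test=$(cat wip.txt)      # the unstaged change is out of the way

# Stand-in for "run the tests" on exactly what will be committed.
grep -q "ready" fix.txt

git commit -q -m "Apply the fix"

# Bring the work in progress back on top of the new commit.
git stash pop -q
wip_after=$(cat wip.txt)
status=$(git status --short)

echo "during tests: $wip_during_test"
echo "after pop:    $wip_after"
printf '%s\n' "$status"
```

If the test step fails instead, the text above describes the recovery: git stash pop --index, possibly preceded by git reset --hard.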
Amending a commit
One of the better things in Git is that you can undo almost anything; you only need to know how. No matter how carefully you craft your commits, sooner or later, you'll forget to add a change, or mistype the commit message. That's when the --amend flag of the git commit command comes in handy; it allows you to change the very last commit really easily. Note that you can also amend the merge commits (for example, fix a merging error).
Note
If you want to change a commit deeper in history (assuming that it was not published, or at least, there isn't anyone who based their work on the old version of the said commit), you need to use interactive rebase or some specialized tool, such as StGit (a patch stack management interface on top of Git). Refer to Chapter 8, Keeping History Clean, for more information.

Fig 5. The DAG of revisions, C1 to C5, before amending the topmost (most recent) and currently checked out commit, which is named C5. Here, we have used numbers instead of SHA-1 to be able to indicate related commits.
If you just want to correct the commit message, you simply commit again, without any staged changes, and fix it (note that we use git commit without the -a / --all flag):
$ git commit --amend
If you want to add some more changes to that last commit, you can simply stage them as normal with git add and then commit again as shown in the preceding example, or make the changes and use git commit -a --amend:

Fig 6. The DAG of revisions after amending the last commit (revision C5) in Fig 5. Here, the new commit C5 is the old commit C5 with changes (amended); it takes the old commit's place in history.
There is a very important caveat: you should never amend a commit that has already been published! This is because amend effectively produces a completely new commit object that replaces the old one, as can be seen in Fig 6. If you're the only person who had this commit, doing this is safe. However, after publishing the original commit to a remote repository, other people might already have based their new work on that version of the commit. Replacing the original with an amended version will cause problems downstream. You will find more about this issue in Chapter 8, Keeping History Clean.
If you try to push (publish) a branch with the published commit amended, Git would prevent overwriting the published history, and ask you to force push if you really want to replace the old version (unless you configure it to force push by default). The old version of the commit before amending would be available in the branch reflog and in the HEAD reflog; for example, just after the amend, it would be available as @{1}. Git would keep the old version for a month, by default, unless manually purged.
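That an amend creates a new commit object, and that the old one stays reachable through the reflog, can be checked with a short sketch. The example assumes a throwaway repository; the file name and commit messages are hypothetical.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Joe R. Hacker"
git config user.email "joe@company.com"

echo one > file.txt
git add file.txt
git commit -q -m "First revision"

echo two >> file.txt
git commit -q -a -m "Secnod revision"       # mistyped commit message

old=$(git rev-parse HEAD)
old_parent=$(git rev-parse HEAD^)

# Fix the message of the last commit; nothing new is staged.
git commit -q --amend -m "Second revision"

new=$(git rev-parse HEAD)
new_parent=$(git rev-parse HEAD^)
from_branch_reflog=$(git rev-parse "@{1}")  # previous value of the branch
from_head_reflog=$(git rev-parse "HEAD@{1}")

echo "before amend: $old"
echo "after amend:  $new"
echo "reflog @{1}:  $from_branch_reflog"
```

The amended commit has the same parent as the original but a different identifier, which is why replacing an already published commit causes trouble downstream.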