Needless to say, git is good. I was using SVN earlier, and we migrated to git in mid-2019. Yeah, finally! SVN has its limitations, but it worked well for us, a team that sat together. Git is feature-rich and has very good command-line support; it is so good that even on Windows I use Git Bash. However, it comes with a learning curve, simply because of its local and remote repository concepts and the sheer number of options that other version control systems lack. The basics of git are easy, but as you venture beyond them it can get a little hard. Still, all your day-to-day needs can be taken care of with basic knowledge. By default, my dev environment had an older version of git installed, which had many limitations. The latest version is much better and has a simpler syntax, so I recommend installing it.
I went through multiple online materials and video lectures, and tried different things, until my team and I got comfortable with git. As I started using git, I found that there was no short and sweet documentation. This motivated me to create this post. My aim is to give an introduction to git and provide just enough knowledge for one to comfortably perform the basic tasks with git.
Initializing & cloning the repository
The first thing one needs to do is copy a git repository. I am assuming that there is a repository available. Now go to the folder where you want to copy the repository. I personally create a folder called “dev” on my machine and run these commands there.
$ mkdir dev
$ cd dev
$ git init
The “git init” command initializes a new, empty git repository: it creates a .git folder inside the dev folder. Strictly speaking, you only need it when you start a repository from scratch; “git clone” creates its own .git folder, so you can skip “git init” if all you want is a copy of an existing repository. The next thing one needs to do is clone a repository from some remote repository. Here I am using https://github.com/myWorks/testing.git as the remote repository from which I will be cloning. If you don’t have a repository of your own, you can create one on GitHub for free.
$ git clone https://github.com/myWorks/testing.git
OR
$ git clone https://github.com/myWorks/testing.git myWork
The “git clone” command copies all data from the remote repository to a local repository on your system. I have given two options. The first command clones the repository into a new directory named after the repository (here, testing) inside the current directory. If you want the repository under a directory name of your own, use the second command, which creates a directory called myWork inside the dev folder. Either way, go inside the newly created folder (testing or myWork) before running the commands that follow.
Note: The git URL used is not a real URL
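Since the URL above is only a placeholder, here is a minimal sketch for trying both forms of “git clone” without any server. It assumes a throwaway bare repository on the local disk standing in for the remote; the scratch directory, the seed repository, and the Demo User identity are all made up for illustration.

```shell
#!/bin/sh
# Sketch: practise "git clone" against a local stand-in for the remote repository.
set -eu

# Hypothetical identity so that commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)                 # scratch area; everything below lives here
cd "$work"

# Stand-in for the remote: a bare repository seeded with one commit.
git init -q --bare testing.git
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git push -q "$work/testing.git" master
git -C "$work/testing.git" symbolic-ref HEAD refs/heads/master
cd "$work"

mkdir dev
cd dev
git clone -q "$work/testing.git"          # creates dev/testing
git clone -q "$work/testing.git" myWork   # creates dev/myWork
ls
```

Both clones end up as sub-folders of dev, each with its own .git folder, which is why no “git init” is needed before cloning.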
Show Log
Now that you have cloned the repository, you may like to see the check-ins that went into it, i.e. you want to look at the history. The default command is:
$ git log
However, this is quite verbose. The command walks through every commit in the repository's history. The data is displayed one page at a time, similar to when you use the more command. If you are interested in seeing only the last 3 commits instead of all of them, use this:
$ git log -3
Or maybe you are interested in seeing the check-ins in a particular directory. Then go to that directory and run:
$ git log -3 .
If you want to look at the history for a particular file, do this:
$ git log -3 <filename>
All the above commands are quite wordy. A less verbose, condensed alternative is:
$ git log --oneline
None of the commands listed above show the files that were checked in or changed with each commit; they mainly show the comment for each commit. As I am used to seeing the checked-in files while viewing logs in SVN, I prefer to use the command below:
$ git log --oneline --stat .
Below is a command that shows the log nicely and in a condensed format:
$ git log --pretty=format:"%h %ad | %s%d [%an]" --graph --date=short --stat
The command is impossible to remember. It would be crazy to type it every time I need to see the logs. This is my problem with git: it is feature-rich, which makes things hard to remember. But not to worry, git has a way to create aliases (not the shell aliases that we are used to on Unix/Linux systems, but git's own aliases). Below is how we can create an alias within git:
$ git config --global alias.showlog "log --pretty=format:'%h %ad | %s%d [%an]' --graph --date=short --stat"
$ git showlog
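Now that you have created an alias, let's look at all the aliases that you have. Git has no built-in “alias” command, but aliases are ordinary configuration entries, so you can list them using:
$ git config --get-regexp "^alias\."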
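The log commands and the alias can be tried end to end on a scratch repository. The sketch below makes a few assumptions that are not part of the steps above: the repository, the file notes.txt and the Demo User identity are made up, the alias is stored in the repository's own config (no --global) so your personal settings stay untouched, and --no-pager is used so the script does not stop at a pager.

```shell
#!/bin/sh
# Sketch: git log variants and a git alias on a throwaway repository.
set -eu

# Hypothetical identity so that commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults

# Four small commits so that there is some history to look at.
for n in 1 2 3 4; do
    echo "line $n" >> notes.txt
    git add notes.txt
    git commit -q -m "Add line $n"
done

git --no-pager log -3 --oneline            # last three commits, one line each
git --no-pager log -2 --oneline --stat .   # same, plus the files touched

# Repository-local alias (drop-in for the --global version).
git config alias.showlog "log --pretty=format:'%h %ad | %s%d [%an]' --graph --date=short --stat"
git --no-pager showlog -2

# Aliases are plain config entries, so they can be listed like any other setting.
git config --get-regexp '^alias\.'

count=$(git log -3 --oneline | wc -l | tr -d ' ')
echo "commits shown by 'git log -3': $count"
```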
Git’s environments: local to remote
Remote Environment
         +----------------------------+
         |     Remote Repository      |------>-------+
         +----------------------------+              |
             A              |                        |
             | Push         | Clone / Fetch          | Pull / Clone
_____________|______________|________________________|______________
             |              V                        |
         +----------------------------+              |
  +--<---|      Local Repository      |------<-------+
  |      +----------------------------+              |
  |          A                                       |
  | Checkout | Commit                                |
  |      +----------------------------+              |
  |      |        Staging Area        |              |
  |      +----------------------------+              |
  |          A                                       |
  |          | Add                                   V
  |      +----------------------------+              |
  +-->---|     Working Directory      |------<-------+
         +----------------------------+
Local Environment
The boxes here describe the 4 levels where data exists. The lowest level is the working directory, where you make changes. These are plain files on the file system. The next level is the staging area, where you keep things that are ready to go into your local repository. It is called the staging area because the data is not yet in the repository. Once you commit, data moves from the staging area to the local repository. This repository is on your machine, and changes in the local repository will not be reflected in the remote repository until they are pushed to it.
Below are the commands from the above illustration along with their brief descriptions:
- add: stages local changes from the working directory to the staging area
- commit: commits only what is in the staging area, i.e. moves it from the staging area to the local repository
- push: pushes content from the local repository to the remote repository
- checkout: switches the current branch, copying data from the local repository to the working directory
- clone: creates a local repository and working directory from the remote repository
- fetch: updates the references for remote branches in the local repository
- pull: fetches and then merges into the local branch and working directory
Here you may pause and look at remote and local repositories in detail.
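To see the four levels in action, the following sketch moves one file from the working directory all the way to a remote. It rests on assumptions of mine: the “remote” is just a bare repository in a scratch directory, and the file name notes.txt and the Demo User identity are made up.

```shell
#!/bin/sh
# Sketch: one change travelling working directory -> staging -> local -> remote.
set -eu

# Hypothetical identity so that commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

base=$(mktemp -d)
git init -q --bare "$base/remote.git"                 # stand-in for the remote repository
git -C "$base/remote.git" symbolic-ref HEAD refs/heads/master
git clone -q "$base/remote.git" "$base/local"         # git may warn that the repository is empty
cd "$base/local"
git symbolic-ref HEAD refs/heads/master               # pin the branch name regardless of local defaults

echo "first draft" > notes.txt     # 1. exists only in the working directory
git status --short                 #    "?? notes.txt" means untracked

git add notes.txt                  # 2. copied into the staging area
git status --short                 #    "A  notes.txt" means staged as a new file

git commit -q -m "Add notes"       # 3. recorded in the local repository
git push -q origin master          # 4. published to the remote repository

local_head=$(git rev-parse HEAD)
remote_head=$(git -C "$base/remote.git" rev-parse master)
echo "local:  $local_head"
echo "remote: $remote_head"
```

After the push, both repositories point at the same commit, which is the sign that nothing is waiting locally.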
Branching
Now that you have the repository, you can create a branch of your own and start working there. A branch starts out as a local line of development, split off from an existing branch, where you can make your changes. You give it a name, and later you can also push it to the remote repository. Pushing your branch ensures that your changes are also stored remotely, so that if your machine crashes you can retrieve them from the remote repository.
Let's start by finding out our current branch and its name:
$ git branch
* master
We are on the master branch. Ideally we should not make changes directly on master. Because master is set as the default branch, it is the branch we got when we ran “git clone”. To create a new branch called myDev that starts as a copy of master, run the following commands:
$ git pull # ensure you have the latest copy of the repository
$ git checkout -b myDev
Now execute git branch again and you will see that you are in myDev branch.
$ git branch
master
* myDev
The new branch is created locally on your system. To push the new local branch to the remote repository, do the following:
$ git push --set-upstream origin myDev
You can verify the remote branch by checking the branches on the GitHub page.
Switching between branches
Now that you have your own branch and a master branch, you can switch between them. This is easy to do using the commands below:
$ git branch
master
* myDev
$ git checkout master
$ git branch
* master
myDev
$ git checkout myDev
$ git branch
master
* myDev
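Changes that you have committed stay on their branch and are preserved when you switch. Uncommitted changes travel with you in the working directory, and git will refuse to switch if they would be overwritten, so it is best to commit (or stash) before switching.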
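What happens to your files on a switch is easy to check on a scratch repository. In this sketch the repository, the file names dev_only and scratch.txt, and the Demo User identity are made up; it shows one committed file that stays behind on its branch and one uncommitted file that follows you.

```shell
#!/bin/sh
# Sketch: what stays on a branch and what follows you when switching.
set -eu

# Hypothetical identity so that commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
echo "base" > file_1
git add file_1
git commit -q -m "Base commit"

git checkout -q -b myDev
echo "dev work" > dev_only
git add dev_only
git commit -q -m "Work on myDev"          # committed: belongs to myDev only

git checkout -q master
if [ -e dev_only ]; then on_master=present; else on_master=absent; fi

echo "scratch" > scratch.txt              # uncommitted and untracked
git checkout -q myDev
if [ -e scratch.txt ]; then scratch=carried; else scratch=lost; fi

current=$(git rev-parse --abbrev-ref HEAD)
echo "dev_only on master: $on_master"
echo "scratch.txt after switching: $scratch"
echo "current branch: $current"
```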
Checking in changes: add, commit, restore
Now that you are on the myDev branch, you can start making changes. Say you modify an existing file, file_1, and create a new file called file_2. You can check what has been modified using:
$ git status .
On branch myDev
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   file_1
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        file_2
no changes added to commit (use "git add" and/or "git commit -a")
The dot (.) at the end tells git to show only the changes made in the current folder. The output above tells you that file_1, which is already in the repository, has been modified but not staged for commit, and that file_2 is in the untracked files list (meaning it’s a new file that git is seeing for the first time).
Say for some reason you don’t want the modifications you made to file_1 and want to go back to the version in the repository. Use the command below to discard the changes:
$ git restore file_1
Now, if you execute “git status .” you will only see file_2 under Untracked files.
Let's quickly make a change in file_1 again. Now “git status .” should show file_1 under the “Changes not staged for commit:” section. Now, to stage these two files, use the commands below.
$ git add file_1
$ git add file_2
This will stage both files. Instead of two commands, you can stage them together with one command by listing the file names one after the other. Now if you run git status, the output will be something like this.
$ git status .
On branch myDev
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   file_1
        new file:   file_2
To unstage, say, file_2 use:
$ git restore --staged file_2
This comes in handy in case of accidental staging. If needed, the same command can also be used on file_1. The restore command is available in newer versions of git (2.23 onwards). If you search the internet for the same operation you will find commands like “git reset HEAD file_2” or “git checkout -- file_2”. Personally, I find these commands a little confusing, and they need a bit more understanding of git.
Note: git cannot recover changes that were never staged or committed. Unstaging with “git restore --staged” is safe, because file_2 stays in the working directory, but the modifications to file_1 that we discarded earlier with “git restore file_1” are lost forever.
Finally, to commit the changes to the local repository, use the command below:
$ git commit
This command opens your configured editor (vi on many Linux-like systems) with information about the files being committed. These are the staged files. The cursor will be on the first line, where you need to write the comment for the commit.
Now, to push the changes from the local repository to the remote repository, you need to execute a push.
$ git push
If your remote repository requires authentication, git will ask you to enter your credentials.
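The whole add, restore and commit cycle from this section can be replayed on a scratch repository. The sketch below assumes git 2.23 or later (for “git restore”), a made-up Demo User identity, and “git commit -m” in place of the editor so that it runs unattended.

```shell
#!/bin/sh
# Sketch: discard an edit, stage two files, unstage one, commit the other.
set -eu

# Hypothetical identity so that commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
echo "original" > file_1
git add file_1
git commit -q -m "Add file_1"

echo "edited" >> file_1          # modify a tracked file
echo "new" > file_2              # create an untracked file

git restore file_1               # throw the edit away: this cannot be undone
restored=$(cat file_1)

echo "edited again" >> file_1
git add file_1 file_2            # stage both files in one command
git restore --staged file_2      # unstage file_2; the file itself stays on disk

staged=$(git diff --staged --name-only)
git commit -q -m "Update file_1"
remaining=$(git status --short)

echo "file_1 after restore: $restored"
echo "staged before commit: $staged"
echo "left after commit:    $remaining"
```

Only file_1 goes into the commit; file_2 is still there afterwards, reported as untracked.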
Remote repository basics
By now you must have cloned a remote repository. Below is how you can view the URL of the remote repository:
$ git remote -v
origin https://github.com/myWorks/testing.git (fetch)
origin https://github.com/myWorks/testing.git (push)
fetch, pull & push
The git fetch command downloads the data to your local repository. It doesn’t automatically merge it with any of your work or modify what you’re currently working on; you have to merge it manually into your work when you’re ready.
The git pull command fetches a remote branch and then merges it into your current branch.
The git push command pushes all the commits that you made in your local repository to the remote repository. This command works only if you cloned from a server to which you have write access and if nobody has pushed in the meantime. If you and someone else clone at the same time, and they push upstream and then you push upstream, your push will rightly be rejected. You’ll have to pull down their work first and incorporate it into yours before you’ll be allowed to push.
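The rejected push is easier to believe once you have seen it. The sketch below simulates two people with two clones of one local bare repository. The names alice and bob, the file names and the identity are made up, and the pull uses --no-rebase because recent git versions want to be told how to reconcile diverged branches.

```shell
#!/bin/sh
# Sketch: a rejected push, then fetch versus pull, using two local clones.
set -eu

# Hypothetical identity so that commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

base=$(mktemp -d)
git init -q --bare "$base/remote.git"                 # stand-in for the remote repository
git -C "$base/remote.git" symbolic-ref HEAD refs/heads/master

git clone -q "$base/remote.git" "$base/alice"         # git may warn that the repository is empty
cd "$base/alice"
git symbolic-ref HEAD refs/heads/master
echo "v1" > shared.txt
git add shared.txt
git commit -q -m "Initial version"
git push -q origin master

git clone -q "$base/remote.git" "$base/bob"           # bob clones before alice's next push

echo "from alice" > alice.txt                         # alice pushes first
git add alice.txt
git commit -q -m "Alice's change"
git push -q origin master

cd "$base/bob"                                        # bob commits on top of the old state
echo "from bob" > bob.txt
git add bob.txt
git commit -q -m "Bob's change"
if git push -q origin master 2>/dev/null; then first_push=accepted; else first_push=rejected; fi

git fetch -q origin                                   # updates origin/master, touches no files
if [ -e alice.txt ]; then after_fetch=present; else after_fetch=absent; fi

git pull -q --no-rebase --no-edit origin master       # fetch + merge into bob's master
git push -q origin master                             # now the push goes through

echo "first push: $first_push"
echo "alice.txt after fetch: $after_fetch"
```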
More on Branching
Branching is one of git’s core features, and it is what sets git apart from other version control systems. In git you can have a separate branch for every feature you are developing. These branches need not exist in the remote repository; they can live only in the local repository for as long as you need them.
Let's say you need to start working on a requirement called “data correction”. First, let's check the available branches using:
$ git branch
* master
The above command shows the list of branches in your local repository. Currently you have only one branch, master. The * indicates that it is the branch you currently have checked out.
To start working on the data correction requirement, we need to create a branch called “data_correction”. So we do the following:
$ git branch data_correction # creates a branch
$ git checkout data_correction # checkout a branch
or you could also do:
$ git checkout -b data_correction # creates and checkout at the same time
If your working directory or staging area has uncommitted changes, you should commit or stash them first. It’s best to have a clean working state when you switch branches. To keep things simple at this point, let's commit them and proceed.
Once you have switched to the branch, you start working on the feature, add or modify a few files, and commit them to the branch.
$ git add additional_file modified_file
$ git commit
Suddenly you need to work on an urgent fix or feature. You should not use the data_correction branch for this; instead, create a new branch from master. To do so, switch back to master and create a branch called urgent_fix.
$ git checkout master
$ git checkout -b urgent_fix
# make your changes
$ git add urgent_change
$ git commit
Now it's time to merge the fix into master.
$ git checkout master
$ git merge urgent_fix
Now that we have delivered urgent_fix, we can delete the branch using:
$ git branch -d urgent_fix
Even though I showed the command line option for merging, it is not the preferred one. Once you are done committing in your personal branch, execute git push to move the changes from your local repository to the remote repository. Next, log into GitHub and create a pull request there. You also have the option to assign a reviewer. Once someone reviews the code, he or she can add review comments and approve or reject the change. Lastly, the reviewer gets the opportunity to merge the change from the personal branch into master or some other branch.
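For practice, the local branch, fix, merge and delete sequence can be run start to finish on a scratch repository. In this sketch the repository, the file app.txt and the Demo User identity are made up; the branch and file names otherwise follow the steps above.

```shell
#!/bin/sh
# Sketch: feature branch, urgent fix branched from master, merge, clean-up.
set -eu

# Hypothetical identity so that commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
echo "base" > app.txt
git add app.txt
git commit -q -m "Base"

git checkout -q -b data_correction        # long-running feature work
echo "correction" > additional_file
git add additional_file
git commit -q -m "Start data correction"

git checkout -q master                    # the urgent fix starts from master,
git checkout -q -b urgent_fix             # not from the unfinished feature
echo "fix" > urgent_change
git add urgent_change
git commit -q -m "Urgent fix"

git checkout -q master
git merge -q urgent_fix                   # fast-forward: master had no new commits
git branch -q -d urgent_fix               # safe to delete once merged

branches=$(git branch --format='%(refname:short)' | tr '\n' ' ')
echo "branches left: $branches"
```

After the merge, master contains urgent_change but not additional_file, because the data_correction work is still waiting on its own branch.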
git diff
When you execute git diff without options, it shows the differences between the file(s) in the working directory and the staging area. If nothing has been staged for a file, the staging area still holds the version from the last commit, so the output is effectively a diff between the working directory and the local repository.
As an example, say we have a file, test, in the local repository (which we added using the steps below):
$ echo "line 1" > test
$ git add test
$ git commit -m "Adding test"
Now add “line 2” at the end of the test file and execute git diff test (see below).
$ echo "line 2" >> test
$ cat test
line 1
line 2
$ git diff test
diff --git a/5_example/test b/5_example/test
index 89b24ec..7bba8c8 100644
--- a/5_example/test
+++ b/5_example/test
@@ -1 +1,2 @@
 line 1
+line 2
Here git diff compared the working directory with the staging area. Since nothing was staged for the file, the staging area still matches the last commit, so what we see is effectively the diff between the working directory and the local repository.
Now let's add the updated file to the staging area and check the diff:
$ git add test
$ git diff test
The git diff test above shows no output. This is because git diff compares the working directory with the staging area, and after git add the file is identical in both. It does not go on to compare against the local repository.
Now let's modify the test file and run git diff test again.
$ echo "line 3" >> test
$ git diff test
diff --git a/5_example/test b/5_example/test
index 7bba8c8..a92d664 100644
--- a/5_example/test
+++ b/5_example/test
@@ -1,2 +1,3 @@
 line 1
 line 2
+line 3
The git diff test above shows the difference between the working directory and the staging area: only line 3 appears, because line 2 is already staged.
Now we can tell git diff to compare the staging area with the last commit instead of starting from the working directory. This can be done using the --staged or --cached option.
$ git diff --staged test
diff --git a/5_example/test b/5_example/test
index 89b24ec..7bba8c8 100644
--- a/5_example/test
+++ b/5_example/test
@@ -1 +1,2 @@
 line 1
+line 2
One can also tell git diff to compare with a specific version in the local repository by using its SHA1 value or a symbolic name.
- HEAD is a symbolic name that, in general, points to the latest commit on the current branch.
$ git diff test
diff --git a/5_example/test b/5_example/test
index 7bba8c8..a92d664 100644
--- a/5_example/test
+++ b/5_example/test
@@ -1,2 +1,3 @@
 line 1
 line 2
+line 3
$ git diff HEAD test
diff --git a/5_example/test b/5_example/test
index 89b24ec..a92d664 100644
--- a/5_example/test
+++ b/5_example/test
@@ -1 +1,3 @@
 line 1
+line 2
+line 3
$ git diff --staged HEAD test
diff --git a/5_example/test b/5_example/test
index 89b24ec..7bba8c8 100644
--- a/5_example/test
+++ b/5_example/test
@@ -1 +1,2 @@
 line 1
+line 2
Now let's commit the changes to the test file one at a time: first the staged line 2, then line 3.
$ git commit -m "adding line 2"
$ git commit -am "adding line 3"
$ git log --graph --pretty=format:'%C(yellow)%h%C(cyan)%d%Creset %s %C(white)- %an, %ar%Creset' -3
* c4fcf55 (HEAD -> master) adding line 3 - siddhartha baidya, 7 minutes ago
* fdd4670 adding line 2 - siddhartha baidya, 8 minutes ago
* 8504aa4 adding test - siddhartha baidya, 49 minutes ago
Now let's see the difference between the test file in the working directory and the first version in the local repository, which has the SHA1 8504aa4.
$ git diff 8504aa4 test
diff --git a/5_example/test b/5_example/test
index 89b24ec..a92d664 100644
--- a/5_example/test
+++ b/5_example/test
@@ -1 +1,3 @@
 line 1
+line 2
+line 3
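The three comparisons can be reproduced side by side on a scratch repository. The sketch below follows the same line 1, line 2, line 3 steps; the repository location and the Demo User identity are made up, and grep keeps only the added lines so that the differences stand out.

```shell
#!/bin/sh
# Sketch: git diff, git diff --staged and git diff HEAD on the same file.
set -eu

# Hypothetical identity so that commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults

echo "line 1" > test
git add test
git commit -q -m "Adding test"            # line 1 is in the local repository

echo "line 2" >> test
git add test                              # line 2 is staged
echo "line 3" >> test                     # line 3 exists only in the working directory

# Keep only the added lines of each diff.
unstaged=$(git diff test | grep '^+line')            # working directory vs staging area
staged=$(git diff --staged test | grep '^+line')     # staging area vs last commit
against_head=$(git diff HEAD test | grep '^+line')   # working directory vs last commit

echo "git diff:          $unstaged"
echo "git diff --staged: $staged"
echo "git diff HEAD:"
echo "$against_head"
```

Plain git diff reports only line 3, --staged reports only line 2, and the comparison against HEAD reports both.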
What Next?
Now that you know how to work with git, the next step is to decide on your workflow and branching strategy. There are different workflows and branching strategies, and they all have their pros and cons. You need to decide which one suits your project. You can start by looking at:
- GitHub Flow
- Git Flow
- You can look at my version of git flow! It is a combination of multiple disciplines, but primarily influenced by trunk based development.
- Lastly, if you need a more in-depth understanding of various branching strategies, then you should look into “Patterns for Managing Source Code Branches” by Martin Fowler.