Getting started with Git
Git is a version control system
Git is open source software, created by Linus Torvalds to manage the code of large open source projects. In this hands-on you will learn how to:
Set up Git;
Use 6 basic Git commands: git init, git status, git add, git commit, git push, git pull;
Recover files and undo changes;
Publish your local repo online on GitHub;
Resolve conflicts.
Prerequisites
Your laptop;
Git installed (the installation instructions here cover all platforms);
Anaconda Distribution.
If you get lost during the workshop, here is an extended version of the Git tutorial that you can use later.
Launching Command Prompt
The commands in this tutorial are typed at the command prompt because of its flexibility. Visual interfaces that perform the same actions are introduced later.
Finishing Git setup
Git relies on your name and email address to identify who did what in the code. You can use any name (not necessarily the one you used when registering on GitHub), but the email address should match the one on your GitHub account.
Let's set that up:
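A minimal sketch of that setup, assuming the identity should apply to every repository on this machine (hence `--global`). The name and email below are placeholders to replace with your own.

```shell
# Placeholder values: replace with your own name and your GitHub email.
git config --global user.name "Your Name"
git config --global user.email "you@example.com"

# Print the stored values to confirm them.
git config --global user.name
git config --global user.email
```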
This name and email address will be associated with all "commits" (bundles of changes) applied to the code.
You may also be aware that different systems encode characters differently. Let's adjust Git's behavior to handle that:
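One common way to do this, assuming the differences in question are line endings (Windows ends lines with CRLF, macOS and Linux with LF), is the `core.autocrlf` setting. The sketch below uses the macOS/Linux value, which is an assumption about your platform.

```shell
# macOS / Linux: convert CRLF to LF when committing, leave files untouched on checkout.
git config --global core.autocrlf input

# Windows: use "true" instead, so that files are checked out with CRLF line endings.
# git config --global core.autocrlf true
```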
We will also configure Git to use a text editor that is easier to use than the default one:
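Git opens an editor whenever a command needs text from you, such as a commit message. A minimal sketch, assuming `nano` is installed and is the editor you want; any other editor command can be substituted.

```shell
git config --global core.editor "nano -w"   # -w asks nano not to hard-wrap long lines
git config --global core.editor             # prints the stored value
```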
Your first Git repository
In any folder that does not already contain a Git repository, type:
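The command in question is `git init`. The sketch below runs it in a throwaway folder so that it can be tried safely; in practice you would run it in your own project folder.

```shell
cd "$(mktemp -d)"   # scratch folder for this example; use your own project folder instead
git init            # creates an empty repository in the current folder
```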
Now let's check that the .git folder was created (it's a marker for Git repositories):
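The `.git` folder is hidden, so a plain listing does not show it. A small sketch using `ls -a` in a scratch repository (on the Windows Command Prompt, `dir /a` plays the same role):

```shell
cd "$(mktemp -d)" && git init -q   # scratch repository for this example
ls -a                              # -a includes hidden entries such as ".git"
```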
Start tracking the history of a file, for example README.md:
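Tracking starts with `git add`. In this sketch the content written to README.md is only an illustration.

```shell
cd "$(mktemp -d)" && git init -q   # scratch repository for this example
echo "# My project" > README.md    # hypothetical file content
git add README.md                  # Git now tracks this file
git ls-files                       # lists the tracked files
```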
What did we add to what? Introduction to the staging area concept in Git:
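`git add` copies the current content of a file into the staging area (also called the index), which holds the changes that will go into the next commit. `git status` shows what is staged and what is not. A sketch with two hypothetical files, only one of which is staged:

```shell
cd "$(mktemp -d)" && git init -q   # scratch repository for this example
echo "# My project" > README.md
echo "draft" > notes.txt           # hypothetical second file, never added
git add README.md
git status                         # long form, with hints on what to do next
git status --short                 # "A  README.md" is staged, "?? notes.txt" is untracked
```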
Save the changes made to tracked files with a "good" message (read more on how to write commit messages here):
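A commit records everything currently in the staging area. The sketch below sets a placeholder identity inside the scratch repository so that it runs even where no global identity is configured; the commit message is only an example.

```shell
cd "$(mktemp -d)" && git init -q          # scratch repository for this example
git config user.name "Your Name"          # placeholder identity, local to this repository
git config user.email "you@example.com"
echo "# My project" > README.md
git add README.md
git commit -m "Add README with project title"
git log -1 --format=%s                    # prints the subject line of the commit just made
```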
Check what you have changed since the last "staged" version (for all modified files):
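`git diff` compares the files on disk with what is in the staging area. A sketch in which one line is staged and a second line is added afterwards:

```shell
cd "$(mktemp -d)" && git init -q   # scratch repository for this example
echo "first line" > README.md
git add README.md                  # stage the current content
echo "second line" >> README.md    # change made after staging
git diff                           # shows "+second line" as an unstaged change
git diff --staged                  # shows what is staged for the next commit
```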
Check the past commits:
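`git log` lists commits from newest to oldest. The sketch below creates two example commits and then prints the history in full and in one-line form.

```shell
cd "$(mktemp -d)" && git init -q          # scratch repository for this example
git config user.name "Your Name"          # placeholder identity, local to this repository
git config user.email "you@example.com"
echo "v1" > README.md && git add README.md && git commit -q -m "First version"
echo "v2" > README.md && git commit -q -am "Second version"
git log             # author, date and message of every commit
git log --oneline   # one line per commit: short hash and subject
```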
Forgetting changes, recovering files, moving around the commit tree
Detaching the HEAD (by mistake? on purpose?):
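HEAD normally points at a branch. Checking out a commit directly makes HEAD point at that commit instead, which Git calls a "detached HEAD". A sketch, using `HEAD~1` (the commit before the latest one) where you would usually paste a hash from `git log`:

```shell
cd "$(mktemp -d)" && git init -q          # scratch repository for this example
git config user.name "Your Name"          # placeholder identity, local to this repository
git config user.email "you@example.com"
echo "v1" > README.md && git add README.md && git commit -q -m "First version"
echo "v2" > README.md && git commit -q -am "Second version"

branch=$(git symbolic-ref --short HEAD)   # remember the branch name (master or main)
git checkout HEAD~1                       # HEAD is now detached at the first commit
cat README.md                             # the file is back to its first version
git checkout "$branch"                    # reattach HEAD to the branch
```

Checking out the branch again is enough to get back, as long as no commits were made while detached.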
Getting a single file back (i.e. discard changes in the local file tree, or recover the state of a given file at a given commit):
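Both cases can be handled with `git checkout` followed by `--` and a file name. The sketch below is one way to do it in a scratch repository; newer Git versions also offer `git restore` for the same purpose.

```shell
cd "$(mktemp -d)" && git init -q          # scratch repository for this example
git config user.name "Your Name"          # placeholder identity, local to this repository
git config user.email "you@example.com"
echo "v1" > README.md && git add README.md && git commit -q -m "First version"
echo "v2" > README.md && git commit -q -am "Second version"

echo "oops" > README.md
git checkout -- README.md          # discard the uncommitted edit: the file is "v2" again
git checkout HEAD~1 -- README.md   # take the file as it was one commit earlier: "v1"
git status --short                 # "M  README.md": the old content is staged, HEAD has not moved
```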
Ignoring things is very useful
Create a ".gitignore" file in the root folder of your repository
The file contains a list of "filters" that determine whether or not a change should be tracked by Git. Filters may look like:
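A sketch of such a file, written from the command line. The three patterns are a plausible reconstruction rather than a required set, and the sample files are hypothetical, created only to show the effect.

```shell
cd "$(mktemp -d)" && git init -q   # scratch repository for this example
cat > .gitignore <<'EOF'
*.temp
.ipynb_checkpoints/
intermediate_results/
EOF

# Hypothetical files that match the patterns above.
mkdir intermediate_results
touch scratch.temp intermediate_results/out.csv
git status --short   # only "?? .gitignore" is listed; the other files are ignored
```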
(Ignoring all changes to files with the .temp extension, temporary checkpoints created by Jupyter Notebook or Lab, and all files in the "intermediate_results" folder)
Collaboration - working with a remote
Objective: put the artifacts and their history in a location where they can be retrieved by others, who can then contribute to the project.
Step 1: Create the shared repository (on GitLab, create an empty repository, i.e. without a README file, which makes the next steps easier).
Step 2: Link your local repository to the shared repository. The link is called a "remote" (the remote branch that your local branch follows is its "upstream"). The shared repository is where you get the latest version from (fetch) and where you send your latest commits (push).
This information is provided by GitLab when you create the shared repository: you get the repository URL from GitLab/GitHub.
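The link is created with `git remote add`. So that the sketch runs without a network account, a local bare repository stands in for the empty GitLab/GitHub project; with a real host you would paste the repository URL in its place. The name `origin` is the conventional choice, not a requirement.

```shell
work=$(mktemp -d)                              # scratch area for this example
git init -q --bare "$work/shared.git"          # stand-in for the empty shared repository
git init -q "$work/local" && cd "$work/local"  # your local repository

git remote add origin "$work/shared.git"       # with a real host: the URL from GitLab/GitHub
git remote -v                                  # lists the fetch and push addresses
```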
With this, the following commands will push and pull changes to and from the "remote":
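A sketch of the two commands, again with a local bare repository standing in for the hosted one. The `-u` option on the first push records which remote branch the local branch follows, so that later calls can be just `git push` and `git pull`.

```shell
work=$(mktemp -d)                              # scratch area for this example
git init -q --bare "$work/shared.git"          # stand-in for the shared repository
git init -q "$work/local" && cd "$work/local"
git config user.name "Your Name"               # placeholder identity, local to this repository
git config user.email "you@example.com"
git remote add origin "$work/shared.git"

echo "# My project" > README.md
git add README.md && git commit -q -m "Add README"
git push -u origin HEAD   # first push of the current branch; -u sets its upstream
git pull                  # later on: fetch and merge what others have pushed
```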
Getting started with collaborating
Getting someone else's code (the GitLab/GitHub interface makes this easier):
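Copying a repository is done with `git clone`. In this sketch a small local repository plays the role of the other person's project; with a real host you would pass the URL shown on the project page. The folder name `my-copy` is hypothetical.

```shell
work=$(mktemp -d)                               # scratch area for this example
git init -q "$work/theirs" && cd "$work/theirs" # stand-in for someone else's repository
git config user.name "Someone Else"             # placeholder identity
git config user.email "someone@example.com"
echo "hello" > README.md && git add README.md && git commit -q -m "Add README"

cd "$work"
git clone "$work/theirs" my-copy   # with a real host: git clone <repository URL>
cd my-copy
git remote -v                      # "origin" already points at the source
```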
Note that "git clone" sets up the "remote" automatically.
Work in groups: pick one of your repositories, and make sure every member of the group has a local copy.
Everyone pulls changes from the remote (so that we are all on the same revision of the artefacts).
Action 1: everyone touches a _different_ file. Commit, push. Observe how nicely it runs.
Action 2: everyone touches the _same file_, observe push failure. Find out who "won" the commit race.
Everyone who could not push must pull, then check the content of the conflicting files.
Remove the conflict markers and edit the content so that it makes sense. Commit, push. Check if the others were faster than you!
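The whole race can be replayed on one machine. In this sketch two hypothetical collaborators, Alice and Bob, work in two clones of a local bare repository that stands in for the shared one. `--no-rebase` asks `git pull` to merge, which is an assumption about how you want to reconcile the two histories.

```shell
work=$(mktemp -d)
git init -q --bare "$work/shared.git"            # stand-in for the shared repository

git clone -q "$work/shared.git" "$work/alice"    # warns that the repository is empty
cd "$work/alice"
git config user.name "Alice" && git config user.email "alice@example.com"
echo "Title" > README.md
git add README.md && git commit -q -m "Add README"
git push -q -u origin HEAD

git clone -q "$work/shared.git" "$work/bob"
cd "$work/bob"
git config user.name "Bob" && git config user.email "bob@example.com"

# Both edit the same line; Alice pushes first and wins the race.
cd "$work/alice"
echo "Title by Alice" > README.md && git commit -q -am "Alice edits title" && git push -q
cd "$work/bob"
echo "Title by Bob" > README.md && git commit -q -am "Bob edits title"
git push || echo "push rejected: the remote has a commit that Bob lacks"

git pull --no-rebase || true   # the merge stops with a conflict in README.md
cat README.md                  # both versions, separated by <<<<<<<, ======= and >>>>>>> lines

# Resolve: replace the marked section with the intended content, then commit and push.
echo "Title by Alice and Bob" > README.md
git add README.md && git commit -q -m "Merge both title edits"
git push -q
```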
Notes
Documentation
Leave a README file at the root of the repository. This file must contain a general description of the content of the repository. The README file either indicates where the rest of the documentation is or fully describes the project. Up to you!
Things to document:
What is the project about
Key technology used in the project (programming language, tools used to compile/run)
Folder structure (what's what in here)
How to run, test, compile, execute content of the repository
commands to run the tool
where input/output files are located and what they contain
how to compile your thesis (if you use LaTeX)
License
The code you will be putting in GitLab / GitHub may be private or public. Your choice. If made public, and with the intent of sharing it with others (regardless of who they are), you MUST pick a license for your work.
In general, use an open license, allowing others to do what they want with your code. Worst license: no license (basic copyright laws apply, and you can't find out what those are...)
Miscellaneous
If you would like to look through a more comprehensive tutorial on Git (written for software engineers), this is a great guide.