Lab Session 2: Dotfiles, Branches, Conflicts and GitHub#

Friday 02-03-2023, 9AM-11AM & 12PM-2PM

Instructor: Facundo Sapienza

Useful links:

1. Dotfiles#

We are going to set up bash, git, conda and more locally, following the dotfiles repository. Take some time to read the README inside the repository. The README is self-explanatory, but you can also follow these instructions.

  1. Fork the dotfiles repository into your personal GitHub account by clicking the Fork button in the upper right corner of the repository page on GitHub.

  2. Clone your forked repository into your home directory in the JupyterHub. Keep two things in mind when cloning: i) clone it as a bare repository (git clone --bare), and ii) name the local repository folder .dotfiles. You can do both with

git clone --bare https://github.com/<YOUR USERNAME>/dotfiles.git $HOME/.dotfiles

  3. Define an alias for manipulating files inside .dotfiles:

alias gdot='git --git-dir=$HOME/.dotfiles --work-tree=$HOME'

  4. Check out the main branch into $HOME:

gdot switch -f main

If you display the hidden files in your home directory (View > Show Hidden Files), you will see a series of configuration files with names such as .bashrc, .gitconfig, .condarc, etc. Take a moment to explore what is inside these files.

Some of these files contain Fernando’s information. Replace it with yours, then commit and push those changes to your fork of the dotfiles repository. To do so, you can use the gdot alias. For example, if you changed .gitconfig, you can do

gdot add .gitconfig
gdot commit -m "Change username"
gdot push

The command gdot status will show all the untracked files, which in your home directory is likely a lot. To disable this behavior, use the following, which tells git to hide all untracked files and directories in status listings (the files themselves are not touched)

gdot config --local status.showUntrackedFiles no 
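
To see how a bare repository combined with a separate work tree behaves, without touching your real home directory, here is a minimal sketch that plays out the same steps in a throwaway sandbox. The directory name fake_home, the test identity and the sample files are assumptions made up for illustration, and a shell function stands in for the gdot alias because aliases are not expanded in non-interactive scripts.

```shell
# Sandbox: a temporary directory plays the role of $HOME.
sandbox=$(mktemp -d)
fake_home="$sandbox/home"
mkdir -p "$fake_home"
cd "$fake_home"

# A bare repository holds only git's internal data, no checked-out files.
git init -q --bare "$fake_home/.dotfiles"

# Function standing in for the gdot alias (hypothetical paths, same flags).
gdot() {
    git --git-dir="$fake_home/.dotfiles" --work-tree="$fake_home" "$@"
}

# Identity and status settings stored only in this sandbox repository.
gdot config --local user.name "Test User"
gdot config --local user.email "test@example.com"
gdot config --local status.showUntrackedFiles no

echo "export EDITOR=nano" > .bashrc   # a file we want to track
echo "scratch" > notes.txt            # an unrelated file we never add

gdot add .bashrc
gdot commit -q -m "Track .bashrc"

tracked=$(gdot ls-files)
status=$(gdot status --short)
echo "tracked files: $tracked"
echo "status output: '$status'"
```

Only the files you explicitly add are tracked, and the untracked notes.txt does not clutter the status listing.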

1.1. Setting up git#

You can configure your git preferences from the terminal with the general syntax

git config --global <setting> <option> 

For example, you can configure your name and email:

git config --global user.name "Facu Sapienza"
git config --global user.email "fsapienza@berkeley.edu"

You should also configure your default editor (remember we visited some popular editors in Lab01! Have those commands handy because you will need to use them). For example, if you want to use nano as your default editor

git config --global core.editor "nano"

Another way of doing this is by modifying the file .gitconfig in your home directory. If you already have the dotfiles installed on your machine, you will be able to see the following lines inside .gitconfig:

[core]
	excludesfile = ~/.gitignore
	editor = nano

which indicates the default editor being used by git.
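
The link between the git config command and the contents of the configuration file can be checked with a small experiment. This is only a sketch: it uses the --file option to write to a scratch file (a made-up path in a temporary directory) in place of --global, so that your real settings are left alone, and the name and email are placeholders.

```shell
# Scratch configuration file, used instead of ~/.gitconfig.
cfg="$(mktemp -d)/gitconfig"

# Same syntax as with --global, but writing to the scratch file.
git config --file "$cfg" user.name "Ada Example"
git config --file "$cfg" user.email "ada@example.com"
git config --file "$cfg" core.editor "nano"

# The file now contains [user] and [core] sections in plain text.
cat "$cfg"

# Reading a single setting back.
editor=$(git config --file "$cfg" --get core.editor)
echo "editor is: $editor"
```

Editing the file by hand and running git config are two routes to the same result.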

2. Branching#

Let’s start with a basic warm-up. We have seen this in the lectures, but now it is your turn to do it. We are going to emulate two programmers working on different branches of the same repository and then merging them. For this exercise, we recommend that you create a test repository, just as we did in previous labs and lectures. You can do this by creating a new folder in the JupyterHub and then running git init inside it. Add some test files to that repository and commit those changes.

2.1. Merging two branches#

We start on the main branch of our test repository. Now we want to try new experiments in a new branch. If everything goes well with our experiment, we may wish to merge it into main.

  1. Create a new branch, for example by using git switch -c <branchname>. You can also create a new branch with git branch <branchname> and move to the new branch with git checkout <branchname>.

  2. Make new changes in this new branch. Add new text files. Remember to commit these changes.

  3. Switch to main again and make changes in a different file.

  4. Merge both branches. Merge the experimental branch into main using git merge.

  5. Delete the old branch (See more info here):

    1. From your local repository: git branch -d <branchname>

    2. From your remote repository: git push <remote name> -d <branch name>
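
The steps above can be sketched end to end in a throwaway repository. The branch name experiment, the file names and the test identity are placeholders chosen for illustration; the remote deletion step is left out because the sketch has no remote.

```shell
# Throwaway repository with one initial commit on main.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch main
git config user.name "Test User"
git config user.email "test@example.com"
echo "hello" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Steps 1 and 2: new branch with a new file.
git switch -q -c experiment
echo "an idea" > experiment.txt
git add experiment.txt
git commit -q -m "Add experiment file"

# Step 3: back on main, change a different file.
git switch -q main
echo "more notes" >> notes.txt
git commit -q -am "Extend notes"

# Step 4: merge the experimental branch into main.
git merge -q --no-edit experiment

# Step 5: the branch is merged, so it can be deleted safely.
git branch -q -d experiment

git log --oneline --graph
```

Because the two branches touched different files, git can combine them without any manual intervention.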

2.2. Solving conflicts in the same file#

While git is very good at merging, if two different branches modify the same file in the same location, git simply can’t decide which change should prevail. At that point, human intervention is necessary to make the decision. Git will help you by marking the location in the file that has a problem, but it’s up to you to resolve the conflict. Let’s see how that works by intentionally creating a conflict.

  1. Create a new branch and make some changes to a specific file, let’s say text.txt. Remember to commit these changes!

  2. Go back to the main branch and make further modifications to the same file text.txt, in the same position where you made changes in the other branch. Commit the changes again.

  3. If you now try to git merge <newbranch> (as we did in the previous section), you will see the following error message

Auto-merging text.txt
CONFLICT (content): Merge conflict in text.txt
Automatic merge failed; fix conflicts and then commit the result.

  4. If you now look at the contents of text.txt (for example with cat text.txt, or by opening the file in any text editor), you will see that both changes appear in the same file, separated by conflict markers. To solve the conflict, you need to edit the file manually and decide which changes to keep. Use any text editor (micro, emacs, etc.) to edit the file, keep the changes you want, and remove the markers.

  5. Stage the resolved file with git add and commit to conclude the merge.
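
As a rough illustration of the whole cycle, the following sketch provokes a conflict in a throwaway repository and then resolves it. The file contents and the test identity are invented, and the resolution (keeping both lines) is just one possible choice; in practice you would make that decision in your editor.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Test User"
git config user.email "test@example.com"
echo "original line" > text.txt
git add text.txt
git commit -q -m "Initial commit"

# Change the line on a new branch...
git switch -q -c newbranch
echo "change from newbranch" > text.txt
git commit -q -am "Edit text.txt on newbranch"

# ...and change the same line differently on main.
git switch -q main
echo "change from main" > text.txt
git commit -q -am "Edit text.txt on main"

# The merge stops with a conflict; '|| true' lets the script continue.
git merge --no-edit newbranch || true

# The file now contains both versions between conflict markers.
cat text.txt

# Resolve by writing the content we decide to keep, then conclude the merge.
printf 'change from main\nchange from newbranch\n' > text.txt
git add text.txt
git commit -q --no-edit
```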

3. GitHub#

3.1. Authentication#

For this course, we are going to use an authentication tool designed to keep our GitHub credentials safe as we work in the cloud. To authenticate, go to the shared folder and make a copy of GHAUTH.ipynb in your home directory. Open the notebook and execute the cell containing the following commands

import gh_scoped_creds
%ghscopedcreds

This will produce a message with a link and the corresponding password. The first time you do this, you also need to go to the configuration page and give the app permission to access your repositories. After that, you can authenticate with GitHub just by running the code inside GHAUTH.ipynb. Note that the authentication lasts for 8 hours.

3.2. GitHub check-in#

Be sure you know how to create and manipulate repositories in GitHub.

  1. Create a new repository in GitHub inside your personal account. For this, you can decide to create an empty repository or fill it with some basic content (for example, a README.md file). For now, let’s create an empty repository.

  2. Now, synchronize this repository with a new local repository or one that you already have (e.g., the test repository from previous exercises). GitHub shows the instructions for doing this on the page of the newly created repository.

  3. Push the local changes of your repository to GitHub using git push. Remember to authenticate first.

  4. In GitHub (the website, not your Hub session), edit and commit one of the text files.

  5. Pull these changes to your local repository using git pull.

  6. Create a new branch in your local repository and push it to GitHub. What happens when you do this? If you try to git push from a branch that you just created, you will receive the error message fatal: The current branch <branch name> has no upstream branch. This is because the branch exists only in your local repository and not yet in the remote. Instead, the first time you push a new branch you have to do git push -u origin <branch name> (-u is just a shortcut for --set-upstream).
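
The upstream mechanism can be tried out without GitHub at all. In this sketch a local bare repository stands in for the GitHub remote; the paths, branch name feature and test identity are all assumptions for illustration.

```shell
sandbox=$(mktemp -d)

# A local bare repository plays the role of the GitHub repository.
git init -q --bare "$sandbox/remote.git"

# Local repository connected to that remote.
git init -q "$sandbox/local"
cd "$sandbox/local"
git symbolic-ref HEAD refs/heads/main
git config user.name "Test User"
git config user.email "test@example.com"
git remote add origin "$sandbox/remote.git"

echo "# Test repository" > README.md
git add README.md
git commit -q -m "Initial commit"
git push -q -u origin main

# New local branch: with default settings a plain 'git push' would fail
# here, because the branch has no upstream yet.
git switch -q -c feature
echo "new work" > feature.txt
git add feature.txt
git commit -q -m "Start feature"

# The first push creates the remote branch and records it as upstream.
git push -q -u origin feature

upstream=$(git rev-parse --abbrev-ref 'feature@{upstream}')
echo "feature now tracks: $upstream"
```

After this first push, a plain git push or git pull on the branch knows where to go.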

3.3. Collaborating on GitHub with a small team#

We are going to set up a shared collaboration with one partner (the person sitting next to you). This will show the basic workflow of collaborating on a project with a small team where everyone has write privileges to the same repository.

We will have two people, let’s call them Alice and Bob, sharing a repository. Alice will be the owner of the repository and she will give Bob write privileges.

We begin with a simple synchronization example, much like we just did above, but now between two people instead of one person. Otherwise it’s the same:

  • Alice creates a new repository in GitHub with some basic text files in it (this could be the same one that you used for the previous exercise).

  • Bob clones Alice’s repository.

  • Bob makes changes to a file and commits them locally.

  • Bob pushes his changes to GitHub.

  • Alice pulls Bob’s changes into her own repository.

Next, we will have both parties make non-conflicting changes each, and commit them locally. Then both try to push their changes:

  • Alice adds a new file, alice.txt to the repo and commits.

  • Alice adds a tag to this stage of the repository. What is a tag, and how do you make one?

  • Bob adds bob.txt and commits.

  • Alice pushes to GitHub.

  • Bob tries to push to GitHub. What happens here?

The problem is that GitHub already contains Alice’s commit, which Bob’s local history does not include, so git rejects Bob’s push. It forces Bob to first do the merge on his machine, so that if there is a conflict in the merge, Bob deals with it manually (git could try to do the merge on the server, but then a conflict would leave the server repository in a conflicted state with no human to fix things up). The solution is for Bob to first pull Alice’s changes and then push again.
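
The rejected push and its fix can be reproduced on a single machine. The sketch below is an approximation of the exercise: a local bare repository stands in for GitHub, two clones stand in for Alice and Bob, and the tag name v0.1 and all paths are made up. The pull uses --no-rebase to ask explicitly for a merge.

```shell
sandbox=$(mktemp -d)

# Shared repository standing in for GitHub.
git init -q --bare "$sandbox/shared.git"
git --git-dir="$sandbox/shared.git" symbolic-ref HEAD refs/heads/main

# Alice creates the project and pushes it.
git init -q "$sandbox/alice"
cd "$sandbox/alice"
git symbolic-ref HEAD refs/heads/main
git config user.name "Alice"
git config user.email "alice@example.com"
git remote add origin "$sandbox/shared.git"
echo "shared notes" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git push -q -u origin main

# Bob clones the shared repository.
git clone -q "$sandbox/shared.git" "$sandbox/bob"
git -C "$sandbox/bob" config user.name "Bob"
git -C "$sandbox/bob" config user.email "bob@example.com"

# Alice adds a file, tags this stage, and pushes.
echo "from alice" > alice.txt
git add alice.txt
git commit -q -m "Add alice.txt"
git tag -a v0.1 -m "First file from Alice"
git push -q origin main v0.1

# Bob adds his own file and tries to push: the remote has moved on.
cd "$sandbox/bob"
echo "from bob" > bob.txt
git add bob.txt
git commit -q -m "Add bob.txt"
if git push -q origin main 2>/dev/null; then
    push_result="accepted"
else
    push_result="rejected"
fi
echo "Bob's first push was $push_result"

# Bob merges Alice's work locally, then pushes again.
git pull -q --no-rebase --no-edit origin main
git push -q origin main
```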

4. Reconstructing past versions#

Sometimes we make accidental changes to some of the files in a repository, or maybe we just want to go back to a previous version. In either case, it is easy to restore or recover old versions of files that have been tracked in a commit.

For these next examples, we are going to use the git checkout command to restore past versions of a file. This can lead to some confusion, since this is the same command we use for changing branches. In a sense, git checkout does the work of both git switch and git restore. You can do all the following exercises with git restore instead of git checkout.

4.1. Restoring old versions#

For this example, we are going to make modifications to one of the files in our repository and then recover some of the older versions.

  1. Make more than one change to the same file in your repository, committing after each change. For example, you can write some new text inside text.txt. With echo "..." >> text.txt you will append new lines at the end of text.txt (with > you will overwrite all the contents).

  2. Try to restore a previous version of that file by using

git checkout <commit> <filename>

or

git restore --source=<commit> <filename>

You will need to specify the commit from which you want to restore the file. You can find it by looking at the log of the repository (git log, git slog, git log --all). This is why commit messages are so important!

Observation: you can also see old versions of your files directly on GitHub, in case you need to inspect previous versions of files.
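
A small end-to-end sketch of restoring an older version, run in a throwaway repository with invented contents and a test identity. It uses git restore, which changes only the file in the working directory; git checkout <commit> <filename> would additionally stage the restored content.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Test User"
git config user.email "test@example.com"

# First committed version of the file.
echo "first version" > text.txt
git add text.txt
git commit -q -m "Add first version of text.txt"
first=$(git rev-parse HEAD)   # remember this commit's hash

# Second committed version: one more line appended.
echo "second line" >> text.txt
git commit -q -am "Append a second line"

# Bring back text.txt as it was in the first commit.
git restore --source="$first" text.txt
restored=$(cat text.txt)
echo "restored contents: $restored"

# The older content shows up as an uncommitted modification.
git status --short
```

Nothing is lost by doing this: the second version is still in the history and can be restored the same way.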

4.2. Recovering deleted files#

Now, let’s practice deleting and recovering a specific file

  1. Remove one of the files in your test repository. You can also just create a new file, commit it, and then remove it. To do so, use git rm <file name>. You can take a look at this link to see some flags you can add to this command.

  2. If you haven’t committed the removal yet, you can recover the file by restoring it from the last snapshot of the repository with git checkout HEAD -- <file name>.

  3. If you have already committed the removal, you need to do a little more work. Use git slog -- <filename> to see the history of the file you removed, then use git checkout <commit> -- <filename> to recover it, where <commit> is a commit in which the file still existed (for example, the one just before the removal).
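
Both recovery cases can be sketched in a throwaway repository. The file names and contents are invented, and plain git log is used in place of the slog alias so that the example does not depend on any particular configuration.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Test User"
git config user.email "test@example.com"
echo "keep me" > keep.txt
echo "precious data" > data.txt
git add keep.txt data.txt
git commit -q -m "Add two files"

# Case 1: removal not committed yet. Restore from the last snapshot.
git rm -q data.txt
git checkout HEAD -- data.txt
after_undo=$(cat data.txt)

# Case 2: removal already committed.
git rm -q data.txt
git commit -q -m "Remove data.txt"

# The most recent commit touching the file is the one that deleted it...
deletion=$(git log -n 1 --format=%H -- data.txt)

# ...so the file is recovered from that commit's parent.
git checkout "$deletion~1" -- data.txt
recovered=$(cat data.txt)
echo "recovered contents: $recovered"
```

The recovered file is staged but not committed, so a final commit is needed to make it part of the history again.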

5. Other useful commands#

If you already finished all the previous tasks, you are welcome to explore some more useful git commands!

Can you think of a situation where these commands may be useful? For example, instead of using git pull, can you do the same with git fetch and then git merge? If so, what would be the advantage of doing it that way?
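
As a starting point for experimenting with the fetch-then-merge question, here is one possible sketch. A local bare repository stands in for GitHub, and all names and paths are placeholders. Fetching first lets you look at the incoming commits before deciding to merge them.

```shell
sandbox=$(mktemp -d)

# Stand-in for the GitHub repository.
git init -q --bare "$sandbox/remote.git"
git --git-dir="$sandbox/remote.git" symbolic-ref HEAD refs/heads/main

# Alice publishes a first version.
git init -q "$sandbox/alice"
cd "$sandbox/alice"
git symbolic-ref HEAD refs/heads/main
git config user.name "Alice"
git config user.email "alice@example.com"
git remote add origin "$sandbox/remote.git"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git push -q -u origin main

# Bob clones, then Alice publishes an update.
git clone -q "$sandbox/remote.git" "$sandbox/bob"
echo "v2" >> notes.txt
git commit -q -am "Update notes"
git push -q origin main

# Bob downloads the new commits without changing his files...
cd "$sandbox/bob"
git fetch -q origin

# ...inspects what is about to come in...
incoming=$(git log --oneline HEAD..origin/main)
echo "incoming commits: $incoming"

# ...and only then merges (fast-forward only, to avoid surprises).
git merge -q --ff-only origin/main
```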