Gentle Introduction to GitHub¶

Ahmed Moustafa

Quick Review¶

https://www.menti.com/alnpj76xuxwu

https://www.mentimeter.com/app/presentation/alx5mivoqmqxen4s2jh158wwjovgfhrh

Before & After¶

Source: Towards Data Science: Getting Started with Git and GitHub

What is Version Control?¶

Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later.

  • Tracks changes made to files over time
  • Enables collaboration and sharing of files
  • Maintains a history of changes made to files
  • Makes it possible to revert a file to a previous version if necessary

Why is Version Control Important in Data Science?¶

  • Manages the history and evolution of data science projects
  • Facilitates collaboration and sharing among data scientists
  • Helps maintain and organize different versions of code and data
  • Enables tracking and reproducibility of data science projects

Git & GitHub¶

  • Git is a distributed version control system that is used to manage and track changes made to files.
  • GitHub is a web-based platform that provides hosting for Git repositories and facilitates collaboration among data scientists.

How to Obtain Git?¶

https://git-scm.com/
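
After installing, a quick way to confirm that Git is available from your terminal is to ask for its version. This is a minimal check, assuming Git was installed with its default options and added to your PATH:

```shell
# Print the installed Git version; an error here usually means Git is not on your PATH.
git --version
```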

How Does Git Work?¶

Source: NeSI's Git: Reference Sheet

How Does Git Work? Local¶

Command   Description
clone     Copies a remote repository into a new subdirectory of the current directory
init      Creates a new, empty repository in the current directory
add       Adds file changes to the staging area
status    Lists changes in the working directory and the staging area
commit    Records everything in the staging area as a new snapshot in the repository
reset     Unstages all staged changes while keeping the edits in the working directory (the opposite of add)
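
The local commands can be tried safely in a throwaway directory. The sketch below is one possible walk-through, not the only way to use these commands; the directory name `demo`, the file `hello.py`, and the user name and email are placeholder values chosen for illustration.

```shell
set -eu
# Work in a temporary directory so no existing project is touched.
workdir="$(mktemp -d)"
cd "$workdir"

git init -q demo                  # init: create a new, empty repository in ./demo
cd demo
# Identity for this repository only (placeholder values).
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "print('hello')" > hello.py
git status --short                # status: hello.py is listed as untracked
git add hello.py                  # add: put the file in the staging area
git status --short                # status: hello.py is now staged
git commit -q -m "Add hello.py"   # commit: record the staged snapshot

echo "print('hello again')" >> hello.py
git add hello.py                  # stage the new edit
git reset -q                      # reset: unstage it; the edit stays in the file
git status --short                # status: hello.py is modified but not staged
```

Running `git status` after each step is a cheap way to see which of the three areas (working directory, staging area, repository) a change currently lives in.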

How Does Git Work? Remote¶

Command   Description
fetch     Downloads new commits from origin without changing your working files
pull      Fetches changes from origin and integrates them into the current local branch
push      Uploads local commits to origin
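
The remote commands can be explored without a network connection by letting a bare repository on disk play the role of origin. This is a sketch under that assumption; the names `origin.git`, `alice`, `bob`, and `notes.txt` are hypothetical, and on GitHub the origin would be a repository URL instead of a local path.

```shell
set -eu
sandbox="$(mktemp -d)"
cd "$sandbox"

# A bare repository on disk stands in for the remote "origin".
git init -q --bare origin.git

# Alice clones origin, commits, and pushes.
git clone -q origin.git alice
cd alice
git config user.name "Alice Example"
git config user.email "alice@example.com"
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q origin HEAD           # push: upload Alice's commit to origin
cd ..

# Bob clones origin and receives Alice's first commit.
git clone -q origin.git bob

# Alice publishes a second commit.
cd alice
echo "second line" >> notes.txt
git commit -q -am "Extend notes"
git push -q origin HEAD
cd ../bob

git fetch -q origin               # fetch: download the new commit, files unchanged
git status -sb                    # reports the local branch as behind origin
git pull -q --ff-only             # pull: fetch and fast-forward the local branch
git log --oneline                 # both of Alice's commits are now in Bob's history
```

The `--ff-only` flag is a cautious choice for this example: the pull succeeds only when the local branch can simply move forward, so no merge commit is created by surprise.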

Contributing to a Repository¶

  • Fork a Repository: To contribute to a repository you cannot write to directly, first fork it to create a copy of the codebase under your own GitHub account.
  • Make Changes: Clone your fork, make the desired changes to the code, and commit them to your local repository.
  • Push Changes: Push the changes to your forked repository on GitHub.
  • Create a Pull Request (PR): Open a pull request asking the maintainers to merge your changes into the original repository.
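
The four steps above can be rehearsed locally before trying them on GitHub. In this sketch, local directories stand in for the GitHub repositories: `upstream` plays the original project and `fork.git` plays your fork. These names, the branch name `fix-readme`, and the identities are all hypothetical; on GitHub, forking and opening the pull request happen in the web interface.

```shell
set -eu
lab="$(mktemp -d)"
cd "$lab"

# Stand-in for the original project (on GitHub this already exists).
git init -q upstream
cd upstream
git config user.name "Maintainer"
git config user.email "maintainer@example.com"
echo "# Project" > README.md
git add README.md
git commit -q -m "Initial commit"
cd ..

# 1. Fork: on GitHub this is the Fork button; here a bare copy plays that role.
git clone -q --bare upstream fork.git

# 2. Make changes: clone the fork, work on a branch, and commit locally.
git clone -q fork.git work
cd work
git config user.name "Contributor"
git config user.email "contributor@example.com"
git checkout -q -b fix-readme
echo "Contributions welcome." >> README.md
git commit -q -am "Mention contributions in README"

# 3. Push changes: send the branch to the fork (origin), not to the original.
git push -q origin fix-readme

# 4. Pull request: opened on GitHub, comparing the fork's fix-readme branch
#    with the original repository's default branch.
```

Working on a separate branch is a common convention rather than a requirement; it keeps the fork's default branch identical to the original, which makes later updates easier.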

Contributing to a Repository¶

Source: edav.info: Chapter 6 GitHub/git Resources

GitHub Example¶

  • Go to repository https://github.com/ahmedmoustafa/hello-world
  • Fork the repo under your account
  • Edit the notebook hello-world.ipynb using, for example, Colab or GitHub Codespaces
  • Add a cell, write a print statement to print your full name
  • Commit and push your changes
  • Submit a pull request (PR)

GitHub Exercise¶

  • Go to repository https://github.com/ahmedmoustafa/bug-in-the-code
  • Fork the repo under your account
  • Edit the notebook bug-in-the-code.ipynb using, for example, Colab or GitHub Codespaces
  • There is a syntax error.
  • Fix it.
  • The first person to submit a pull request with the fix will receive a bonus

So, as a general coding rule of thumb¶