Workshops

Git and GitHub for Public Health

(Workshop development in progress, in collaboration with Dr. Corinne Riddell, UC Berkeley)

This workshop will be taught at the following locations:

  • California Department of Public Health (May 2024)
  • Society for Epidemiologic Research (SER) conference (June 2024)

Open source materials for this workshop can be found here.


Workshop Description:

Version control, the practice of tracking and managing changes to statistical code, is essential for reducing errors in a statistical analysis. However, many epidemiologists have not been trained in it and are unsure how it fits with institutional review board (IRB) protocols and privacy standards. In this workshop, we will provide an introduction to Git and GitHub, aiming to equip epidemiologists with version control tools that also meet ethical standards.

We will start with an overview of key concepts in version control and the installation and setup of Git. Attendees will then create a GitHub repository and implement a version control workflow for projects that they work on alone. We will cover concepts such as branching, pulling, committing, and pushing. We will provide sample data and code for attendees to update, committing the changes locally and pushing them to the main GitHub repository.
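As a rough illustration of the solo workflow described above, here is a minimal sketch using the Git command line. It is not the workshop's own material: the file name `analysis.R`, the branch name `update-threshold`, and the identity values are hypothetical, and a local bare repository stands in for the GitHub remote so that the commands run without a network connection or an account.

```shell
# Work in a throwaway directory so no existing project is touched.
workdir="$(mktemp -d)"
cd "$workdir"

# Assumption: a local bare repository stands in for the GitHub remote.
git init --quiet --bare remote.git

# Create the working repository with an identity scoped to it alone.
git init --quiet analysis
cd analysis
git config user.name "Workshop Attendee"
git config user.email "attendee@example.org"

# Commit a hypothetical analysis script locally.
echo 'threshold <- 10' > analysis.R
git add analysis.R
git commit --quiet -m "Add initial analysis script"

# Name the branch main, connect the stand-in remote, and push.
git branch -M main
git remote add origin "$workdir/remote.git"
git push --quiet -u origin main

# Make a change on a separate branch, then merge it back into main.
git checkout --quiet -b update-threshold
echo 'threshold <- 15' > analysis.R
git commit --quiet -am "Raise threshold to 15"
git checkout --quiet main
git merge --quiet update-threshold

# Pull first (a no-op here, since nobody else has pushed), then push.
git pull --quiet origin main
git push --quiet origin main

# Show the resulting history.
git log --oneline
```

With a real GitHub repository, the only difference in this sketch would be the URL passed to `git remote add origin`.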

We will then expand the workflow to team projects. Attendees will be put into groups and practice working in a shared repository. They will become familiar with the challenges that arise when several people work on the same files simultaneously, such as merge conflicts, and learn how to resolve them.
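To make the idea of a merge conflict concrete, the sketch below simulates two teammates editing the same line of the same file. Everything named here is hypothetical (the teammates Alice and Bob, the file `params.R`, the threshold values), and a local bare repository again stands in for the shared GitHub repository. It shows one possible way a conflict can arise and be resolved, not the exercise used in the workshop.

```shell
workdir="$(mktemp -d)"
cd "$workdir"

# Assumption: a local bare repository stands in for the shared GitHub repository.
git init --quiet --bare shared.git
git --git-dir=shared.git symbolic-ref HEAD refs/heads/main

# Alice (hypothetical teammate) starts the project and pushes the first commit.
git init --quiet alice
cd alice
git config user.name "Alice"
git config user.email "alice@example.org"
echo 'threshold <- 10' > params.R
git add params.R
git commit --quiet -m "Add parameters"
git branch -M main
git remote add origin "$workdir/shared.git"
git push --quiet -u origin main
cd ..

# Bob (hypothetical teammate) clones, edits the same line, and pushes first.
git clone --quiet shared.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.org"
echo 'threshold <- 20' > params.R
git commit --quiet -am "Bob: set threshold to 20"
git push --quiet origin main
cd ../alice

# Alice edits the same line without having seen Bob's change.
echo 'threshold <- 15' > params.R
git commit --quiet -am "Alice: set threshold to 15"

# Pulling now merges Bob's commit and stops with a conflict in params.R.
git pull --quiet --no-rebase origin main || echo "Merge conflict: resolve params.R"
cat params.R   # both versions appear between conflict markers

# Resolve by writing the agreed value, then complete the merge and push.
echo 'threshold <- 20' > params.R
git add params.R
git commit --quiet -m "Merge Bob's change; keep threshold at 20"
git push --quiet origin main
```

In practice the resolution step is done by editing the file by hand (or in a Git client) and deleting the conflict markers; overwriting the file with `echo` is only a shortcut to keep the sketch non-interactive.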

Attendees will be introduced to using Git via the command line, a Git client, and the RStudio Git pane. We will cover general tips on using Git and best practices for storing data inside versus outside a repository, and give an overview of how GitHub workflows can comply with IRB requirements.
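One common way to keep data outside version control while the code stays inside is a `.gitignore` rule. The sketch below assumes a hypothetical layout in which data files live in a `data/` folder; the file names are placeholders and the CSV contains only a header row. It illustrates the mechanism only: an ignore rule helps prevent accidental commits, but it does not by itself satisfy IRB or privacy requirements, and the practices recommended in the workshop may differ.

```shell
workdir="$(mktemp -d)"
cd "$workdir"
git init --quiet project
cd project
git config user.name "Workshop Attendee"
git config user.email "attendee@example.org"

# Hypothetical layout: code is tracked, data sits in an ignored data/ folder.
mkdir data
echo 'id,value' > data/records.csv      # placeholder header only, no real data
echo 'records <- read.csv("data/records.csv")' > analysis.R

# Tell Git to ignore everything under data/.
echo 'data/' > .gitignore

# Confirm the data file is ignored before staging anything.
git check-ignore -v data/records.csv

git add .
git commit --quiet -m "Add analysis script and ignore rules"

# List tracked files: the script and the ignore file, but nothing under data/.
git ls-files
```

Note that `.gitignore` only affects files that have never been committed; a data file that is already in the history has to be removed from it separately.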