Alice Hannah
Data Innovation Team, Data Division
20 August 2024
Version control is the practice of tracking and managing changes to files.
Does this look familiar?
├── stats-publication
│ ├── publication-analysis-code.R
│ ├── publication-analysis-code-v2.R
│ ├── publication-analysis-code-v2 NEW METHODOLOGY.R
│ ├── publication-analysis-code-v2 NEW METHODOLOGY AB-changes.R
│ ├── publication-analysis-code-final.R
│ ├── publication-analysis-code-final April 2023.R
Git is free and open source software for version control.
It allows you to:
Record any changes you make to files (these records are called ‘commits’)
Undo changes and revert to previous versions of files (if required)
Collaborate with others on the same project using ‘branches’
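As a rough illustration of the three points above, the commands below record two commits, undo the second, and create a branch. This is a minimal sketch, not a recommended workflow: the throwaway directory, the file name analysis.R, the name and email, and the branch name new-methodology are all placeholders chosen for the example.

```shell
# Work in a throwaway directory so no real project is touched.
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Placeholder identity, set for this repository only.
git config user.name "Example Analyst"
git config user.email "analyst@example.com"

# Record a change: stage the file, then commit it with a message.
echo 'total <- sum(values)' > analysis.R
git add analysis.R
git commit -q -m "Add first version of analysis"

# Record a second change to the same file.
echo 'total <- sum(values, na.rm = TRUE)' > analysis.R
git commit -q -am "Ignore missing values when summing"

# Undo: 'git revert' adds a new commit that reverses the last one,
# so the record of what happened is kept.
git revert --no-edit HEAD > /dev/null

# Collaborate: create a branch to work on without affecting the main line.
git checkout -q -b new-methodology

# List the three commits recorded so far.
git log --oneline
```

After the revert, analysis.R is back to its first version, but all three commits remain in the history.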
Commits contain information on:
Git tracks changes to the content of files, not just the file as a whole. This means the information above can be recorded for changes as small as one character on one line of code.
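To see what this looks like in practice, the sketch below changes a single character in a file and then asks Git what it recorded. The repository, file name, author details and commit messages are made up for the example.

```shell
# Throwaway repository with a placeholder identity.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Analyst"
git config user.email "analyst@example.com"

echo 'threshold <- 5' > analysis.R
git add analysis.R
git commit -q -m "Add threshold"

# Change a single character on one line and commit it.
echo 'threshold <- 6' > analysis.R
git commit -q -am "Raise threshold from 5 to 6"

# Short commit ID, author, date and message of the latest commit.
git --no-pager log -1 --format='%h | %an | %ad | %s'

# What changed: the old line is prefixed with '-', the new line with '+'.
git --no-pager diff HEAD~1 HEAD -- analysis.R
```

By default Git reports the change at line level, so the diff shows the whole line before and after, even though only one character differs.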
A version controlled code repository will usually contain files for one project and can contain:
A .gitignore file tells Git which files in the code repository it shouldn’t track changes to.
Can include specific files, folders or file extensions
Use the Scottish Government template .gitignore to get started.
More information on using Git safely can be found in the Duck Book.
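A small example of how a .gitignore behaves is sketched below. The patterns and file names are hypothetical and chosen only to show one specific file, one folder and one file extension; they are not the contents of the Scottish Government template.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Illustrative patterns only (not the Scottish Government template).
cat > .gitignore <<'EOF'
# A specific file
.Renviron
# A folder
data/
# A file extension
*.csv
EOF

# Create some files: two that match the patterns and one that does not.
mkdir data
echo 'secret' > data/raw.txt
echo 'a,b' > results.csv
echo 'print("hello")' > analysis.R

# 'git check-ignore -v' reports which pattern causes each path to be ignored.
git check-ignore -v data/raw.txt results.csv

# Ignored files do not appear here; only .gitignore and analysis.R are listed.
git status --short
```

Note that a .gitignore only affects files Git is not already tracking. A file that was committed before its pattern was added stays tracked until it is removed from the repository.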
GitHub is a web-based platform for hosting version controlled code and can be used to:
Make code publicly available (although repositories can also be private)
Facilitate code review (using ‘pull requests’)
Manage projects using tools such as issue tracking
Navigate Git history and view previous versions of files
View other people’s code and collaborate
Scottish Government Analysis GitHub organisation
Git can be used without GitHub
GitHub is often used as the main copy of a code repository (or ‘remote’). Analysts or developers can take a copy (or ‘clone’) of the repository from GitHub to work on locally.
Use Git locally to track changes and regularly ‘push’ to GitHub
Use GitHub to facilitate code review and merging of branches
Preferable to lots of copies of the same file with various names!
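The clone, commit and push cycle described above can be tried without a GitHub account. In the sketch below a local bare repository stands in for the copy on GitHub; that substitution, along with the directory names, file name and identity, is an assumption made so the example runs offline. With a real project the clone would use the repository's GitHub URL instead of a local path.

```shell
work="$(mktemp -d)"
cd "$work"

# A bare repository stands in for the main copy hosted on GitHub (the 'remote').
git init -q --bare remote.git

# Take a local copy ('clone') to work on.
git clone -q remote.git local-copy
cd local-copy
# Placeholder identity, set for this repository only.
git config user.name "Example Analyst"
git config user.email "analyst@example.com"

# Track changes locally with commits...
echo 'x <- 1' > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"

# ...then 'push' the current branch to the remote so others can see it.
git push -q origin HEAD

# The remote now holds the same commit as the local copy.
git --git-dir="$work/remote.git" log --oneline --all
```

Git may warn that the cloned repository is empty; that is expected here because the stand-in remote has no commits until the first push.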
Reproducible Analytical Pipelines (RAP)
Reproducible: You can rerun your code as it was at any point in time.
Auditable: You have a record of when changes were made and why.
Transparent: Code can be made publicly available on GitHub for others to review or reuse.
Good quality: Code review is built into the GitHub workflow.
Familiarise yourself with the Version Control guidance page on SharePoint
Explore the public code repositories in the Scottish Government Analysis GitHub organisation to see how other teams are using Git and GitHub
Read the Scottish Government GitHub User Responsibilities document
Create a free GitHub account (if you don’t already have one)
ONS Learning Hub (contact ONS to request an account)
Duck Book
Open sourcing analytical code (Government Analysis Function)
Alice Hannah
RAP Developer
Data Innovation Team, Scottish Government
Email: alice.hannah@gov.scot
GitHub: alice-hannah