Reproducible Analytical Pipelines

Fit for the Future: Leading a modern statistical service

Alice Byers

Data Innovation Team, Data Division

25 April 2024

About me

  • Reproducible Analytical Pipeline (RAP) developer

  • Background in statistics and data analysis

Why should I care about RAP?


  • Less time spent producing routine analysis; more time for other work

  • Better quality analysis

What is RAP?

  • An overall approach to carrying out analysis

  • A set of principles that ensure analysis is:

    • reproducible
    • auditable
    • efficient
    • high quality

RAP principles (1)

In order to achieve the full benefits, at a minimum a RAP must:

  • Minimise manual steps

  • Be built using open-source software; e.g. R, Python

  • Be peer reviewed by colleagues

RAP principles (2)

  • Be version controlled; e.g. Git

  • Be open to anyone; e.g. code published on GitHub

  • Follow good practice for quality assurance

  • Contain well-commented code and have documentation embedded

Examples of projects

  • Routine ‘traditional’ publications

  • Shiny dashboards

  • Ministerial briefings

  • FOIs

  • One-off analysis

Stages of analysis

  • Data extraction

  • Data cleaning

  • Data analysis

  • Modelling

  • Data visualisation

  • Reporting

Where to start

  • Following all 7 principles will help you to achieve the full benefits
  • Following one or some of the principles will still bring benefits!

  • Start small, review and iterate

RAP Support

  • Long-term support (~ 4-6 months) with regular meetings to review progress

  • One-off call to answer questions and provide advice

  • The aim is to help you to develop your RAP skills, not to write code for you.

  • Get in touch to discuss your project