Exercise TT-E2 — Docs & Continuous Integration

Continuous Integration and ROOT Data Analysis with ATLAS Open Data


Objective

Students will learn how to:

  • Connect analysis code, input data, and automated workflows using GitHub Actions.
  • Modify selection cuts and observe how CI triggers new plots.
  • Understand how CI/CD ensures reproducibility and transparency in physics analyses.

This exercise extends the concepts from the “From My Code to Our Code” lecture:

“If CI passes, your result is production-ready.”


Repository Structure

All relevant files live in:

exercises/TT-E2-docs-and-ci/ci-parallel-root/

Inside, you’ll find:

File / Folder | Purpose
data.csv, mc.csv | Lists of ROOT files (real and simulated). Each line is one file URL.
gamma-gamma-analysis-v1.py | Python analysis script: processes a ROOT file, applies photon cuts, produces histograms.
root-to-png.py | Converts merged ROOT histograms to PNG images.
plots/ | Output folder where the Action saves plots. Contains a .gitkeep file to keep the directory in Git.
.github/workflows/process-atlas-root-open-data-v2.yml | CI workflow that automates the full analysis.

How the Workflow Works

  1. Trigger

    • Manually via Actions > Run workflow (default input: data.csv).
    • Automatically on a schedule (the cron expression "0 */10 * * *" fires at 00:00, 10:00, and 20:00 UTC, so roughly every 10 hours).
    • Automatically on any push changing:
      • data.csv, mc.csv
      • gamma-gamma-analysis-v1.py
      • root-to-png.py
      • or the workflow YAML itself.

  2. Setup job

    • Reads the CSV file line by line and creates a job matrix: one ROOT file per job.
    • Extracts a short name (e.g. data, mc) from the filename.

  3. Parallel analysis

    • Runs in Docker using the rootproject/root:latest image.
    • Executes:

        python3 gamma-gamma-analysis-v1.py <ROOT file URL>

    • Each job produces:
      • plots/histogram_<sample>.png
      • plots/histogram_<sample>.root
    • These are uploaded as CI artifacts.

  4. Merge and publish

    • Collects all individual histograms.
    • Merges them via ROOT’s hadd.
    • Converts the merged .root file to .png.
    • Commits both into the repository under plots/.
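
To make the setup step more concrete, here is a minimal Python sketch of how a CSV listing could be turned into a job matrix with one entry per ROOT file. It is not the workflow's actual code: the function name build_matrix and the key names "url" and "sample" are hypothetical, and it assumes the short name is simply the CSV file's stem (data.csv gives "data").

```python
import json
from pathlib import Path


def build_matrix(csv_name: str, csv_text: str) -> dict:
    """Turn a CSV listing (one ROOT file URL per line) into a job matrix."""
    # Assumption: the short sample name is the CSV file's stem (data.csv -> "data").
    sample = Path(csv_name).stem
    # Skip blank lines so a trailing newline does not create an empty job.
    urls = [line.strip() for line in csv_text.splitlines() if line.strip()]
    # "url" and "sample" are hypothetical key names chosen for this sketch.
    return {"include": [{"url": url, "sample": sample} for url in urls]}


def matrix_as_json(csv_name: str, csv_text: str) -> str:
    """Serialize the matrix to JSON, the form a workflow job output would carry."""
    return json.dumps(build_matrix(csv_name, csv_text))
```

Calling build_matrix("data.csv", listing) on a listing with two URLs yields two matrix entries, which is what lets the analysis jobs run in parallel, one per file.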

What Students Should Do

1. Modify the Analysis

Open gamma-gamma-analysis-v1.py and adjust:

  • Photon pT threshold, η range, or isolation criteria.
  • The histogram binning or range.

Commit and push; the CI will automatically re-run and update the plots.
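
As an illustration of what such a selection might look like, the sketch below groups the three kinds of cut into one small configuration object. It does not reproduce gamma-gamma-analysis-v1.py: the names PhotonCuts, passes_cuts, and count_selected are hypothetical, and the default thresholds are placeholders chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class PhotonCuts:
    # All defaults are illustrative placeholders, not the script's real values.
    pt_min_gev: float = 25.0    # minimum transverse momentum in GeV
    abs_eta_max: float = 2.37   # maximum |eta|
    rel_iso_max: float = 0.1    # maximum relative isolation


def passes_cuts(pt_gev: float, eta: float, rel_iso: float, cuts: PhotonCuts) -> bool:
    """Return True if a photon candidate survives all three selection cuts."""
    return (
        pt_gev > cuts.pt_min_gev
        and abs(eta) < cuts.abs_eta_max
        and rel_iso < cuts.rel_iso_max
    )


def count_selected(photons, cuts: PhotonCuts) -> int:
    """Count (pt_gev, eta, rel_iso) tuples that pass the cuts."""
    return sum(passes_cuts(pt, eta, iso, cuts) for pt, eta, iso in photons)
```

Keeping the thresholds in one place means a single-line change (for example PhotonCuts(pt_min_gev=15.0)) is enough to produce a commit whose effect shows up in the regenerated plots.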

2. Experiment with Data Files

Edit data.csv or mc.csv:

  • Add or remove ROOT file URLs.
  • Swap between real data and MC samples.

Each change, once pushed, triggers a fresh CI run, and new plots appear in plots/.
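
Because every line of the CSV becomes a CI job, a malformed line costs a full run. The following sketch shows one way to check a listing locally before pushing. It is an assumption-laden helper, not part of the repository: the function name find_bad_lines is hypothetical, and it assumes that valid entries use the http, https, or root scheme and end in .root.

```python
from urllib.parse import urlparse


def find_bad_lines(csv_text: str) -> list:
    """Return (line_number, text) pairs for lines that do not look like ROOT file URLs."""
    bad = []
    for number, line in enumerate(csv_text.splitlines(), start=1):
        entry = line.strip()
        if not entry:
            continue  # blank lines are ignored in this sketch
        parsed = urlparse(entry)
        # Assumption: a usable entry has a known scheme, a host, and a .root path.
        looks_ok = (
            parsed.scheme in ("http", "https", "root")
            and bool(parsed.netloc)
            and parsed.path.endswith(".root")
        )
        if not looks_ok:
            bad.append((number, entry))
    return bad
```

An empty return value means every non-blank line passed the check; otherwise the line numbers point at the entries to fix.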

3. Observe CI Behavior

Go to the Actions tab and watch jobs execute in parallel:

  • Setup → process each file → merge results → push new plots.

Open the updated PNGs under exercises/TT-E2-docs-and-ci/ci-parallel-root/plots/ to see how your cuts affect the final distributions.

4. Optional: Tune Triggers

Open .github/workflows/process-atlas-root-open-data-v2.yml and:

  • Adjust the cron expression ("0 */10 * * *") to run more or less often.
  • Add or remove files from the paths: list under push: to control automatic triggers.
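
Before changing the cron expression, it helps to see which hours a step value actually selects. In standard cron syntax, "*/10" in the hour field matches every hour from 0 to 23 that is divisible by 10, so the runs are not evenly spaced across midnight. The short sketch below works this out; the function names are hypothetical helpers for illustration only.

```python
def hours_matching_step(step: int) -> list:
    """Hours (0-23) matched by the cron hour field '*/step'."""
    return [hour for hour in range(24) if hour % step == 0]


def gaps_between_runs(hours: list) -> list:
    """Hours between consecutive runs, wrapping around midnight."""
    following = hours[1:] + hours[:1]
    # A gap of 0 after the modulo means a single daily run, i.e. 24 hours.
    return [(nxt - cur) % 24 or 24 for cur, nxt in zip(hours, following)]


print(hours_matching_step(10))                     # [0, 10, 20]
print(gaps_between_runs(hours_matching_step(10)))  # [10, 10, 4]
```

With a step of 10 the last gap of the day is only 4 hours; a step that divides 24 evenly, such as 6 or 12, gives uniform spacing. GitHub Actions evaluates schedule triggers in UTC.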

Learning Outcomes

By completing this exercise, you will:

  • Understand continuous integration as part of scientific analysis.
  • Learn to automate data pipelines with GitHub Actions.
  • Recognize the value of reproducibility — every change is versioned, tested, and documented.
  • Produce real physics plots using ATLAS Open Data, generated by CI.

Notes

  • If no plots appear, check Actions > Run logs > merge-and-publish.
  • Use the default data first (data.csv), then test with mc.csv.
  • You can re-run any workflow from the web interface to compare results.