Backup and Recovery · PostgreSQL

Recovery Leadership for On-Call Rotations

A four-week program on the human side of recovery — leading an incident, writing a post-mortem worth reading, and building rotation practices that do not exhaust the team.

About this cohort

Technical recovery is only half the work. This cohort focuses on the operator-leader role during and after an incident: how to set tempo, how to delegate the right tasks to the right shoulders, and how to write a post-mortem that engineers will actually open. We use recorded incident audio and transcripts (anonymised, with permission) to dissect decisions in slow motion, and we practice the small rituals — pre-incident readiness checks, intra-rotation handovers, decompression after long nights — that decide whether a rotation is sustainable.

Inclusions

  • Recorded incident dissections with explicit decision points
  • Templates for incident command, scribe, and communicator roles
  • Post-mortem writing workshop with peer feedback
  • Rotation health audit framework you can run on your own team
  • Two live mock-incident sessions in week three and week four

By the end you can

  • Lead an incident bridge with clear roles and a calm tempo
  • Write a post-mortem that engineers outside the room can act on
  • Audit your on-call rotation against sustainable practice

Programme lead

Choi Areum

Learner Success Manager who pairs cohort members with the right incident scenarios. She also coordinates our paired drills and post-program follow-ups.

From the cohort

  • The week-four mock incident was exhausting in the right way. I rewrote my team’s post-mortem template that weekend.

    Areum L.

    Senior operator · 4.8/5

  • Useful, careful pacing; the audit framework gave me language for a conversation I had been avoiding with my manager.

    Joon-ho

    4.6/5 · survey

QueryPilot Academy