Incident, keeper failure
Trigger: vigil.keeper.miss.count > 0, or fewer than 2 of 3 keepers reporting healthy.
Response
- Check which keeper is down.
praetor keepers listshows last action block per keeper. - Investigate root cause:
- VPS health: SSH in, check process logs, disk space, memory.
- RPC: is the RPC URL responsive?
- Wallet: does the keeper wallet have enough native ETH for gas?
- Stake: is the keeper still meeting
keeper_min_stake_wei?
- Bring the keeper back:
- Restart the keeper binary.
- Watch for the next
KeeperStakedor successfulLiquidationExecutedevent.
- If the keeper is unrecoverable:
praetor keepers slash --keeper <addr> --reason "unrecoverable_failure"- Bring up a replacement keeper on a new VPS.
- Update Cohort partners that one keeper rotated.
- If fewer than 2 keepers are healthy at any time:
- Praetor multisig invokes
Plinth.pause("keeper redundancy lost")until at least 2 are back. - Withdrawals still work; new positions are paused.
- Communicate via Lantern dashboard banner and Mirror post.
- Post-mortem within 7 days.