An opinionated list of Python frameworks, libraries, tools, and resources
Overview

I finished the DR Toolkit thinking I had covered the important parts of disaster recovery: runbooks, RTO/RPO targets, post-mortems. Then I mapped out the actual incident lifecycle and realized everything I built sits at the edges. The middle part (detecting the incident, correlating signals across regions, finding the root cause while the primary region is actively failing) was not cove