CACOP
SOLO · ONGOING
Chaos engineering cost analysis for Kubernetes
Overview
Most teams know that downtime is expensive, but few can say exactly how much a single pod crash costs in wasted CPU, memory spikes, and recovery time. CACOP puts a number on it.
The system injects controlled failures into a local Kubernetes cluster using Chaos Mesh: pod kills, network partitions, and CPU stress tests. While the chaos runs, a FastAPI backend collects metrics from Prometheus, correlates resource spikes with the injected failure, and converts the resulting inefficiency into simulated dollar costs using configurable cloud pricing models.
The PLG stack (Prometheus, Loki, Grafana) watches everything in real time, and the dashboard lets you compare costs across different failure scenarios. It turns chaos engineering from "did the system recover?" into "how much did that recovery cost?"
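The conversion from a resource spike to a simulated dollar cost can be sketched roughly as follows. This is a minimal illustration, not CACOP's actual code: `PricingModel`, `excess_core_seconds`, `simulated_cost`, and the $0.06 per core-hour rate are all hypothetical, and the sketch assumes CPU samples have already been fetched from Prometheus at a fixed step. It uses mean usage before the fault as the baseline and prices only the usage above that baseline during the fault window.

```python
from dataclasses import dataclass
from typing import List, Tuple

# One sample: (unix timestamp in seconds, CPU cores in use).
Sample = Tuple[float, float]


@dataclass
class PricingModel:
    """Hypothetical pricing config; the rate is illustrative, not a real cloud price."""
    usd_per_cpu_core_hour: float


def mean(values: List[float]) -> float:
    return sum(values) / len(values) if values else 0.0


def excess_core_seconds(samples: List[Sample], fault_start: float,
                        fault_end: float, step_s: float) -> float:
    """CPU used above the pre-fault baseline during the fault window."""
    baseline = mean([v for t, v in samples if t < fault_start])
    during = [v for t, v in samples if fault_start <= t <= fault_end]
    # Each sample stands for step_s seconds of usage; ignore dips below baseline.
    return sum(max(v - baseline, 0.0) for v in during) * step_s


def simulated_cost(samples: List[Sample], fault_start: float, fault_end: float,
                   step_s: float, pricing: PricingModel) -> float:
    """Convert excess core-seconds into simulated dollars."""
    core_hours = excess_core_seconds(samples, fault_start, fault_end, step_s) / 3600.0
    return core_hours * pricing.usd_per_cpu_core_hour


# Made-up samples at a 60 s step: usage jumps from 1 to 3 cores during the fault.
samples = [(0, 1.0), (60, 1.0), (120, 3.0), (180, 3.0), (240, 1.0)]
pricing = PricingModel(usd_per_cpu_core_hour=0.06)
cost = simulated_cost(samples, fault_start=120, fault_end=180, step_s=60, pricing=pricing)
print(f"simulated cost of the fault: ${cost:.4f}")
```

A real model would likely price memory and recovery time as well. The same baseline-and-excess pattern should extend to those metrics with their own rates.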
Key Features
Controlled chaos injection via Chaos Mesh
Real-time cost simulation with configurable cloud pricing
Prometheus metrics correlation with failure events
PLG stack (Prometheus, Loki, Grafana) observability
Comparative cost analysis across failure scenarios
Local Kubernetes cluster via Minikube - no cloud bill surprises
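
As an illustration of the chaos injection feature listed above, the sketch below builds a Chaos Mesh `PodChaos` manifest in Python and prints it as JSON. The helper `pod_kill_manifest`, the experiment name, the namespace, and the `app: demo` label are assumptions made for the example, and the field layout follows the `chaos-mesh.org/v1alpha1` schema as commonly documented, so it should be checked against the Chaos Mesh version installed in the cluster.

```python
import json
from typing import Dict


def pod_kill_manifest(name: str, namespace: str, target_labels: Dict[str, str]) -> dict:
    """Build a PodChaos resource that kills one pod matching the given labels.

    Hypothetical helper for illustration; not taken from CACOP's code.
    """
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "PodChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "action": "pod-kill",
            "mode": "one",  # pick a single matching pod
            "selector": {
                "namespaces": [namespace],
                "labelSelectors": target_labels,
            },
        },
    }


# Example values are placeholders, not names used by the project.
manifest = pod_kill_manifest("demo-pod-kill", "default", {"app": "demo"})
print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed manifest can be piped to `kubectl apply -f -` on a Minikube cluster that has Chaos Mesh installed.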