CACOP

SOLO · ONGOING

Chaos engineering cost analysis for Kubernetes

Overview

Most teams know that downtime is expensive, but few can say exactly how much a single pod crash costs in wasted CPU, memory spikes, and recovery time. CACOP puts a number on it.

The system injects controlled failures into a local Kubernetes cluster using Chaos Mesh: pod kills, network partitions, and CPU stress tests. While an experiment runs, a FastAPI backend collects metrics from Prometheus, correlates resource spikes with the injected failure, and converts the resulting inefficiency into simulated dollar costs using configurable cloud pricing models.
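
As a rough illustration of the cost-conversion step, the sketch below integrates resource usage above a pre-failure baseline and prices the excess. It assumes the metric samples have already been fetched from Prometheus as (timestamp, value) pairs. The `Pricing` class, its rates, the baselines, and the function names are all hypothetical; this is not CACOP's actual code, and the rates are placeholders rather than real cloud prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical pricing model; supply your own rates."""
    usd_per_cpu_core_hour: float
    usd_per_gib_hour: float


def excess_usage(samples, baseline):
    """Integrate usage above `baseline` over time with the trapezoidal rule.

    samples: list of (unix_seconds, value) pairs sorted by time.
    Returns the excess in value-seconds (e.g. core-seconds or GiB-seconds).
    """
    total = 0.0
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        e0 = max(v0 - baseline, 0.0)  # only usage above baseline counts
        e1 = max(v1 - baseline, 0.0)
        total += (e0 + e1) / 2.0 * (t1 - t0)
    return total


def failure_cost(cpu_samples, mem_gib_samples, cpu_baseline, mem_baseline, pricing):
    """Convert excess CPU and memory during a failure window into simulated USD."""
    cpu_core_hours = excess_usage(cpu_samples, cpu_baseline) / 3600.0
    mem_gib_hours = excess_usage(mem_gib_samples, mem_baseline) / 3600.0
    return (cpu_core_hours * pricing.usd_per_cpu_core_hour
            + mem_gib_hours * pricing.usd_per_gib_hour)
```

Calling `failure_cost` with the samples recorded between the start of an injected failure and full recovery gives one simulated cost per experiment. Measuring against a baseline, rather than pricing total usage, is one way to attribute only the spike to the failure; other attribution schemes are equally plausible.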

The PLG stack (Prometheus, Loki, Grafana) monitors the cluster in real time, and the dashboard lets you compare costs across different failure scenarios. It turns chaos engineering from "did the system recover?" into "how much did that recovery cost?"
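
The scenario comparison could be sketched as follows, assuming each experiment run has already been reduced to a simulated dollar cost and a recovery time. `ScenarioResult`, its fields, and both helper functions are hypothetical names chosen for illustration, not part of CACOP's documented interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioResult:
    """Hypothetical summary of one chaos experiment."""
    scenario: str              # e.g. "pod-kill", "network-partition"
    simulated_cost_usd: float
    recovery_seconds: float


def rank_by_cost(results):
    """Return the results ordered from most to least expensive."""
    return sorted(results, key=lambda r: r.simulated_cost_usd, reverse=True)


def cost_per_recovery_second(result):
    """Normalize cost by recovery time so slow and fast recoveries compare fairly."""
    if result.recovery_seconds <= 0:
        return 0.0  # avoid division by zero for instant or unmeasured recoveries
    return result.simulated_cost_usd / result.recovery_seconds
```

Ranking by absolute cost answers which failure type hurts most, while the per-second figure hints at whether a scenario is expensive because recovery is slow or because the spike itself is large.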