Build SLO dashboards in a day

A lightweight recipe for rolling out error budget dashboards that engineers actually check.

SLOs lose their power if nobody sees them. We standardized a small set of dashboards that can be rolled out in a day for any new service.

Start with one golden signal

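Pick a single request path or queue that represents real user outcomes. Define success criteria for it, then calculate the rolling error rate over a 28-day window. Because 28 days is exactly four weeks, every window contains the same mix of weekdays and weekends, which smooths out weekend traffic swings.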
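As a rough illustration of the rolling calculation, here is a minimal Python sketch. It assumes you already have per-day counts of failed and total requests for the chosen path; the function name `rolling_error_rate` and the input shape are hypothetical, not part of any particular monitoring tool.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

WINDOW_DAYS = 28  # four full weeks, so every weekday appears equally often


def rolling_error_rate(
    daily_counts: Iterable[Tuple[int, int]],
    window_days: int = WINDOW_DAYS,
) -> Iterator[float]:
    """Yield the error rate over the trailing window after each day.

    daily_counts holds (failed_requests, total_requests) per day, oldest first.
    """
    window: Deque[Tuple[int, int]] = deque()
    failed_sum = 0
    total_sum = 0
    for failed, total in daily_counts:
        window.append((failed, total))
        failed_sum += failed
        total_sum += total
        # Drop the oldest day once the window is longer than window_days.
        if len(window) > window_days:
            old_failed, old_total = window.popleft()
            failed_sum -= old_failed
            total_sum -= old_total
        # Weight by request volume; report 0.0 when there was no traffic.
        yield failed_sum / total_sum if total_sum else 0.0
```

The rate is weighted by request count rather than averaged per day, so a quiet Sunday with a handful of failures does not distort the result.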

Visualize budgets, not just uptime

Our default Grafana panel shows the remaining error budget, the current burn rate, and a forecasted exhaustion date. A separate panel marks the last five deploys so regressions can be lined up against releases.
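The three numbers on that panel can be derived from an SLO target and two error rates. The sketch below is one way to do it, under stated assumptions: burn rate is taken as the observed error rate divided by the allowed error rate, and the exhaustion date is a simple linear extrapolation of the current burn rate. The post does not specify how the forecast is computed, and `BudgetStatus` and `budget_status` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class BudgetStatus:
    remaining_fraction: float          # share of the window's budget still unspent
    burn_rate: float                   # 1.0 means spending exactly at the allowed pace
    exhaustion_at: Optional[datetime]  # None if nothing is being burned right now


def budget_status(
    slo_target: float,          # e.g. 0.99 for a 99% success objective
    window_error_rate: float,   # error rate over the full SLO window
    recent_error_rate: float,   # error rate over a short recent period
    now: datetime,
    window_days: int = 28,
) -> BudgetStatus:
    allowed = 1.0 - slo_target  # allowed error rate implied by the SLO
    if allowed <= 0:
        raise ValueError("slo_target must be below 1.0")
    remaining = 1.0 - window_error_rate / allowed
    burn_rate = recent_error_rate / allowed
    if remaining <= 0:
        exhaustion_at = now      # budget is already spent
    elif burn_rate <= 0:
        exhaustion_at = None     # no errors, so no forecast
    else:
        # Assumption: linear extrapolation. At burn rate b a full budget
        # lasts window_days / b days, so the remaining share lasts this long.
        days_left = remaining * window_days / burn_rate
        exhaustion_at = now + timedelta(days=days_left)
    return BudgetStatus(remaining, burn_rate, exhaustion_at)
```

For example, with a 99% target, a 0.5% error rate over the window, and a 2% error rate recently, half the budget remains, the burn rate is about 2x, and the forecast lands roughly seven days out.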

Close the loop with alerts

We wire up two alerts: a slow-burn notification at a 2x burn rate during business hours, and a hard page at 14x at any time of day. Both link to the dashboard and the relevant runbooks.
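The routing logic for the two alerts can be sketched in a few lines of Python. This is an illustration only: the thresholds come from the text above, but the definition of business hours (Monday to Friday, 09:00 to 17:00 local time), the choice of "at or above" for the comparisons, and the names `alert_for` and `in_business_hours` are assumptions.

```python
from datetime import datetime
from typing import Optional

SLOW_BURN_THRESHOLD = 2.0   # notify during business hours
FAST_BURN_THRESHOLD = 14.0  # page at any time


def in_business_hours(now: datetime) -> bool:
    # Assumption: Monday to Friday, 09:00 up to (not including) 17:00.
    return now.weekday() < 5 and 9 <= now.hour < 17


def alert_for(burn_rate: float, now: datetime) -> Optional[str]:
    """Return "page", "notify", or None for the given burn rate."""
    # The fast-burn check comes first so a severe burn always pages,
    # even when it also satisfies the slow-burn condition.
    if burn_rate >= FAST_BURN_THRESHOLD:
        return "page"
    if burn_rate >= SLOW_BURN_THRESHOLD and in_business_hours(now):
        return "notify"
    return None
```

In practice this decision usually lives in the alerting system's own rule language rather than in application code; the sketch only shows the intended behavior.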

With shared templates, teams can measure reliability quickly without reinventing the tooling for every service.