SaaS · 2023 · Confidential (public SaaS company)

Halo Ops

AI incident-response copilot for on-call engineers

[ Image: Dark AI Ops command-center visualization for Halo Ops ]
01  Mean time to ack: 14m → 90s
02  Mean time to resolve: -63%
03  On-call burnout: Healthy

[ Overview ]

Halo Ops is an incident-response copilot that triages alerts, drafts postmortems and runs remediation runbooks — turning 2am pages into 6-minute resolutions and giving on-call engineers a real night's sleep.

01

The challenge

Alert fatigue had on-call engineers burning out within weeks. Postmortems lagged incidents by entire sprints. Runbooks had drifted so far that nothing on the wiki matched production.

02

Our approach

We wired Claude into the alerting stack via Temporal workflows, gave it scoped tool access to PagerDuty, Grafana and the deployment system, and forced every action through a typed runbook contract that can be replayed and audited.
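
To make the idea of a typed runbook contract concrete, here is a minimal sketch in plain Python. It is illustrative only and not the production code: the names `Tool`, `RunbookStep`, `RunbookContract` and `run` are hypothetical, the Temporal workflow layer is omitted, and the executor is a stand-in function rather than a real PagerDuty, Grafana or deploy-system client.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Tool(Enum):
    """Tools the agent may be scoped to (hypothetical names)."""
    PAGERDUTY = "pagerduty"
    GRAFANA = "grafana"
    DEPLOY = "deploy"


@dataclass(frozen=True)
class RunbookStep:
    """One typed action the agent asks to perform."""
    tool: Tool
    action: str
    params: dict = field(default_factory=dict)


@dataclass(frozen=True)
class RunbookContract:
    """Declares which (tool, action) pairs a runbook is allowed to use."""
    name: str
    allowed: frozenset

    def validate(self, step: RunbookStep) -> None:
        # Reject anything outside the declared scope before it can run.
        if (step.tool, step.action) not in self.allowed:
            raise PermissionError(
                f"{self.name}: {step.tool.value}.{step.action} is out of scope"
            )


def run(contract: RunbookContract, steps: list,
        execute: Callable[[RunbookStep], str]) -> list:
    """Validate each step against the contract, execute it, and record it."""
    trace = []
    for step in steps:
        contract.validate(step)
        result = execute(step)
        trace.append({
            "tool": step.tool.value,
            "action": step.action,
            "params": step.params,
            "result": result,
        })
    return trace


# Example: a runbook that may acknowledge a page and restart a service,
# but has no permission to do anything else.
contract = RunbookContract(
    name="restart-stuck-worker",
    allowed=frozenset({(Tool.PAGERDUTY, "acknowledge"), (Tool.DEPLOY, "restart")}),
)


def fake_execute(step: RunbookStep) -> str:
    # Stand-in executor; a real one would call the scoped tool's API.
    return "ok"
```

Under these assumptions, a step such as `RunbookStep(Tool.DEPLOY, "rollback")` raises `PermissionError` before any executor is called, so scope is enforced by the contract rather than by the model's judgment.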

03

The outcome

Mean time to acknowledge dropped from 14 minutes to 90 seconds, mean time to resolve fell 63%, and on-call burnout scores returned to healthy ranges for the first time in two years.

04

What we built

  • Temporal-orchestrated agent with typed runbook contracts
  • Scoped tool access: PagerDuty, Grafana, deploy system
  • Auto-drafted postmortems linked to evidence
  • Replayable, auditable action trace per incident
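
One way a replayable, auditable action trace could work is sketched below. This is an assumption for illustration, not a description of the actual build: it uses a hash-chained, append-only log (a technique swapped in here to show tamper evidence), and the function names `append_entry`, `verify` and `replay` are hypothetical.

```python
import hashlib
import json


def _digest(body: dict) -> str:
    # Stable hash of an entry's contents (keys sorted for determinism).
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(trace: list, action: str, params: dict, result: str) -> dict:
    """Append an action to the trace, chaining it to the previous entry's hash."""
    body = {
        "seq": len(trace),
        "action": action,
        "params": params,
        "result": result,
        "prev": trace[-1]["hash"] if trace else "",
    }
    entry = {**body, "hash": _digest(body)}
    trace.append(entry)
    return entry


def verify(trace: list) -> bool:
    """Audit check: every entry must hash correctly and link to its predecessor."""
    prev_hash = ""
    for seq, entry in enumerate(trace):
        body = {k: v for k, v in entry.items() if k != "hash"}
        if (entry["seq"] != seq or entry["prev"] != prev_hash
                or entry["hash"] != _digest(body)):
            return False
        prev_hash = entry["hash"]
    return True


def replay(trace: list, handlers: dict) -> list:
    """Re-run each recorded action and return the seq numbers whose results differ."""
    mismatches = []
    for entry in trace:
        result = handlers[entry["action"]](entry["params"])
        if result != entry["result"]:
            mismatches.append(entry["seq"])
    return mismatches
```

In this sketch, editing any recorded entry after the fact makes `verify` return `False`, and `replay` against deterministic handlers (for example, a staging environment) flags the steps whose outcome no longer matches what was recorded during the incident.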