Iter 37 — adds a fleet-summary gauge to the iter-36 Prometheus
exposition. Ops dashboards now answer "how many leaders / followers
/ no-sync nodes are there right now" in one scrape, without having
to scrape every per-node series and aggregate client-side.
    # HELP wifi_densepose_mesh_node_total Per-state node count across the fleet
    # TYPE wifi_densepose_mesh_node_total gauge
    wifi_densepose_mesh_node_total{state="leader"} 1
    wifi_densepose_mesh_node_total{state="follower"} 2
    wifi_densepose_mesh_node_total{state="no_sync"} 0
- leader / follower split derived from snapshot.is_leader
- no_sync = total_nodes_in_state - nodes_with_snapshot
  (so a node that has sent CSI frames but never a sync packet
  shows up here, which is what an operator wants to alert on)
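A minimal sketch of how the three samples could be rendered from those
counts. `render_fleet_gauge` and its parameters are hypothetical names
for illustration, not the handler's actual code. It assumes every node
with a snapshot is either a leader or a follower, so
nodes_with_snapshot = leaders + followers.

```rust
/// Hypothetical helper: renders the fleet-summary gauge in Prometheus
/// text exposition format.
///
/// `total_nodes` is the number of nodes tracked in state; `leaders` and
/// `followers` are counted only among nodes that have a sync snapshot.
fn render_fleet_gauge(total_nodes: usize, leaders: usize, followers: usize) -> String {
    // Assumption: a node with a snapshot is exactly one of leader/follower.
    let nodes_with_snapshot = leaders + followers;
    // saturating_sub keeps the gauge at 0 instead of panicking on underflow
    // if the snapshot count ever exceeds the tracked total.
    let no_sync = total_nodes.saturating_sub(nodes_with_snapshot);

    let mut out = String::new();
    out.push_str("# HELP wifi_densepose_mesh_node_total Per-state node count across the fleet\n");
    out.push_str("# TYPE wifi_densepose_mesh_node_total gauge\n");
    for (state, count) in [("leader", leaders), ("follower", followers), ("no_sync", no_sync)] {
        // `{{` and `}}` are escaped literal braces around the label set.
        out.push_str(&format!(
            "wifi_densepose_mesh_node_total{{state=\"{state}\"}} {count}\n"
        ));
    }
    out
}
```

Emitting all three states on every scrape, including zeros, keeps the
series continuous so a `> 0` alert expression always has data to
evaluate.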
Implementation factored as a free function `fleet_role_counts` so the
math is testable without spinning up the axum handler. Same pattern
iter 18 (update_csi_fps_ema) and iter 30 (sync_snapshot) used.
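For illustration, the free function might look roughly like this. The
`SyncSnapshot` struct below is a hypothetical stand-in (only the
`is_leader` field comes from this change), and the real signature may
differ.

```rust
/// Hypothetical stand-in for the per-node sync snapshot. Only
/// `is_leader` is taken from the commit description.
#[derive(Debug, Clone, Copy)]
struct SyncSnapshot {
    is_leader: bool,
}

/// Counts (leaders, followers) over the nodes that have a snapshot.
/// Pure function: no handler state, no I/O, so it can be unit-tested
/// directly.
fn fleet_role_counts<'a, I>(snapshots: I) -> (usize, usize)
where
    I: IntoIterator<Item = &'a SyncSnapshot>,
{
    snapshots
        .into_iter()
        .fold((0, 0), |(leaders, followers), snap| {
            if snap.is_leader {
                (leaders + 1, followers)
            } else {
                (leaders, followers + 1)
            }
        })
}
```

Taking any `IntoIterator` of snapshot references lets the caller pass a
slice, a `Vec`, or the values of a map without copying.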
Test added (9/9 sync_snapshot_helper_tests now green):
fleet_role_counts_classifies_correctly
Three cases, expected result as (leaders, followers):
- empty fleet → (0, 0)
- 1 leader + 2 followers → (1, 2)
- all-leaders edge case → (2, 0) (election prevents this in
  practice but the gauge math must still be consistent)
Useful Grafana queries this unlocks:
- sum(wifi_densepose_mesh_node_total{state="follower"})
→ total reachable follower count
- wifi_densepose_mesh_node_total{state="no_sync"} > 0
→ alert when any node is sending CSI frames but has no sync snapshot
Co-Authored-By: claude-flow <ruv@ruv.net>