2 Commits

Author SHA1 Message Date
ruv 21ec163941 fix(swarm): resolve 16 bugs from deep review of ADR-062
CRITICAL:
- Delete stale nvs_provision.bin before provisioning each node
- Fix log filename mismatch: swarm_health.py now finds qemu_node{i}.log
  with node_{i}.log fallback
- CI swarm-test job builds firmware instead of downloading missing artifact
- Accept both qemu_flash.bin and qemu_flash_base.bin as base image
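
The log filename fallback described above could look roughly like the sketch below. Only the two filenames come from this commit message. The helper name `find_node_log` and the `log_dir` parameter are hypothetical, and the actual lookup in swarm_health.py may differ.

```python
from pathlib import Path
from typing import Optional


def find_node_log(log_dir: Path, index: int) -> Optional[Path]:
    """Return the log file for node `index`, or None if neither name exists.

    Hypothetical helper: only the two filenames are taken from the commit
    message; everything else is an assumption for illustration.
    """
    # Order encodes preference: the primary name first, then the fallback.
    for name in (f"qemu_node{index}.log", f"node_{index}.log"):
        candidate = log_dir / name
        if candidate.is_file():
            return candidate
    return None  # the caller decides how to report a missing log
```

Returning None instead of raising leaves it to the caller to decide whether a missing log is a failure or a warning.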

HIGH:
- Replace broad "heap" substring match with precise regex patterns
  (HEAP_ERROR, heap_caps_alloc.*failed, etc.) to avoid false positives
- Guard os.geteuid() with hasattr for Windows compatibility
- Offset SLIRP ports by +100 to avoid collision with aggregator on 5005
- Assertions now WARN (not vacuous PASS) when no parseable data is found
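
As a minimal sketch of three of the HIGH items (precise heap patterns, the os.geteuid() guard, and WARN on empty input): the two regexes are the ones named in this message, and the "etc." means the real list is probably longer. The function names and the "PASS"/"WARN"/"FAIL" return strings are assumptions, not the code in swarm_health.py.

```python
import os
import re
from typing import Iterable

# Patterns named in the commit message; the real list likely has more entries.
HEAP_ERROR_PATTERNS = [
    re.compile(r"HEAP_ERROR"),
    re.compile(r"heap_caps_alloc.*failed"),
]


def is_heap_error(line: str) -> bool:
    """True only for lines matching a known heap-failure pattern."""
    # A bare `"heap" in line` test would also flag benign "free heap" reports.
    return any(p.search(line) for p in HEAP_ERROR_PATTERNS)


def running_as_root() -> bool:
    """Check for root without crashing on platforms that lack os.geteuid."""
    # os.geteuid is not available on Windows, so probe for it first.
    return hasattr(os, "geteuid") and os.geteuid() == 0


def check_no_heap_errors(lines: Iterable[str]) -> str:
    """Hypothetical assertion: WARN, not PASS, when there is nothing to inspect."""
    lines = list(lines)
    if not lines:
        return "WARN"  # no parseable data, so a PASS would be vacuous
    return "FAIL" if any(is_heap_error(line) for line in lines) else "PASS"
```

With these patterns, a line such as "free heap: 204800 bytes" is no longer reported as an error, while "heap_caps_alloc(1024) failed" still is.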

MEDIUM:
- Mark network_partitioned_recovery as "(future)" in ADR-062
- Fix node_id prefix dedup bug (node_1 no longer matches node_10)
- Add duplication note in qemu_swarm.py pointing to swarm_health.py
- Document implicit TDM auto-assignment in ADR YAML schema
- swarm_health.py only checks sensor nodes for frame production
- Fix channel 0 being treated as falsy
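
The node_id prefix bug and the channel 0 bug are both common Python pitfalls. The sketch below illustrates the general pattern of each fix. The function names, the `node_{id}` log format, and the `default` parameter are assumptions for illustration, and the actual changes in the repository may be shaped differently.

```python
import re
from typing import Optional


def line_mentions_node(line: str, node_id: int) -> bool:
    """Match `node_<id>` exactly, not as a prefix of a longer id."""
    # The negative lookahead (?!\d) rejects a following digit,
    # so node_1 no longer matches inside node_10.
    return re.search(rf"node_{node_id}(?!\d)", line) is not None


def resolve_channel(configured: Optional[int], default: int) -> int:
    """Return the configured channel, treating 0 as a valid value."""
    # `configured or default` would silently replace channel 0 with the
    # default, because 0 is falsy; compare against None instead.
    return default if configured is None else configured
```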

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-14 12:36:25 -04:00
ruv a8f5276d9b feat(qemu): ADR-062 QEMU swarm configurator for multi-ESP32 testing
YAML-driven orchestrator for testing multiple ESP32-S3 QEMU instances
with configurable topologies (star/mesh/line/ring), role-based nodes
(sensor/coordinator/gateway), and swarm-level health assertions.

New files:
- ADR-062: architecture decision record
- qemu_swarm.py: main orchestrator (1097 lines)
  - YAML config parsing with schema validation
  - 4 topology implementations with TAP/SLIRP fallback
  - Per-node NVS provisioning via provision.py --dry-run
  - Signal-safe cleanup, dry-run mode, JSON results output
- swarm_health.py: 9-assertion health oracle (653 lines)
- 7 preset configs: smoke (2n/15s), standard (3n/60s),
  large-mesh (6n/90s), line-relay (4n/60s), ring-fault (4n/75s),
  heterogeneous (5n/90s), ci-matrix (3n/30s)
- CI: swarm-test job in firmware-qemu.yml

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-14 12:24:06 -04:00