fix(firmware): add vTaskDelay(1) yields in process_frame() at tier>=2 to fix WDT storm (#683)

At edge tier>=2 on N16R8 PSRAM boards, `process_frame()` runs
`update_multi_person_vitals()` (4 persons × 256 history samples) plus
`wasm_runtime_on_frame()` back-to-back before returning to `edge_task()`.
The existing `vTaskDelay(1)` in `edge_task()` only fires *after*
`process_frame()` returns — under sustained 30 pps CSI load on PSRAM
boards this leaves IDLE1 on Core 1 starved long enough for the 5-second
Task Watchdog Timer to fire.

Fix: add two `vTaskDelay(1)` calls inside `process_frame()`, both gated
on `s_cfg.tier >= 2`:
1. After `update_multi_person_vitals()` (Step 11)
2. After `wasm_runtime_on_frame()` dispatch (Step 14)

Tier 0/1 paths are unaffected. Validated on COM7 (N16R8 board):
`Edge DSP task started on core 1 (tier=2)`, no WDT panics in 20 s.

Also bump firmware version 0.6.5 → 0.6.6 and refresh all 6 release_bins
with the new build (8MB + 4MB variants, built 2026-05-21).

Fix-marker RuView#683 added to scripts/fix-markers.json.

Co-Authored-By: claude-flow <ruv@ruv.net>
This commit is contained in:
ruv 2026-05-21 09:20:21 -04:00
parent cbcb389cb6
commit c58f49f21a
8 changed files with 19 additions and 4 deletions

View File

@ -849,6 +849,8 @@ static void process_frame(const edge_ring_slot_t *slot)
/* --- Step 11: Multi-person vitals --- */
update_multi_person_vitals(slot->iq_data, n_subcarriers, sample_rate);
/* Yield after multi-person DSP so IDLE1 can feed Core 1 watchdog (#683). */
if (s_cfg.tier >= 2) vTaskDelay(1);
/* --- Step 12: Delta compression --- */
if (s_cfg.tier >= 2) {
@ -894,6 +896,8 @@ static void process_frame(const edge_ring_slot_t *slot)
wasm_runtime_on_frame(phases, amplitudes, variances,
n_subcarriers,
(const edge_vitals_pkt_t *)&s_latest_pkt);
/* Yield after WASM dispatch to feed Core 1 watchdog (#683). */
vTaskDelay(1);
}
}

View File

@ -1,3 +1,3 @@
0.6.5
git-sha: d72e06fc8
built: 2026-05-20
0.6.6
git-sha: cbcb389cb (pre-commit)
built: 2026-05-21

View File

@ -1 +1 @@
0.6.5
0.6.6

View File

@ -222,6 +222,17 @@
"forbid": ["/csi_collector_init.*node_id\\s*=\\s*1[^0-9]/"],
"rationale": "release_bins/ shipped v0.4.3.1 binaries that lacked csi_collector_set_node_id() — every provisioned node reported node_id=1 over UDP regardless of NVS value, making a 4-node deployment look like a single node. main.c must call csi_collector_set_node_id(g_nvs_config.node_id) immediately after nvs_config_load() and before wifi_init_sta(). Reverting silently breaks multi-node deployments with no build-time error.",
"ref": "https://github.com/ruvnet/RuView/issues/679"
},
{
"id": "RuView#683",
"title": "ESP32-S3 edge tier>=2: vTaskDelay(1) after multi-person vitals and WASM dispatch prevents IDLE1 starvation / WDT storm",
"files": ["firmware/esp32-csi-node/main/edge_processing.c"],
"require": [
"if (s_cfg.tier >= 2) vTaskDelay(1);",
"Yield after WASM dispatch to feed Core 1 watchdog (#683)"
],
"rationale": "At edge tier>=2 on N16R8 PSRAM boards, process_frame() runs update_multi_person_vitals() (4 persons × 256 history samples) plus wasm_runtime_on_frame() back-to-back. The vTaskDelay(1) in edge_task() only fires AFTER process_frame() fully returns — if process_frame() takes >5 s (common on PSRAM-backed boards under sustained 30 pps CSI load), IDLE1 on Core 1 never runs and the Task Watchdog Timer fires. The fix adds two vTaskDelay(1) calls inside process_frame(), gated on tier>=2, at the multi-person vitals boundary and after WASM dispatch. Removing them re-opens the WDT storm on N16R8 hardware.",
"ref": "https://github.com/ruvnet/RuView/issues/683"
}
]
}