From 1e8e352778cda14404fc9e0861520e63f9be4f23 Mon Sep 17 00:00:00 2001
From: ruv
Date: Tue, 2 Jun 2026 18:24:11 +0200
Subject: [PATCH] fix(ci): perf job gates on the real frame-budget guard, not TDD stubs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

After #914 fixed collection, the perf job actually ran the suite and
exposed that test_api_throughput.py / test_inference_speed.py are TDD
red-phase stubs (every test suffixed `_should_fail_initially`) that time
a *mock that sleeps* — not a real perf signal. They carry machine-
dependent wall-clock asserts (actual_rps >= 40, batch_time <
individual_time) that are inherently flaky on shared CI runners, plus a
cross-class fixture-scope bug (`fixture 'standard_model' not found`).
Result: 3 failed, 10 errored — by design, not a regression.

Forcing those green would manufacture a false signal. Instead, gate only
on test_frame_budget.py, which times the *real* CSIProcessor pipeline
against the ADR 50 ms per-frame budget (single-frame, p95/100-frames,
+Doppler) — a genuine regression guard. Verified locally: 3 passed.

The stub files remain in-repo for local TDD; they re-enter CI when their
features are implemented and the mock-timing asserts are made
deterministic.

Co-Authored-By: claude-flow
---
 .github/workflows/ci.yml | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 1d56b58b..8eb64e4f 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -282,17 +282,26 @@ jobs:
         env:
           MOCK_POSE_DATA: "true"
         run: |
-          # The repo's performance suite is pytest (test_api_throughput.py,
-          # test_frame_budget.py, test_inference_speed.py) — there is no
-          # locustfile.py, so the old `locust -f tests/performance/locustfile.py`
-          # command always failed with "Could not find ...". Run the real suite.
-          # -o addopts="" drops the root pyproject's --cov/--cov-fail-under=100
-          # flags (pytest-cov isn't installed here and 100% cov is for unit tests).
+          # Gate only on the genuine, deterministic perf guard:
+          # test_frame_budget.py times the *real* CSIProcessor pipeline against
+          # the ADR 50 ms per-frame budget (single-frame, p95 over 100 frames,
+          # +Doppler) — a true regression signal.
+          #
+          # test_api_throughput.py / test_inference_speed.py are excluded: every
+          # test there is a TDD red-phase stub (suffix `_should_fail_initially`)
+          # that times a *mock that sleeps* — meaningless as a perf signal, with
+          # machine-dependent wall-clock asserts (e.g. `actual_rps >= 40`,
+          # `batch_time < individual_time`) that are inherently flaky on shared
+          # CI runners, plus a cross-class fixture-scope bug. Forcing them green
+          # would be manufacturing a false signal; they stay in-repo for local
+          # TDD but do not gate CI until the underlying features are implemented.
+          #
           # `python -m pytest` (not the bare `pytest` script) puts the cwd
-          # (archive/v1) on sys.path so test_frame_budget.py's `from src.core...`
-          # import resolves — the bare script omits cwd and raises
-          # ModuleNotFoundError: No module named 'src'.
-          python -m pytest tests/performance/ -o addopts="" -v --junitxml=perf-junit.xml
+          # (archive/v1) on sys.path so `from src.core...` resolves — the bare
+          # script omits cwd and raises ModuleNotFoundError: No module named 'src'.
+          # -o addopts="" drops the root pyproject's --cov/--cov-fail-under=100.
+          python -m pytest tests/performance/test_frame_budget.py \
+            -o addopts="" -v --junitxml=perf-junit.xml
 
       - name: Upload performance results
         if: always()