8.4 KiB
ADR-080: QE Analysis Remediation Plan
- Status: Proposed — P0 security findings #1–#3 RESOLVED on the shipped Rust sensing-server boundary (2026-06-13; closes ADR-164 G11)
- Date: 2026-04-06
- Source: QE Analysis Gist (2026-04-05)
- Full Reports: proffesor-for-testing/RuView
qe-reportsbranch
Context
An 8-agent QE swarm analyzed ~305K lines across Rust, Python, C firmware, and TypeScript on 2026-04-05. The overall score was 55/100 (C+) — Quality Gate FAILED. This ADR captures the findings and establishes a remediation plan.
Decision
Address the 15 prioritized issues from the QE analysis in three waves: P0 (immediate), P1 (this sprint), P2 (this quarter).
Security P0 closure note (2026-06-13) — Rust sensing-server boundary
The three P0 security findings below were logged against the Python v1 API
(archive/v1/src/…). ADR-164 G11 re-scoped them to the shipped boundary:
wifi-densepose-sensing-server (Rust). They were verified against the current
Rust crate and closed on branch fix/adr-080-sensing-server-security. Each fix
(or already-fixed finding) is pinned by a test that fails on the old behavior.
The Python v1 paths remain as-is — v1 is archived and not the shipped
surface; this closure governs the live Rust server only.
P0 — Fix Immediately
1. Rate Limiter Bypass / XFF spoofing (Security HIGH) — RESOLVED (verified absent on Rust boundary)
- Original location (v1):
archive/v1/src/middleware/rate_limit.py:200-206 - Problem: Trusts
X-Forwarded-Forwithout validation. Any client bypasses rate limits via header spoofing. - Rust verification (2026-06-13): The Rust sensing-server has no XFF-trusting control to bypass — there is no IP-based rate-limiter and no IP-allowlist, and neither security middleware reads a forwarded header.
bearer_auth.rsauthenticates on the token alone (require_bearerinspects only theAUTHORIZATIONheader);host_validation.rsdecides on theHostheader only. A repo-wide grep forx-forwarded-for|forwarded|peer_addr|client_ip|real-ipoverwifi-densepose-sensing-serverreturns nothing. The only "rate limiter" is the MQTT sample-rate gate (mqtt/state.rs), a per-entity publish throttle with no IP/header input. - Resolution: No code change needed (no vulnerable surface). Regression tests pin the immunity:
bearer_auth::tests::xff_header_never_affects_auth_decision(spoofed XFF never flips a 401↔200 decision) andhost_validation::tests::forwarded_headers_never_bypass_host_allowlist(spoofedX-Forwarded-Host: localhostnever lets a foreignHost: evil.compast the allowlist). Residual: if an IP-based control is ever added, it must derive the peer from the socket (ConnectInfo<SocketAddr>) and only honor XFF from an explicit--trusted-proxyCIDR — captured as guidance in the test docstrings.
2. Exception Details Leaked in Responses (Security HIGH, CWE-209) — RESOLVED
- Original location (v1):
archive/v1/src/api/routers/pose.py:140,stream.py:297, +5 endpoints - Problem: Internal error/stack-trace detail serialized into client responses.
- Rust finding (2026-06-13): Six handlers in
wifi-densepose-sensing-server/src/main.rsserialized the internal errorDisplayinto the JSON body:edge_registry_endpointreturned a panickedspawn_blockingJoinError("task … panicked") in a500and the raw upstream error in a503;delete_model/delete_recording/start_recordingreturnedstd::io::Errorstrings (OS detail / path);calibration_start/calibration_stopreturned theFieldModelerror chain. - Fix: New
src/error_response.rsmodule —internal_error/internal_error_json/upstream_unavailablelog the full detail server-side only (tagged with a correlation id) and return a generic body ({"error":"internal_error","correlation_id":…}) with nopanicked, no file paths, no Debug chain. All six call-sites rewired. Pinned byerror_response::tests::internal_error_body_does_not_leak_detail(leak-substring guard, verified to fail on the reverted old body) + 4 sibling tests.
3. WebSocket JWT in URL (Security HIGH, CWE-598) — RESOLVED (verified absent on Rust boundary)
- Original location (v1):
archive/v1/src/api/routers/stream.py:74,archive/v1/src/middleware/auth.py:243 - Problem: Tokens in query strings visible in logs/proxies/browser history.
- Rust verification (2026-06-13): The Rust sensing-server never reads a token from the URL.
require_bearer(bearer_auth.rs) inspects only theAuthorizationheader; the WebSocket handlers (ws_sensing_handler/ws_introspection_handler/ws_pose_handler) take a bareWebSocketUpgradewith noQueryextractor; the singleQueryin the crate (EdgeRegistryParams) is a non-secretrefreshflag. - Resolution: No code change needed (no query-token path exists). Regression test
bearer_auth::tests::query_string_token_is_never_acceptedproves?token=/?access_token=in the URL never authenticates (stays401) while the same token in the header succeeds (200) — verified to fail if a query-token path is re-introduced.
4. Rust Tests Not in CI
- Problem: 2,618 tests across 153K lines of Rust — zero run in any GitHub Actions workflow. Regressions ship undetected.
- Fix: Add
cargo test --workspace --no-default-featuresto CI. 1-2 hour task.
5. WebSocket Path Mismatch (Bug)
- Location:
ui/mobile/src/services/ws.service.ts:104constructs/ws/sensing, butconstants/websocket.ts:1definesWS_PATH = '/api/v1/stream/pose'. - Problem: Mobile WebSocket silently fails.
- Fix: Align paths. Verify which endpoint the server actually serves.
P1 — Fix This Sprint
| # | Issue | Location | Impact |
|---|---|---|---|
| 6 | God file: 4,846 lines, CC=121 | sensing-server/src/main.rs |
Untestable monolith |
| 7 | O(L×V) voxel scan per frame | ruvsense/tomography.rs:345-383 |
~10ms wasted; use DDA ray march |
| 8 | Sequential neural inference | wifi-densepose-nn inference.rs:334-336 |
2-4× GPU latency penalty |
| 9 | 720 .unwrap() in Rust |
Workspace-wide | Each = potential panic in RT paths |
| 10 | 112KB alloc/frame in Python | csi_processor.py:412-414 |
Deque→list→numpy every frame |
P2 — Fix This Quarter
| # | Issue | Impact |
|---|---|---|
| 11 | 11/12 Python modules have zero unit tests (12,280 LOC) | Services, middleware, DB untested |
| 12 | Firmware at 19% coverage (WASM runtime, OTA, swarm) | Security-critical code untested |
| 13 | MAT screen auto-falls back to simulated data | Disaster responders could monitor fake data |
| 14 | Token blacklist never consulted during auth | Revoked tokens remain valid |
| 15 | 50ms frame budget never benchmarked | Real-time requirement unverified |
Bright Spots
- 79 ADRs (exceptional governance)
- Witness bundle system (ADR-028) with SHA-256 proof
- 2,618 Rust tests with mathematical rigor
- Daily security scanning (Bandit, Semgrep, Safety)
- Ed25519 WASM signature verification on firmware
- Clean mobile state management with good test coverage
Full QE Reports (9 files, 4,914 lines)
| Report | What it covers |
|---|---|
EXECUTIVE-SUMMARY.md |
Top-level synthesis with all scores and priority matrix |
00-qe-queen-summary.md |
Master coordination, quality posture, test pyramid |
01-code-quality-complexity.md |
Cyclomatic complexity, code smells, top 20 hotspots |
02-security-review.md |
15 security findings (3 HIGH, 7 MEDIUM), OWASP coverage |
03-performance-analysis.md |
23 perf findings (4 CRITICAL), frame budget analysis |
04-test-analysis.md |
3,353 tests inventoried, duplication, quality grading |
05-quality-experience.md |
API/CLI/Mobile/DX UX assessment |
06-product-assessment-sfdipot.md |
SFDIPOT analysis, 57 test ideas, 14 session charters |
07-coverage-gaps.md |
Coverage matrix, top 20 risk gaps, 8-week roadmap |
Consequences
- P0 fixes eliminate 3 security vulnerabilities and 2 functional bugs
- P1 fixes improve performance, reliability, and maintainability
- P2 fixes close coverage gaps and harden the system for production
- Target score improvement: 55 → 75+ after P0+P1 completion
Generated from QE swarm analysis (fleet-02558e91) on 2026-04-05