Closes the Dense placeholder from earlier commits. Both backends now
implement forward(); only SparseGqa supports streaming via
step()/KvCache, a structural gap that dense MHA cannot bridge by design.
Dense path:
- src/dense.rs new — DenseHead wraps upstream dense_attention. Stores
causal flag and (cloned) config. forward() is a one-line delegation;
no GQA dispatch (dense_attention upstream requires q_heads == kv_heads).
- AetherTemporalHead::Dense changed from a unit variant to Dense(DenseHead).
Construction succeeds for any valid TemporalHeadConfig where backend
is Dense.
- AetherTemporalHead.step() returns BackendDoesNotSupportStreaming for
Dense: there is no dense-MHA-with-KV-cache equivalent, and offering
one would silently undercut the ADR-096 §3.2 structural argument.
- AetherTemporalHead.make_cache() returns the same error for Dense,
since there is no cache to size for a dense kernel.
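
A minimal sketch of the dispatch described above, not the crate's actual
code. The type bodies, the KvCache fields, and the make_cache()/step()
signatures are assumptions made for illustration; only the variant and
error names come from this commit.

```rust
// Hypothetical stand-ins for the crate's real types; only the control flow matters.
#[derive(Debug, PartialEq)]
enum TemporalError {
    BackendDoesNotSupportStreaming,
}

#[allow(dead_code)]
struct DenseHead {
    causal: bool, // the commit says DenseHead stores a causal flag
}

struct SparseGqaHead;

// Assumed cache layout: a capacity plus the number of steps appended so far.
struct KvCache {
    capacity: usize,
    len: usize,
}

enum AetherTemporalHead {
    Dense(DenseHead),
    SparseGqa(SparseGqaHead),
}

impl AetherTemporalHead {
    // Assumed signature: Dense has no cache to size, so it reports the typed error.
    fn make_cache(&self, capacity: usize) -> Result<KvCache, TemporalError> {
        match self {
            AetherTemporalHead::Dense(_) => Err(TemporalError::BackendDoesNotSupportStreaming),
            AetherTemporalHead::SparseGqa(_) => Ok(KvCache { capacity, len: 0 }),
        }
    }

    // Assumed signature: a real step() would also take the new token's Q/K/V.
    // Here the sparse arm only records that one more position was appended.
    fn step(&self, cache: &mut KvCache) -> Result<(), TemporalError> {
        match self {
            AetherTemporalHead::Dense(_) => Err(TemporalError::BackendDoesNotSupportStreaming),
            AetherTemporalHead::SparseGqa(_) => {
                cache.len += 1;
                Ok(())
            }
        }
    }
}
```

Returning an error from both entry points, rather than emulating a cache
for Dense, keeps the streaming gap visible in the type signature.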
Errors:
- New TemporalError::BackendDoesNotSupportStreaming variant covers
the Dense-step / Dense-make_cache cases. Specific so callers can
fall back to forward() instead of giving up entirely.
- TemporalError::DenseBackendNotImplemented retained for v0.1
back-compat (no consumers depend on it after this commit, but
removing a public variant is a breaking change). Future work can
deprecate it once downstream callers move off.
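
One way a caller might use the specific variant to fall back to
forward(), sketched under assumptions: run_with_fallback, Path, and the
ShapeMismatch variant are hypothetical names for illustration, and the
closures stand in for real step()/forward() calls.

```rust
#[derive(Debug, PartialEq)]
enum TemporalError {
    BackendDoesNotSupportStreaming,
    ShapeMismatch, // hypothetical second variant, used to show non-fallback errors
}

// Records which path produced the output (illustrative only).
#[derive(Debug, PartialEq)]
enum Path {
    Streamed,
    FullForward,
}

// Try the streaming path first; fall back to a full forward pass only when the
// backend reports that streaming is unsupported. Any other error is propagated.
fn run_with_fallback<S, F>(step: S, forward: F) -> Result<(Path, Vec<f32>), TemporalError>
where
    S: FnOnce() -> Result<Vec<f32>, TemporalError>,
    F: FnOnce() -> Result<Vec<f32>, TemporalError>,
{
    match step() {
        Ok(out) => Ok((Path::Streamed, out)),
        Err(TemporalError::BackendDoesNotSupportStreaming) => {
            forward().map(|out| (Path::FullForward, out))
        }
        Err(other) => Err(other),
    }
}
```

Matching on the one variant, instead of treating every error as a cue to
fall back, avoids masking genuine failures such as a shape mismatch.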
Tests (19/19 passing):
- dense_backend_returns_typed_error → renamed and rewritten as
dense_backend_forward_runs_with_matching_shape: constructs a Dense
head, runs forward over (32, 4, 4, 16) Q/K/V, asserts output shape.
- New dense_backend_step_returns_streaming_error: constructs Dense,
attempts make_cache, expects BackendDoesNotSupportStreaming.
- The 8 weight-blob, 2 blob e2e, 3 streaming, and 5 other smoke tests
are unchanged and still pass.
This commit completes the ADR-096 §5 A/B gate: callers can now run
the same Q/K/V through both backends and compare outputs / latency.
The §5 four-gate validation (contrastive loss within 1%, rank-1
within 1pp, Spearman ≥0.95, latency ≥5×) is now runnable rather than
a future task, though the actual gate run requires trained AETHER
weights, which is a separate track.
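
For orientation, the four thresholds could be checked along these lines.
This is a sketch under stated assumptions, not the ADR's harness:
BackendMetrics, GateReport, and evaluate_gates are hypothetical names,
Dense is assumed to be the baseline, "within" is read as a two-sided
bound, and the latency gate is read as the sparse backend being at least
5× faster than the dense one.

```rust
// Hypothetical per-backend measurements; field names are illustrative.
struct BackendMetrics {
    contrastive_loss: f64,
    rank1_accuracy_pct: f64, // rank-1 accuracy in percent (0 to 100)
    latency_ms: f64,
}

#[derive(Debug, PartialEq)]
struct GateReport {
    loss_ok: bool,
    rank1_ok: bool,
    spearman_ok: bool,
    latency_ok: bool,
}

impl GateReport {
    fn all_pass(&self) -> bool {
        self.loss_ok && self.rank1_ok && self.spearman_ok && self.latency_ok
    }
}

// `spearman` is the rank correlation between the two backends' outputs,
// assumed to be computed elsewhere.
fn evaluate_gates(dense: &BackendMetrics, sparse: &BackendMetrics, spearman: f64) -> GateReport {
    // Relative loss difference against the dense baseline (assumed two-sided).
    let loss_rel =
        (sparse.contrastive_loss - dense.contrastive_loss).abs() / dense.contrastive_loss.abs();
    // Absolute difference in percentage points.
    let rank1_diff_pp = (sparse.rank1_accuracy_pct - dense.rank1_accuracy_pct).abs();
    // Speedup of sparse over dense (assumed direction).
    let speedup = dense.latency_ms / sparse.latency_ms;

    GateReport {
        loss_ok: loss_rel <= 0.01,
        rank1_ok: rank1_diff_pp <= 1.0,
        spearman_ok: spearman >= 0.95,
        latency_ok: speedup >= 5.0,
    }
}
```

Reporting each gate separately, rather than a single boolean, makes it
clear which threshold failed when the run does not pass.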
Co-Authored-By: claude-flow <ruv@ruv.net>