ADR-120 — Windowed Temporal Classifier (W-MLP)

Status: Accepted
Date: 2026-05-18
Scope: v2/crates/wifi-densepose-sensing-server/src/adaptive_classifier.rs (WindowedMlpModel, train_windowed_mlp_classifier, eval_windowed_mlp, AdaptiveModel::classify_window); main.rs (AppStateInner.feature_window, push_feature_window, adaptive_override switching to window path).

Context

ADR-119 added a small MLP (22 → 32 → 6) that improved accuracy from 49.58% (LogReg) to 53.53%. Its loss flatlined at ~1.15 around epoch 10 of 30, a clear signal that the 22-feature representation had reached its frame-level information ceiling.

The dataset covers 7 recorded activities (6 classes once sitting and standing are merged into present_still) that differ primarily in temporal patterns, not in any single frame:

  • walking step cadence: ~2 Hz (visible in a 0.5-second window)
  • transition (sit-stand): ~0.5 Hz (visible in a 2-second window)
  • waving limb cadence: 1-2 Hz
  • active (jumping): bursty / quasi-periodic at ~3 Hz
  • present_still (sitting + standing merged): no temporal signature

Per frame, walking, active, and waving all look "moving", with similar amplitude std/skew; they are disambiguated only by how the amplitude pattern evolves over 1-2 seconds. A classifier that sees a single frame cannot tell them apart, no matter how good the per-frame features are.

Decisions

D1 — Stack 20 consecutive frames into a 440-d input

WINDOW_FRAMES   = 20  (~2 seconds at ~10 Hz tick rate)
N_FEATURES      = 22  (from ADR-118)
WINDOWED_INPUT  = 20 × 22 = 440
WINDOWED_HIDDEN = 64

Network: 440 → 64 ReLU → n_classes softmax. ~28.6k parameters in total (440×64 + 64 + 64×6 + 6 = 28,614 with 6 classes), larger than the frame-level MLP's ~0.9k (22 → 32 → 6), but still small enough to train in <60s and serialize as JSON.
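The network shape above can be sketched as a plain forward pass. This is a minimal illustration, not the code in adaptive_classifier.rs: the struct name `WindowedMlpSketch`, the `zeros` constructor, and the row-major weight layout are assumptions made for the example.

```rust
/// Hypothetical stand-in for the ADR's WindowedMlpModel; the field layout is assumed.
pub struct WindowedMlpSketch {
    pub w1: Vec<Vec<f64>>, // [hidden][input]
    pub b1: Vec<f64>,      // [hidden]
    pub w2: Vec<Vec<f64>>, // [n_classes][hidden]
    pub b2: Vec<f64>,      // [n_classes]
}

impl WindowedMlpSketch {
    /// All-zero model, useful only for shape checks.
    pub fn zeros(input: usize, hidden: usize, n_classes: usize) -> Self {
        Self {
            w1: vec![vec![0.0; input]; hidden],
            b1: vec![0.0; hidden],
            w2: vec![vec![0.0; hidden]; n_classes],
            b2: vec![0.0; n_classes],
        }
    }

    /// Forward pass: x -> ReLU(W1 x + b1) -> softmax(W2 h + b2).
    pub fn forward(&self, x: &[f64]) -> Vec<f64> {
        let hidden: Vec<f64> = self
            .w1
            .iter()
            .zip(&self.b1)
            .map(|(row, b)| {
                let z: f64 = row.iter().zip(x).map(|(w, xi)| w * xi).sum::<f64>() + b;
                z.max(0.0) // ReLU
            })
            .collect();
        let logits: Vec<f64> = self
            .w2
            .iter()
            .zip(&self.b2)
            .map(|(row, b)| row.iter().zip(&hidden).map(|(w, h)| w * h).sum::<f64>() + b)
            .collect();
        softmax(&logits)
    }
}

/// Numerically stable softmax (subtracts the max logit before exponentiating).
pub fn softmax(logits: &[f64]) -> Vec<f64> {
    let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = logits.iter().map(|l| (l - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}
```

With all-zero weights the output is a uniform distribution over the classes, which makes a convenient shape and normalisation check before real weights are loaded.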

Training samples are built by sliding a 20-frame window with stride 5 within each recording, so consecutive windows overlap by 75% and each frame falls in up to 4 windows. Windows do not cross recording boundaries; each window inherits its source recording's class label.
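The window construction described above can be sketched as follows. The function names `build_windows` and `build_dataset` and the `(label, frames)` tuple layout are hypothetical; the ADR only states the window length, the stride, and that windows stay within one recording.

```rust
const N_FEATURES: usize = 22;

/// Slide a window over one recording and flatten each window into a single
/// vector of `window * N_FEATURES` values, oldest frame first.
pub fn build_windows(
    frames: &[[f64; N_FEATURES]],
    window: usize,
    stride: usize,
) -> Vec<Vec<f64>> {
    assert!(window > 0 && stride > 0, "window and stride must be non-zero");
    if frames.len() < window {
        return Vec::new(); // recording too short for even one window
    }
    (0..=frames.len() - window)
        .step_by(stride)
        .map(|start| {
            frames[start..start + window]
                .iter()
                .flat_map(|f| f.iter().copied())
                .collect()
        })
        .collect()
}

/// Build labelled samples recording by recording, so no window spans two recordings.
pub fn build_dataset(
    recordings: &[(usize, Vec<[f64; N_FEATURES]>)], // (class label, frames)
    window: usize,
    stride: usize,
) -> Vec<(Vec<f64>, usize)> {
    recordings
        .iter()
        .flat_map(|(label, frames)| {
            build_windows(frames, window, stride)
                .into_iter()
                .map(move |w| (w, *label))
        })
        .collect()
}
```

Calling `build_dataset(&recordings, 20, 5)` yields 440-element inputs paired with the label of the recording they came from; recordings shorter than one window contribute no samples.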

On the 6-node 151k-frame set:

  • 7 recordings × ~21.6k frames each = 151k frames total
  • (21.6k − 20) / 5 ≈ 4,300 windows per recording
  • Total: ~30k windowed samples
  • Class balance is roughly preserved (each recording carries one class label), except that present_still merges two recordings and so has about 2× the windows
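The sample counts above follow from a simple formula. A minimal sketch, assuming the 151,329 frames are split evenly across the 7 recordings (the ADR gives only approximate per-recording sizes); `window_count` is a hypothetical helper name.

```rust
/// Number of full windows a recording of `n_frames` frames yields for a
/// given window length and stride.
pub fn window_count(n_frames: usize, window: usize, stride: usize) -> usize {
    if n_frames < window || stride == 0 {
        return 0;
    }
    (n_frames - window) / stride + 1
}
```

Under the even-split assumption, each recording has 151,329 / 7 ≈ 21,618 frames, so `window_count(21_618, 20, 5)` gives 4,320 windows per recording and 7 × 4,320 = 30,240 in total, in line with the ~30k figure.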

D2 — Manual backprop, same recipe as MLP

Same SGD + momentum 0.9 + weight decay 1e-4 + cosine LR decay. Base LR lowered to 0.03 (vs MLP's 0.05) because the network is bigger. 25 epochs. He initialisation, ReLU activation, softmax output, cross-entropy loss.
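The optimiser pieces named above can be sketched in isolation. This is an illustration under assumptions, not the ADR's training loop: it assumes the cosine schedule decays towards zero over the full epoch count and that weight decay is applied as an L2 term added to the gradient. The function names are hypothetical.

```rust
use std::f64::consts::PI;

/// Cosine learning-rate decay: `base_lr` at epoch 0, approaching 0 at `total_epochs`.
pub fn cosine_lr(base_lr: f64, epoch: usize, total_epochs: usize) -> f64 {
    let progress = epoch as f64 / total_epochs as f64;
    0.5 * base_lr * (1.0 + (PI * progress).cos())
}

/// One SGD step with momentum and L2 weight decay, applied in place:
/// v <- momentum * v - lr * (grad + weight_decay * w);  w <- w + v
pub fn sgd_momentum_step(
    weights: &mut [f64],
    velocity: &mut [f64],
    grads: &[f64],
    lr: f64,
    momentum: f64,
    weight_decay: f64,
) {
    for ((w, v), g) in weights.iter_mut().zip(velocity.iter_mut()).zip(grads) {
        *v = momentum * *v - lr * (g + weight_decay * *w);
        *w += *v;
    }
}

/// He initialisation scale: standard deviation sqrt(2 / fan_in).
pub fn he_std(fan_in: usize) -> f64 {
    (2.0 / fan_in as f64).sqrt()
}
```

With the values from this decision, each epoch would call `cosine_lr(0.03, epoch, 25)` and pass the result to `sgd_momentum_step` with momentum 0.9 and weight decay 1e-4; the first-layer weights would be drawn with standard deviation `he_std(440)`.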

D3 — AdaptiveModel carries all three classifiers, classify routes by availability

pub struct AdaptiveModel {
    pub weights: Vec<Vec<f64>>,     // ADR-118 legacy LogReg
    pub mlp: MlpModel,              // ADR-119 frame-level MLP
    pub windowed_mlp: WindowedMlpModel,  // ADR-120 (this) — primary
    // ...
}

classify_window() (new API) uses windowed_mlp when it is trained and the caller supplies a full 20-frame window; with insufficient history it falls through to the frame-level MLP. Old JSON model files still load: absent fields are filled with MlpModel::default() and WindowedMlpModel::default(), so the format stays backward compatible.
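The routing rule can be sketched as a small decision function. The `Route` enum and `pick_route` are hypothetical names, and the final fallback to LogReg when neither MLP is trained is an assumption; the ADR only says that classification routes by availability.

```rust
pub const WINDOWED_INPUT: usize = 440;

/// Which classifier produces the prediction.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Route {
    WindowedMlp,
    FrameMlp,
    LogReg,
}

/// Pick a classifier by availability: the windowed MLP when it is trained and
/// a full window is supplied, then the frame-level MLP, then (assumed) LogReg.
pub fn pick_route(windowed_trained: bool, mlp_trained: bool, input_len: usize) -> Route {
    if windowed_trained && input_len == WINDOWED_INPUT {
        Route::WindowedMlp
    } else if mlp_trained {
        Route::FrameMlp
    } else {
        Route::LogReg
    }
}
```

Keeping the decision separate from the forward passes makes the cold-start behaviour easy to unit-test without loading a model.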

D4 — Rolling buffer in AppStateInner, pushed per tick

struct AppStateInner {
    feature_window: VecDeque<[f64; N_FEATURES]>,  // capacity = WINDOW_FRAMES
    // ...
}

New helper push_feature_window(&mut s, &features) computes the 22-d feature vector from current per-node amps, pushes to the back of the buffer, evicts oldest when over capacity. Called at all three tick sites where adaptive_override runs:

  • main.rs:~3030 — multi-BSSID tick handler
  • main.rs:~3225 — WiFi fallback tick handler
  • main.rs:~6510 — per-node loop in the broadcast tick task

adaptive_override (read-only over state) builds the 440-d input by copying the buffer's last 19 entries + the current frame's features, then calls model.classify_window(&flat). Cold-start (buffer < 20) falls back to model.classify(&feat_arr) — frame-level MLP.
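The buffer handling can be sketched with a plain VecDeque. This follows the description literally (last 19 buffered entries plus the current frame, cold start below 20 entries) and does not model the call order in main.rs. The signatures are simplified assumptions: the real push_feature_window takes the app state, and `flatten_window` is a hypothetical helper.

```rust
use std::collections::VecDeque;

pub const N_FEATURES: usize = 22;
pub const WINDOW_FRAMES: usize = 20;

/// Append the newest feature vector and evict the oldest entries once the
/// buffer holds more than WINDOW_FRAMES frames.
pub fn push_feature_window(
    buf: &mut VecDeque<[f64; N_FEATURES]>,
    features: &[f64; N_FEATURES],
) {
    buf.push_back(*features);
    while buf.len() > WINDOW_FRAMES {
        buf.pop_front();
    }
}

/// Flatten the last WINDOW_FRAMES - 1 buffered frames plus the current frame
/// into one 440-d input. Returns None during cold start (buffer not yet full).
pub fn flatten_window(
    buf: &VecDeque<[f64; N_FEATURES]>,
    current: &[f64; N_FEATURES],
) -> Option<Vec<f64>> {
    if buf.len() < WINDOW_FRAMES {
        return None;
    }
    let history = WINDOW_FRAMES - 1;
    let mut flat = Vec::with_capacity(WINDOW_FRAMES * N_FEATURES);
    for frame in buf.iter().skip(buf.len() - history) {
        flat.extend_from_slice(frame);
    }
    flat.extend_from_slice(current);
    Some(flat)
}
```

A caller would treat `None` as the signal to use the frame-level path and pass `Some(flat)` to the windowed classifier.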

Verified Acceptance

Retrained on the same 6-node, 151,329-frame set used since ADR-118 (all accuracies below are measured on the training data):

LogReg:    49.58%
MLP:       53.53%   (+3.95 vs LogReg)
W-MLP:     90.40%   (+36.87 vs MLP)

Per-class (frame-level MLP → W-MLP):

absent          41% → 100%   +59
present_still   99% → 100%   +1   (already saturated)
transition      36% →  86%   +50  (sit-stand cadence captured)
active          30% →  74%   +44  (jumping cadence captured)
waving          38% →  90%   +52  (gesture cadence captured)
present_moving  33% →  82%   +49  (walking step cadence captured)

Loss curve confirms breakout from the frame-level plateau:

MLP:     epoch 0: 1.28 → epoch 29: 1.14   (flat plateau)
W-MLP:   epoch 0: 1.01 → epoch 24: 0.25   (still trending down)

Total cumulative improvement vs the start-of-session 2-node 15-feature LogReg baseline:

40.4% → 90.40% = +50.0 percentage points

Caveat — training vs generalization

90.40% is training accuracy. The W-MLP has ~28,600 parameters trained on ~30,200 windowed samples, so model capacity is comparable to dataset size and some overfitting is expected. True generalization performance will only be measurable once an independent test set is captured.
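The capacity-versus-data comparison can be checked from the layer sizes. A minimal sketch, assuming 6 output classes (the per-class table lists six) and counting biases along with weights; `mlp_param_count` is a hypothetical helper.

```rust
/// Trainable parameters in a one-hidden-layer MLP: both weight matrices plus both bias vectors.
pub fn mlp_param_count(input: usize, hidden: usize, n_classes: usize) -> usize {
    input * hidden + hidden + hidden * n_classes + n_classes
}
```

Under these assumptions `mlp_param_count(440, 64, 6)` is 28,614 and `mlp_param_count(22, 32, 6)` is 934, so the windowed network has roughly one parameter per training window.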

Mitigations already in place:

  • Weight decay 1e-4 regularises against memorisation
  • Cosine LR decay with smooth annealing
  • Stride 5 in window construction reduces near-duplicate samples
  • Architecture stays small (one hidden layer) — limits overfit capacity

Recommended follow-up: record a 60-second held-out session per class (separate from training), evaluate W-MLP cold, compare to training accuracy. Expected drop: 5-15 pts for a healthy model.
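The held-out comparison needs overall and per-class accuracy computed the same way on both sets. A minimal sketch of such a helper follows; `accuracy_report` is a hypothetical name and is not the ADR's eval_windowed_mlp. It assumes every label is below `n_classes` and panics on an out-of-range label.

```rust
/// Overall accuracy and per-class accuracy from predicted and true class indices.
/// A class with no samples reports None instead of a misleading 0.0.
pub fn accuracy_report(
    preds: &[usize],
    labels: &[usize],
    n_classes: usize,
) -> (f64, Vec<Option<f64>>) {
    assert_eq!(preds.len(), labels.len(), "preds and labels must align");
    let mut correct = vec![0usize; n_classes];
    let mut total = vec![0usize; n_classes];
    for (&p, &y) in preds.iter().zip(labels) {
        total[y] += 1;
        if p == y {
            correct[y] += 1;
        }
    }
    let n: usize = total.iter().sum();
    let overall = if n == 0 {
        0.0
    } else {
        correct.iter().sum::<usize>() as f64 / n as f64
    };
    let per_class = correct
        .iter()
        .zip(&total)
        .map(|(&c, &t)| if t == 0 { None } else { Some(c as f64 / t as f64) })
        .collect();
    (overall, per_class)
}
```

Running it once on training windows and once on the held-out session gives the two numbers whose difference is the generalization gap discussed above.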

Files Touched

v2/crates/wifi-densepose-sensing-server/src/adaptive_classifier.rs:
  + const WINDOW_FRAMES = 20, WINDOWED_INPUT = 440, WINDOWED_HIDDEN = 64
  + pub const N_FEATURES_PUB (for external buffer sizing)
  + pub struct WindowedMlpModel { w1, b1, w2, b2, n_classes }
  + impl WindowedMlpModel::{is_trained, forward}
  + AdaptiveModel.windowed_mlp field (serde-default)
  + AdaptiveModel::classify_window method
  + train_from_recordings builds recording_groups, slides windows,
    calls train_windowed_mlp_classifier
  + train_windowed_mlp_classifier (~150 LoC manual backprop)
  + eval_windowed_mlp helper
  + #[derive(Clone)] on Sample (for recording_groups Vec)
v2/crates/wifi-densepose-sensing-server/src/main.rs:
  + AppStateInner.feature_window: VecDeque<[f64; N_FEATURES_PUB]>
  + push_feature_window helper
  + adaptive_override switches to classify_window when buffer is full
  + 3 tick sites call push_feature_window before adaptive_override
docs/adr/ADR-120-windowed-temporal-classifier.md  (this)

Out of Scope / Follow-ups

  • Held-out test set — record fresh data, evaluate cold to confirm 90% isn't memorisation.
  • TCN instead of stacked-MLP — 1D conv over time would use weights more efficiently (~5k vs 28k). Worth pursuing if dataset scales 10×.
  • Output smoothing — shipped via two-layer mode+confirm filter on the adaptive output, see ADR-120 follow-up commits.
  • Split sitting/standing — currently merged into present_still; separating them would test whether the temporal RF signatures differ.
  • Class imbalance — present_still has 2× windows; oversampling minority classes might lift accuracy 1-2 pts.
  • Window size experiments — 20 frames is a reasonable first guess; 10 (faster) or 30 (more context) untested.

References

  • ADR-118 — feature decorrelation + multi-node (22-feature basis)
  • ADR-119 — frame-level MLP (sibling classifier, fallback at cold start)
  • ADR-101 — raw amplitude classifier (the path that calls AdaptiveModel via adaptive_override)
  • ADR-105 — no synthetic data in production runtime; this ADR's confidence output is real model softmax probability, not a hardcoded value