# ADR-081: Gesture-Controlled Data Visualization
- **Status**: Proposed
- **Date**: 2026-04-07
- **Deciders**: ruv
- **Relates to**: ADR-079 (Camera Ground-Truth Training), ADR-029 (RuvSense Gesture Recognition), ADR-072 (WiFlow Architecture), ADR-076 (CNN Spectrogram Embeddings)
## Context
RuView can now track 17 COCO keypoints at 92.9% PCK@20 (ADR-079) and detect gestures
via DTW template matching (ADR-029). These capabilities exist independently — pose
estimation produces skeleton coordinates, and the UI displays static charts. There is no
system that connects hand/arm movements to interactive data exploration.
Gesture-controlled visualization would let users manipulate charts and graphs by waving
their hands in front of the ESP32 sensing zone — no mouse, no touchscreen, no wearable.
This is particularly valuable for:
- **Lab/cleanroom** — gloved hands can't use touchscreens
- **Kitchen/workshop** — dirty or wet hands
- **Presentations** — stand back and gesture at projected dashboards
- **Accessibility** — motor impairments that make mouse use difficult
- **Digital signage** — public displays without touch hardware
### Why Camera + CSI Fusion
Camera alone can do gesture control (e.g., Leap Motion, MediaPipe Hands). CSI alone can
detect coarse gestures (ADR-029). The fusion provides:
| Modality | Strengths | Weaknesses |
|----------|-----------|-----------|
| Camera (MediaPipe Hands) | 21 hand landmarks, finger-level precision, 30fps | Requires line of sight, lighting dependent, privacy concern |
| CSI (ESP32) | Through-wall, works in dark, privacy-preserving, $9 | Coarse spatial resolution, no finger tracking |
| **Fusion** | **Finger precision near camera + coarse tracking everywhere** | Requires both sensors during training |
The fusion model trains on camera + CSI pairs (like ADR-079), then deploys in two modes:
1. **Camera-assisted** — full precision when camera is available
2. **CSI-only** — reduced but functional gesture control without camera
## Decision
Build a gesture-to-visualization control system that maps hand/arm movements to chart
interactions using fused camera + CSI input.
### Gesture Vocabulary
#### Navigation Gestures (arm-level, CSI-detectable)
| Gesture | Motion | Chart Action | CSI Feasibility |
|---------|--------|-------------|-----------------|
| **Swipe left** | Open hand sweeps left | Pan chart left / previous dataset | High — clear directional motion |
| **Swipe right** | Open hand sweeps right | Pan chart right / next dataset | High |
| **Swipe up** | Open hand sweeps up | Scroll up / zoom out | High |
| **Swipe down** | Open hand sweeps down | Scroll down / zoom in | High |
| **Push forward** | Palm pushes toward screen | Select / drill into data point | Medium — depth motion harder |
| **Pull back** | Hand pulls away from screen | Back / zoom out | Medium |
| **Circular CW** | Hand circles clockwise | Increase value / rotate view | Medium — temporal pattern |
| **Circular CCW** | Hand circles counter-clockwise | Decrease value / rotate back | Medium |
| **Hold still** | Hand stationary 2+ seconds | Hover / show tooltip | High — absence of motion |
| **Both hands apart** | Arms spread outward | Expand / zoom into selection | High — bilateral motion |
| **Both hands together** | Arms move inward | Collapse / zoom out | High |
#### Precision Gestures (finger-level, camera-required)
| Gesture | Motion | Chart Action | Sensor |
|---------|--------|-------------|--------|
| **Pinch zoom** | Thumb + index spread/close | Continuous zoom | Camera only |
| **Point** | Index finger extended | Cursor position on chart | Camera only |
| **Grab** | Close fist | Grab and drag data point | Camera only |
| **Thumb up** | Thumbs up | Confirm / approve | Camera only |
| **Thumb down** | Thumbs down | Reject / undo | Camera only |
| **Two-finger rotate** | Two fingers twist | Rotate 3D visualization | Camera only |
| **Finger slider** | Index finger moves along axis | Adjust parameter value | Camera only |
### Architecture
```
┌──────────────────────────────────────────────────────────────────┐
│  Input Layer                                                     │
│                                                                  │
│  ESP32 CSI (UDP 5005) ──→ CSI Gesture Detector (DTW + WiFlow)    │
│                              ↓                                   │
│  Webcam (MediaPipe Hands) ──→ Hand Landmark Tracker (21 joints)  │
│                              ↓                                   │
│                    Gesture Fusion Engine                         │
│                    ├── CSI coarse: swipe/circle/hold             │
│                    ├── Camera fine: pinch/point/grab             │
│                    └── Confidence weighting by modality          │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│  Gesture Interpreter                                             │
│                                                                  │
│  Raw gestures ──→ State Machine ──→ Chart Commands               │
│                                                                  │
│  States:                                                         │
│    IDLE ──(motion detected)──→ TRACKING                          │
│    TRACKING ──(gesture matched)──→ ACTING                        │
│    ACTING ──(gesture complete)──→ COOLDOWN                       │
│    COOLDOWN ──(500ms)──→ IDLE                                    │
│                                                                  │
│  Debounce: 200ms minimum gesture duration                        │
│  Cooldown: 500ms between consecutive gestures                    │
│  Confidence threshold: 0.7 for CSI, 0.9 for camera               │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│  Visualization Controller                                        │
│                                                                  │
│  Chart Commands ──→ WebSocket ──→ UI                             │
│                                                                  │
│  Commands:                                                       │
│    { type: "pan", dx: -0.1, dy: 0 }                              │
│    { type: "zoom", factor: 1.2, center: [0.5, 0.5] }             │
│    { type: "select", x: 0.45, y: 0.62 }                          │
│    { type: "rotate", angle: 15 }                                 │
│    { type: "slider", axis: "x", value: 0.73 }                    │
│    { type: "hover", x: 0.45, y: 0.62 }                           │
│    { type: "back" }                                              │
│    { type: "confirm" }                                           │
│    { type: "reject" }                                            │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│  Visualization UI                                                │
│                                                                  │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐               │
│  │ Line Chart  │  │ Bar Chart   │  │ 3D Scatter  │               │
│  │ (time       │  │ (category   │  │ (spatial    │               │
│  │ series)     │  │ compare)    │  │ data)       │               │
│  └─────────────┘  └─────────────┘  └─────────────┘               │
│                                                                  │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐               │
│  │ Heatmap     │  │ Gauge       │  │ Spectrogram │               │
│  │ (CSI grid)  │  │ (vitals)    │  │ (frequency) │               │
│  └─────────────┘  └─────────────┘  └─────────────┘               │
│                                                                  │
│  Visual feedback: gesture cursor overlay + action indicator      │
│  Framework: D3.js / Observable Plot in existing UI               │
└──────────────────────────────────────────────────────────────────┘
```
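The Gesture Interpreter states in the diagram can be sketched as a small state machine. This is a minimal illustration, not the planned `gesture-server.js` code: the class name, the `update` method, and the shape of the observation object (`t`, `motion`, `gesture`) are assumptions made for the example. Only the state names, the 200ms debounce, and the 500ms cooldown come from the diagram.

```javascript
// Illustrative sketch only: class, field, and observation names are assumptions.
const DEBOUNCE_MS = 200; // minimum gesture duration before acting
const COOLDOWN_MS = 500; // pause between consecutive gestures

class GestureStateMachine {
  constructor() {
    this.state = 'IDLE';
    this.since = 0;          // time of the last state change (ms)
    this.candidate = null;   // gesture currently being tracked
    this.candidateSince = 0; // time the candidate was first seen (ms)
  }

  // obs = { t: timestamp in ms, motion: boolean, gesture: string | null }
  // Returns the gesture name at the moment an action fires, otherwise null.
  update(obs) {
    const enter = (next) => { this.state = next; this.since = obs.t; };
    switch (this.state) {
      case 'IDLE':
        if (obs.motion) {
          this.candidate = null;
          enter('TRACKING');
        }
        return null;
      case 'TRACKING':
        if (!obs.motion) { enter('IDLE'); return null; }
        if (obs.gesture !== this.candidate) {
          this.candidate = obs.gesture;
          this.candidateSince = obs.t;
        }
        if (this.candidate && obs.t - this.candidateSince >= DEBOUNCE_MS) {
          enter('ACTING');
          return this.candidate;
        }
        return null;
      case 'ACTING':
        // The gesture is complete once the detector stops reporting it.
        if (obs.gesture !== this.candidate) enter('COOLDOWN');
        return null;
      case 'COOLDOWN':
        if (obs.t - this.since >= COOLDOWN_MS) enter('IDLE');
        return null;
      default:
        return null;
    }
  }
}
```

Keeping the debounce inside `TRACKING` means a gesture label has to persist for the full 200ms before any chart command is emitted, which is the mechanism the ADR relies on to suppress incidental movement.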
### Gesture Detection Pipeline
#### CSI Gesture Detection (arm-level)
Extends the existing DTW gesture classifier (ADR-029) with WiFlow pose input:
```
CSI [35, 20] ──→ WiFlow lite ──→ 17 keypoints ──→ Extract arm features:
  - Wrist velocity (dx/dt, dy/dt)
  - Elbow angle (shoulder-elbow-wrist)
  - Bilateral symmetry (left vs right)
  - Motion energy (frame differencing)
  ↓
DTW template matching:
  - 11 gesture templates
  - Sliding window (1s)
  - Top match + confidence
```
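The DTW matching step above can be sketched as follows. This is a textbook dynamic time warping distance over a single 1D feature (for example, wrist x position over the 1s window), not the ADR-029 implementation. The function names, the template object shape, and the mapping from distance to a 0-1 confidence are assumptions for illustration.

```javascript
// Classic O(n*m) dynamic time warping distance between two 1D sequences.
function dtwDistance(a, b) {
  const n = a.length;
  const m = b.length;
  const D = Array.from({ length: n + 1 }, () => new Array(m + 1).fill(Infinity));
  D[0][0] = 0;
  for (let i = 1; i <= n; i++) {
    for (let j = 1; j <= m; j++) {
      const cost = Math.abs(a[i - 1] - b[j - 1]);
      D[i][j] = cost + Math.min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1]);
    }
  }
  return D[n][m];
}

// templates: { gestureName: number[] } (hypothetical shape).
// Returns the closest template, or null when no templates are given.
function matchTemplate(window, templates) {
  let best = null;
  for (const [gesture, tpl] of Object.entries(templates)) {
    // Normalize by combined length so long templates are not penalized.
    const distance = dtwDistance(window, tpl) / (window.length + tpl.length);
    if (best === null || distance < best.distance) best = { gesture, distance };
  }
  if (best === null) return null;
  // Assumed mapping: distance 0 gives confidence 1, larger distances decay toward 0.
  return { ...best, confidence: 1 / (1 + best.distance) };
}
```

Because the warping path may repeat samples, a slow swipe and a fast swipe with the same shape produce a distance near zero, which is why a single template per gesture can cover a range of speeds.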
#### Camera Gesture Detection (finger-level)
Uses MediaPipe Hands (21 landmarks per hand, 30fps):
```
Webcam ──→ MediaPipe Hands ──→ 21 landmarks × 2 hands ──→ Extract:
  - Finger states (extended/curled)
  - Pinch distance (thumb-index)
  - Grab state (all fingers curled)
  - Point direction (index ray)
  - Hand center velocity
  ↓
Rule-based classifier:
  - Pinch: thumb-index < 0.05
  - Point: only index extended
  - Grab: all fingers curled
  - Thumbs up/down: thumb angle
```
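A minimal sketch of the rule-based classifier, assuming landmarks arrive as an array of 21 `{x, y}` points in normalized image coordinates using the standard MediaPipe Hands indexing (0 = wrist, 4 = thumb tip, 8 = index tip, and so on). The "tip farther from the wrist than the PIP joint" test for an extended finger is a common heuristic chosen for this example, not something the ADR specifies, and the `'open'` label and function names are hypothetical. Thumbs up/down is left out because the thumb-angle rule is not defined in enough detail here.

```javascript
// MediaPipe Hands landmark indices for fingertips and PIP joints.
const TIP = { thumb: 4, index: 8, middle: 12, ring: 16, pinky: 20 };
const PIP = { index: 6, middle: 10, ring: 14, pinky: 18 };
const WRIST = 0;
const PINCH_THRESHOLD = 0.05; // thumb-index distance, assumed to be in normalized units

const dist = (p, q) => Math.hypot(p.x - q.x, p.y - q.y);

// Heuristic: a finger counts as extended when its tip is farther from the wrist than its PIP joint.
function isExtended(lm, finger) {
  return dist(lm[TIP[finger]], lm[WRIST]) > dist(lm[PIP[finger]], lm[WRIST]);
}

// lm: array of 21 {x, y} landmarks for one hand.
function classifyHand(lm) {
  const extended = Object.keys(PIP).filter((f) => isExtended(lm, f));
  if (dist(lm[TIP.thumb], lm[TIP.index]) < PINCH_THRESHOLD) return 'pinch_zoom';
  if (extended.length === 1 && extended[0] === 'index') return 'point';
  if (extended.length === 0) return 'grab';
  return 'open'; // hypothetical label for "no precision gesture"
}
```

The pinch rule is checked first because a pinch usually leaves the index finger partly extended and would otherwise be reported as a point.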
#### Fusion Strategy
```
CSI confidence ──┐
                 ├──→ Weighted fusion ──→ Final gesture + confidence
Camera conf    ──┘

Rules:
  - If both agree: confidence = min(1.0, max(csi_conf, cam_conf) + 0.1 * min(csi_conf, cam_conf))
  - If only CSI: use CSI gesture, confidence *= 0.8
  - If only camera: use camera gesture, confidence *= 0.95
  - If conflict: prefer camera for fine gestures, CSI for coarse gestures
  - Minimum confidence for action: 0.6
```
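The fusion rules can be sketched as a single function. The function name, the `{ gesture, confidence }` input shape, and the choice of `source` values are assumptions. The conflict rule is interpreted here as "take the camera's gesture when it is a fine (camera-only) gesture, otherwise take the CSI gesture", which is one reading of the rule above rather than a confirmed design. The result is capped at 1.0 so it stays inside the 0-1 confidence range.

```javascript
// Coarse (arm-level) gestures from the navigation vocabulary; everything else is treated as fine.
const COARSE = new Set([
  'swipe_left', 'swipe_right', 'swipe_up', 'swipe_down', 'push', 'pull',
  'circle_cw', 'circle_ccw', 'hold', 'spread', 'contract',
]);
const MIN_CONFIDENCE = 0.6;

// csi and cam are { gesture, confidence } or null when that modality saw nothing.
function fuseGestures(csi, cam) {
  let out = null;
  if (csi && cam) {
    if (csi.gesture === cam.gesture) {
      const boosted = Math.max(csi.confidence, cam.confidence)
        + 0.1 * Math.min(csi.confidence, cam.confidence);
      out = { gesture: csi.gesture, confidence: Math.min(1, boosted), source: 'fusion' };
    } else if (!COARSE.has(cam.gesture)) {
      out = { ...cam, source: 'camera' }; // conflict: camera wins for fine gestures
    } else {
      out = { ...csi, source: 'csi' };    // conflict: CSI wins for coarse gestures
    }
  } else if (csi) {
    out = { gesture: csi.gesture, confidence: csi.confidence * 0.8, source: 'csi' };
  } else if (cam) {
    out = { gesture: cam.gesture, confidence: cam.confidence * 0.95, source: 'camera' };
  }
  return out && out.confidence >= MIN_CONFIDENCE ? out : null;
}
```

One consequence worth noting: with the 0.8 factor, a CSI-only detection needs a raw confidence of at least 0.75 to clear the 0.6 action threshold.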
### Chart Interaction Mapping
#### Line Chart (Time Series)
| Gesture | Action | Parameters |
|---------|--------|-----------|
| Swipe left/right | Pan time axis | dx proportional to swipe speed |
| Pinch zoom | Zoom time axis | Continuous, centered on hand position |
| Both hands apart/together | Zoom (CSI-only alternative) | Binary zoom in/out |
| Point | Show tooltip at nearest data point | x from index finger position |
| Hold still | Sticky tooltip | Duration-based activation |
| Swipe up/down | Switch dataset / Y-axis scale | Discrete steps |
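
As an illustration of the line chart mapping, the sketch below turns a gesture event into one of the chart commands listed in the Visualization Controller. The function name, the `PAN_GAIN` and `ZOOM_STEP` constants, and the use of `param` as a zoom factor for pinch are assumptions. Dataset switching on swipe up/down is omitted because the command list does not define a message for it.

```javascript
const PAN_GAIN = 0.1;  // assumed: fraction of the time axis panned per unit of swipe speed
const ZOOM_STEP = 1.2; // assumed: binary zoom factor for the CSI-only spread/contract gestures

// ev follows the GestureEvent shape: { gesture, position?, velocity?, param? }.
// Returns a chart command object, or null when the gesture has no line chart action.
function lineChartCommand(ev) {
  const center = ev.position ?? [0.5, 0.5];
  switch (ev.gesture) {
    case 'swipe_left':
    case 'swipe_right': {
      const speed = ev.velocity ? Math.abs(ev.velocity[0]) : 1;
      const direction = ev.gesture === 'swipe_left' ? -1 : 1;
      return { type: 'pan', dx: direction * PAN_GAIN * speed, dy: 0 };
    }
    case 'pinch_zoom':
      return { type: 'zoom', factor: ev.param ?? 1, center };
    case 'spread':
      return { type: 'zoom', factor: ZOOM_STEP, center };
    case 'contract':
      return { type: 'zoom', factor: 1 / ZOOM_STEP, center };
    case 'point':
    case 'hold':
      return ev.position ? { type: 'hover', x: ev.position[0], y: ev.position[1] } : null;
    default:
      return null;
  }
}
```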
#### Bar Chart (Category Comparison)
| Gesture | Action | Parameters |
|---------|--------|-----------|
| Swipe left/right | Navigate categories | One category per swipe |
| Point | Highlight bar | Nearest bar to finger X position |
| Push forward | Select bar for drill-down | Depth gesture |
| Grab + drag | Reorder bars | Camera-only |
| Circular | Sort ascending/descending | Direction determines order |
#### 3D Scatter Plot
| Gesture | Action | Parameters |
|---------|--------|-----------|
| Swipe left/right | Rotate Y axis | Angle proportional to speed |
| Swipe up/down | Rotate X axis | Angle proportional to speed |
| Two-finger rotate | Rotate Z axis | Camera-only |
| Pinch zoom | Zoom | Camera-only |
| Both hands apart | Zoom in (CSI alternative) | Binary |
| Point | Highlight nearest point | Ray-cast from finger direction |
#### Heatmap (CSI Grid)
| Gesture | Action | Parameters |
|---------|--------|-----------|
| Swipe | Pan view | dx, dy |
| Pinch | Zoom region | Center + scale |
| Hold | Show cell value | Position-based |
| Circular | Adjust color scale range | CW = expand, CCW = contract |
#### Gauge (Vital Signs)
| Gesture | Action | Parameters |
|---------|--------|-----------|
| Swipe left/right | Switch vital (HR → BR → SpO2) | Discrete |
| Circular CW | Set high alert threshold | Continuous |
| Circular CCW | Set low alert threshold | Continuous |
| Thumb up | Acknowledge alert | Binary |
### Visual Feedback: AR Camera Overlay
The primary view is the **live camera feed with AR overlays** — the person is visible
with charts, skeleton, and data rendered on top. This creates a "Minority Report" style
interface where you see yourself manipulating data in real-time.
```
┌──────────────────────────────────────────────────────────────┐
│ │
│ ╔══════════════════════════════════════════════════════════╗ │
│ ║ ║ │
│ ║ [Live Camera Feed — person visible] ║ │
│ ║ ║ │
│ ║ ╭─────╮ ║ │
│ ║ │ │ ← skeleton overlay (17 keypoints) ║ │
│ ║ ╰──┬──╯ ║ │
│ ║ ╲ ║ │
│ ║ ╲ ┌──────────────────────┐ ║ │
│ ║ │ │ │ CSI Amplitude Chart │ ║ │
│ ║ │ 🖐→ │ │ ┌─╮ ╭─╮ ╭──╮ │ ║ │
│ ║ │ │ │ │ ╰─╯ ╰───╯ │ │ ║ │
│ ║ ╲ │ │ │ │ ║ │
│ ║ ╲ └──────────────────────┘ ║ │
│ ║ │ │ ↑ chart follows hand position ║ │
│ ║ ╲ ║ │
│ ║ ╲ ║ │
│ ║ ║ │
│ ╚══════════════════════════════════════════════════════════╝ │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ LOWER THIRD │ │
│ │ ┌────┐ │ │
│ │ │ pi │ RuView Sensing HR: 72 BPM BR: 16 BPM │ │
│ │ │ │ v0.7.0 Presence: 1 Motion: 0.23 │ │
│ │ └────┘ │ │
│ │ [logo] [gesture: Swipe Right] [CSI ●] [CAM ●] [28fps]│ │
│ └──────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
```
#### AR Overlay Layers (bottom to top)
| Layer | Content | Opacity | Update Rate |
|-------|---------|---------|-------------|
| 0 | Live camera feed (full frame) | 100% | 30fps |
| 1 | Skeleton overlay (17 keypoints + bones) | 70% | 30fps |
| 2 | Gesture cursor (hand position + state) | 90% | 30fps |
| 3 | Floating chart (anchored to hand/body region) | 85% | 30fps |
| 4 | Data labels + tooltips | 95% | On gesture |
| 5 | Lower third (RuView branding + vitals + status) | 95% | 1fps |
#### Floating Chart Placement
Charts are **anchored to the person's body** and follow movement:
```
Placement rules:
- Default: chart floats to the right of the person's dominant hand
- If hand moves left: chart slides to left side
- Chart stays within frame bounds (never clips off-screen)
- Multiple charts: stack vertically with 10% gap
- Inactive charts: shrink to thumbnail and anchor near shoulder
Chart anchor point = wrist_position + offset(0.15, -0.1) // right and slightly above hand
Chart size: 30% of frame width × 20% of frame height
```
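The anchor and clamping rules can be sketched as below, working in normalized frame coordinates with the origin at the top-left. Treating the anchor point as the chart's top-left corner is an assumption, as are the function and constant names. The offset and chart size are the values given above.

```javascript
const CHART_SIZE = { w: 0.30, h: 0.20 };  // 30% of frame width, 20% of frame height
const ANCHOR_OFFSET = { x: 0.15, y: -0.10 }; // right of and slightly above the hand

const clamp = (v, lo, hi) => Math.min(hi, Math.max(lo, v));

// wrist: [x, y] in normalized [0, 1] frame coordinates.
// Returns the chart rectangle, shifted as needed so it never leaves the frame.
function placeChart(wrist) {
  const x = clamp(wrist[0] + ANCHOR_OFFSET.x, 0, 1 - CHART_SIZE.w);
  const y = clamp(wrist[1] + ANCHOR_OFFSET.y, 0, 1 - CHART_SIZE.h);
  return { x, y, w: CHART_SIZE.w, h: CHART_SIZE.h };
}
```

For a wrist at `[0.9, 0.05]` the unclamped anchor would be `[1.05, -0.05]`, so the chart is pushed back to the top-right corner of the frame instead of clipping.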
#### Lower Third Design
The lower third bar provides persistent status in broadcast-style framing:
```
┌──────────────────────────────────────────────────────────────┐
│ ┌──────┐ │
│ │ pi │ RuView Sensing v0.7.0 │
│ │ │ ────────────────────────────────────────────── │
│ │ logo │ HR: 72 BPM | BR: 16 BPM | Persons: 1 │
│ └──────┘ Motion: Low | Gesture: Swipe Right | 28fps │
│ [CSI ●] [CAM ●] [FUSE] PCK@20: 92.9% │
└──────────────────────────────────────────────────────────────┘
Design:
- Background: semi-transparent dark (#1a1a2e, 80% opacity)
- Logo: RuView "pi" icon (32x32px), left-aligned
- Text: white (#ffffff) primary, gray (#a0a0a0) secondary
- Accent: teal (#00d4aa) for active indicators
- Height: 15% of frame
- Font: system monospace for data, sans-serif for labels
- Divider: thin teal line separating logo from data
```
#### RuView Logo Placement
```
The "pi" logo appears in two contexts:
1. Lower third (persistent):
- Position: bottom-left corner, 12px padding
- Size: 32x32px
- Style: white outline on dark background
- Always visible during gesture mode
2. Watermark (optional):
- Position: top-right corner, 8px padding
- Size: 24x24px, 30% opacity
- Style: subtle, doesn't interfere with data
```
#### Skeleton Rendering Style
```
Keypoint rendering:
- Detected joints: teal circles (#00d4aa), radius 6px
- Low-confidence joints: gray circles (#666), radius 4px
- Active hand (gesturing): yellow highlight (#ffcc00), radius 8px, glow effect
Bone rendering:
- Normal bones: teal lines (#00d4aa), 2px stroke
- Active arm (gesturing): yellow lines (#ffcc00), 3px stroke, glow
- Torso: slightly thicker (3px) to anchor the skeleton visually
Style: dark-theme friendly, high contrast against camera feed
```
**Cursor types:**
- **Open hand** — teal ring around wrist, rays extending from fingers
- **Pointing** — teal ray from index finger toward chart
- **Grabbing** — yellow fist icon, chart border highlights
- **Pinching** — two teal dots (thumb + index) with distance line
- **Ghost cursor** — CSI-only mode: larger, more diffuse circle (no finger detail)
### Data Flow Protocol
WebSocket messages from gesture engine to UI:
```typescript
interface GestureEvent {
  type: 'gesture';
  gesture: 'swipe_left' | 'swipe_right' | 'swipe_up' | 'swipe_down'
         | 'pinch_zoom' | 'point' | 'grab' | 'hold' | 'circle_cw'
         | 'circle_ccw' | 'push' | 'pull' | 'spread' | 'contract'
         | 'thumb_up' | 'thumb_down';
  confidence: number;            // 0-1
  source: 'csi' | 'camera' | 'fusion';
  position?: [number, number];   // Normalized [0,1] hand position
  velocity?: [number, number];   // Hand velocity for proportional control
  param?: number;                // Gesture-specific parameter (pinch distance, rotation angle)
}

interface CursorEvent {
  type: 'cursor';
  x: number;                     // 0-1 normalized
  y: number;                     // 0-1 normalized
  state: 'tracking' | 'pointing' | 'grabbing' | 'pinching' | 'idle';
  hands: number;                 // 0, 1, or 2
}

interface StatusEvent {
  type: 'status';
  csi_active: boolean;
  camera_active: boolean;
  mode: 'fusion' | 'csi_only' | 'camera_only';
  fps: number;
  gesture_count: number;         // Total gestures detected this session
}
```
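On the UI side, incoming frames need to be parsed and checked before they reach a chart. The sketch below shows one way `gesture.service.js` could validate a raw WebSocket text frame against the interfaces above. The function name and the decision to drop invalid messages by returning `null` are assumptions.

```javascript
const GESTURES = new Set([
  'swipe_left', 'swipe_right', 'swipe_up', 'swipe_down', 'pinch_zoom', 'point',
  'grab', 'hold', 'circle_cw', 'circle_ccw', 'push', 'pull', 'spread', 'contract',
  'thumb_up', 'thumb_down',
]);
const SOURCES = new Set(['csi', 'camera', 'fusion']);

const inUnitRange = (v) => typeof v === 'number' && v >= 0 && v <= 1;

// raw: the text payload of one WebSocket message.
// Returns the parsed event when it matches one of the three interfaces, else null.
function parseGestureMessage(raw) {
  let msg;
  try {
    msg = JSON.parse(raw);
  } catch {
    return null; // not JSON
  }
  if (msg === null || typeof msg !== 'object') return null;
  switch (msg.type) {
    case 'gesture':
      return GESTURES.has(msg.gesture) && inUnitRange(msg.confidence) && SOURCES.has(msg.source)
        ? msg : null;
    case 'cursor':
      return inUnitRange(msg.x) && inUnitRange(msg.y) ? msg : null;
    case 'status':
      return typeof msg.csi_active === 'boolean' && typeof msg.camera_active === 'boolean'
        ? msg : null;
    default:
      return null;
  }
}
```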
### Training the CSI Gesture Model
Extends ADR-079's camera ground-truth pipeline:
```bash
# 1. Collect gesture training data (camera + CSI, 10 min)
# Perform each gesture 20+ times with natural variation
python scripts/collect-gesture-gt.py --duration 600 --gestures all --preview
# 2. Label gesture segments (auto-detected from camera)
node scripts/label-gestures.js \
--gt data/ground-truth/gestures-*.jsonl \
--csi data/recordings/csi-*.jsonl
# 3. Train gesture classifier
node scripts/train-gesture-model.js \
--data data/gestures/labeled-*.jsonl \
--scale lite
# 4. Deploy
# CSI-only mode: gestures detected from WiFlow keypoint motion
# Fusion mode: camera adds finger-level precision
```
**Training data:** ~20 examples per gesture × 11 gestures = 220 labeled samples.
With augmentation (time warp, amplitude noise), this grows to ~1,000 effective samples.
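
The two augmentations can be sketched on a single 1D feature sequence. Linear resampling stands in for the time warp and additive Gaussian noise for the amplitude noise; both are common choices assumed here, since the training script's actual augmentation is not specified. The function names and the injectable `rand` parameter are hypothetical.

```javascript
// Time warp: resample a sequence (length >= 2) to round(length * factor) points
// using linear interpolation. factor > 1 slows the gesture down, factor < 1 speeds it up.
function timeWarp(seq, factor) {
  const n = Math.max(2, Math.round(seq.length * factor));
  const out = new Array(n);
  for (let i = 0; i < n; i++) {
    const pos = (i * (seq.length - 1)) / (n - 1);
    const lo = Math.floor(pos);
    const hi = Math.min(seq.length - 1, lo + 1);
    const t = pos - lo;
    out[i] = seq[lo] * (1 - t) + seq[hi] * t;
  }
  return out;
}

// Amplitude noise: add zero-mean Gaussian noise (Box-Muller transform).
// rand can be replaced with a seeded generator for reproducible datasets.
function addNoise(seq, sigma, rand = Math.random) {
  return seq.map((v) => {
    const u1 = 1 - rand(); // in (0, 1], avoids log(0)
    const u2 = rand();
    return v + sigma * Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  });
}
```

Producing four or five warped and noised variants of each of the 220 recordings gives roughly the ~1,000 samples quoted above.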
### Optimization: ruvector-cnn Spectrogram Gesture Classification
Replace DTW template matching with a CNN operating on CSI spectrograms via the
`ruvector-cnn` WASM package (ADR-076). This treats each gesture as an image
classification problem on the CSI time-frequency representation.
#### Why CNN Over DTW
| | DTW (current, ADR-029) | CNN Spectrogram (proposed) |
|---|---|---|
| Input | 1D keypoint trajectories | 2D CSI spectrogram image |
| Features | Hand-crafted (wrist velocity, elbow angle) | Learned end-to-end |
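| Robustness | Tolerates speed variation by design (time warping), but sensitive to noisy or unrepresentative templates | Tolerates small shifts and distortions (pooling layers); speed robustness comes from augmented training data |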
| Multi-scale | Single scale | Hierarchical (dilated convolutions) |
| Training | Template recording + DTW distance | Supervised from camera labels |
| New gestures | Record new template | Retrain (or few-shot with embedding) |
| Accuracy | ~85% (literature figure, not a RuView measurement) | ~95%+ (literature figure for CNNs on spectrograms, not a RuView measurement) |
#### Pipeline
```
CSI [N_subcarriers, T=30] (1-second window)
  ↓
Spectrogram transform: STFT per subcarrier
    → [N_sub, F_bins, T_bins] ≈ [35, 16, 15]
  ↓
Reshape to grayscale image: [35×16, 15] = [560, 15]
    → Resize to [64, 64] (bilinear)
  ↓
ruvector-cnn CnnEmbedder (WASM-accelerated)
    → 128-dim gesture embedding
  ↓
Classifier head: Linear(128 → 18 gestures) + softmax
    → gesture_id + confidence
```
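The resize step of the pipeline can be sketched as a plain bilinear interpolation over a row-major array, followed by min-max scaling to 8-bit grayscale. The STFT itself is not shown. The function names and the choice of min-max scaling are assumptions; the `[560, 15]` input and `[64, 64]` output shapes come from the pipeline above.

```javascript
// Bilinear resize of a row-major [h, w] image to [outH, outW].
function bilinearResize(src, h, w, outH, outW) {
  const out = new Float32Array(outH * outW);
  for (let y = 0; y < outH; y++) {
    const fy = outH > 1 ? (y * (h - 1)) / (outH - 1) : 0;
    const y0 = Math.floor(fy);
    const y1 = Math.min(h - 1, y0 + 1);
    const ty = fy - y0;
    for (let x = 0; x < outW; x++) {
      const fx = outW > 1 ? (x * (w - 1)) / (outW - 1) : 0;
      const x0 = Math.floor(fx);
      const x1 = Math.min(w - 1, x0 + 1);
      const tx = fx - x0;
      const top = src[y0 * w + x0] * (1 - tx) + src[y0 * w + x1] * tx;
      const bottom = src[y1 * w + x0] * (1 - tx) + src[y1 * w + x1] * tx;
      out[y * outW + x] = top * (1 - ty) + bottom * ty;
    }
  }
  return out;
}

// Min-max scale to 0..255 so the result can be passed around as a Uint8Array image.
function toUint8(img) {
  let min = Infinity;
  let max = -Infinity;
  for (const v of img) { if (v < min) min = v; if (v > max) max = v; }
  const range = max - min;
  const out = new Uint8Array(img.length);
  if (range === 0) return out; // flat image: all zeros
  for (let i = 0; i < img.length; i++) out[i] = Math.round(((img[i] - min) / range) * 255);
  return out;
}
```

For the shapes in this ADR the call would be `toUint8(bilinearResize(stacked, 560, 15, 64, 64))`, where `stacked` is the flattened `[560, 15]` spectrogram.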
#### ruvector-cnn Integration
The `@ruvector/cnn` WASM package provides:
```javascript
const { init, CnnEmbedder, InfoNCELoss } = require('@ruvector/cnn');

// `await` is not allowed at the top level of a CommonJS module, so wrap the calls.
// csiToSpectrogram and classifyGesture are application helpers, not imported from @ruvector/cnn.
async function detectGesture(csiWindow, gestureTemplates) {
  await init();

  // Create embedder for 64x64 CSI spectrogram "images"
  const embedder = new CnnEmbedder({
    inputSize: 64,
    embeddingDim: 128,
    normalize: true,
  });

  // Extract embedding from CSI spectrogram
  const spectrogram = csiToSpectrogram(csiWindow); // [64, 64] Uint8Array
  const embedding = embedder.extract(spectrogram, 64, 64);

  // Classify gesture via nearest-neighbor to trained templates
  return classifyGesture(embedding, gestureTemplates);
}
```
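The `classifyGesture` helper called above is not defined in this ADR. One possible shape, assuming `gestureTemplates` is an array of `{ gesture, embedding }` objects and that cosine similarity is used as the confidence score, is sketched below. It does not depend on `@ruvector/cnn`.

```javascript
// Cosine similarity between two equal-length vectors; 0 when either vector is all zeros.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// templates: [{ gesture: string, embedding: number[] }] (assumed shape).
// Returns the nearest template by cosine similarity, or null for an empty template list.
function classifyGesture(embedding, templates) {
  let best = null;
  for (const tpl of templates) {
    const similarity = cosineSimilarity(embedding, tpl.embedding);
    if (best === null || similarity > best.confidence) {
      best = { gesture: tpl.gesture, confidence: similarity };
    }
  }
  return best;
}
```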
#### Training with Contrastive + Classification
Two-phase training using ruvector-cnn's built-in losses:
**Phase 1: Contrastive embedding (unsupervised)**
```javascript
const loss = new InfoNCELoss(0.07);
// Same gesture performed at different speeds → positive pairs
// Different gestures → negative pairs
// Train CnnEmbedder to cluster same-gesture spectrograms
```
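For reference, the InfoNCE objective for one anchor can be written out in plain JavaScript. This is a standalone sketch of the standard formula and does not call the package's `InfoNCELoss` class, whose interface is not documented here. It assumes the embeddings are already L2-normalized, so a dot product equals cosine similarity. The function name is hypothetical.

```javascript
const dot = (a, b) => a.reduce((sum, v, i) => sum + v * b[i], 0);

// InfoNCE for one anchor: -log( exp(s_pos / T) / sum_k exp(s_k / T) ),
// where the sum runs over the positive and all negatives.
function infoNCE(anchor, positive, negatives, temperature = 0.07) {
  const logits = [positive, ...negatives].map((v) => dot(anchor, v) / temperature);
  // Log-sum-exp with the max subtracted for numerical stability.
  const maxLogit = Math.max(...logits);
  const logSumExp = maxLogit + Math.log(
    logits.reduce((sum, l) => sum + Math.exp(l - maxLogit), 0),
  );
  return logSumExp - logits[0];
}
```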
**Phase 2: Gesture classification (supervised)**
```javascript
// Linear classifier on frozen embeddings
// 18 gestures × 20 examples each = 360 labeled samples
// Camera auto-labels: MediaPipe Hands detects gesture type
```
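The classifier head on top of the frozen embedding is a single linear layer followed by softmax. A minimal inference-time sketch is shown below; the function names and the weight layout (`weights[k]` holds the row for class `k`) are assumptions, and training of the weights is not shown.

```javascript
// Numerically stable softmax.
function softmax(logits) {
  const maxLogit = Math.max(...logits);
  const exps = logits.map((l) => Math.exp(l - maxLogit));
  const total = exps.reduce((sum, e) => sum + e, 0);
  return exps.map((e) => e / total);
}

// embedding: number[D], weights: number[K][D], bias: number[K], labels: string[K].
function classifyEmbedding(embedding, weights, bias, labels) {
  const logits = weights.map(
    (row, k) => row.reduce((sum, w, i) => sum + w * embedding[i], bias[k]),
  );
  const probs = softmax(logits);
  let bestIndex = 0;
  for (let k = 1; k < probs.length; k++) if (probs[k] > probs[bestIndex]) bestIndex = k;
  return { gesture: labels[bestIndex], confidence: probs[bestIndex] };
}
```

In the setup described above, `D` would be 128 and `K` would be 18.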
#### Dual-Path Architecture
Run both CNN and DTW in parallel for maximum robustness:
```
CSI input ──┬──→ WiFlow → keypoints → DTW templates → gesture_A (conf_A)
            └──→ Spectrogram → ruvector-cnn → embedding → classifier → gesture_B (conf_B)

Fusion: if gesture_A == gesture_B → conf = min(1.0, max(conf_A, conf_B) + 0.15)
        if conflict               → pick higher confidence
        if only one detects       → use it at 0.8× confidence
```
This dual-path approach provides:
- **DTW** catches gestures the CNN might miss (novel variations)
- **CNN** provides higher accuracy for trained gesture types
- **Fusion** reduces false positives (both must agree for high-confidence)
### Optimization: Temporal Gesture Encoding
Alternative lightweight path for when ruvector-cnn WASM overhead matters
(e.g., ESP32 edge deployment):
```
Keypoint sequence [T=30 frames, 1 second]:
  wrist_x[0..29], wrist_y[0..29],
  elbow_angle[0..29],
  hand_velocity[0..29]
  ↓
1D CNN (k=5, d=[1,2,4]) → 64-dim gesture embedding
  ↓
Nearest-neighbor to gesture templates (cosine distance)
  ↓
Top gesture + confidence
```
This is lighter than DTW for real-time use and can be trained end-to-end with
the WiFlow backbone (shared TCN features).
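
The kernel size and dilations above determine how much of the 1-second window each output sees. For stacked dilated convolutions the receptive field is 1 + Σ (k − 1)·d, which for k=5 and d=[1,2,4] is 1 + 4·(1+2+4) = 29 frames, so it covers nearly the whole 30-frame window. The sketch below computes this and shows a single-channel dilated convolution without padding. The function names are hypothetical and a real layer would have multiple channels and learned weights.

```javascript
// Receptive field of a stack of dilated 1D convolutions with a shared kernel size.
function receptiveField(kernelSize, dilations) {
  return 1 + dilations.reduce((sum, d) => sum + (kernelSize - 1) * d, 0);
}

// Single-channel dilated 1D convolution with no padding.
// Output length is input.length - (kernelSize - 1) * dilation.
function dilatedConv1d(input, weights, dilation) {
  const span = (weights.length - 1) * dilation;
  const out = [];
  for (let t = 0; t + span < input.length; t++) {
    let acc = 0;
    for (let j = 0; j < weights.length; j++) acc += weights[j] * input[t + j * dilation];
    out.push(acc);
  }
  return out;
}
```

Without padding, a 30-frame input shrinks to 26, then 18, then 2 time steps across the three layers, so a final pooling or flattening step would be needed to produce the 64-dim embedding.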
## File Structure
```
scripts/
  collect-gesture-gt.py         # Camera + CSI gesture data collection
  label-gestures.js             # Auto-label gesture segments from camera
  train-gesture-model.js        # Train CSI gesture classifier
  gesture-server.js             # WebSocket gesture detection server
ui/
  components/
    GestureOverlay.js           # Cursor + feedback overlay
    GestureChart.js             # Gesture-controlled chart wrapper
    GestureStatus.js            # Sensor health bar
  services/
    gesture.service.js          # WebSocket client for gesture events
```
## Consequences
### Positive
- **Hands-free data exploration** — manipulate charts without touching anything
- **Works in dark/dirty/gloved conditions** — CSI-only mode needs no camera
- **Natural interaction** — swipe, pinch, point are intuitive
- **Builds on existing infrastructure** — WiFlow + DTW + MediaPipe all exist
- **Dual-mode deployment** — degrade gracefully from fusion to CSI-only
- **Low latency** — WiFlow inference is 0.79ms, gesture detection adds ~5ms
### Negative
- **Learning curve** — users must learn gesture vocabulary
- **False positives** — normal movement may trigger gestures (mitigated by state machine + cooldown)
- **CSI-only precision** — coarse gestures only without camera
- **Single-user** — multi-user gesture disambiguation is hard
### Risks
| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| Gesture false positives from normal movement | Medium | High | State machine with IDLE→TRACKING threshold, 200ms debounce, 0.7 confidence gate |
| CSI gestures too coarse for chart control | Medium | Medium | Camera fallback for precision; CSI handles navigation-level gestures only |
| Latency > 100ms feels unresponsive | Low | High | WiFlow 0.79ms + gesture 5ms + WebSocket <10ms = ~16ms total |
| User fatigue ("gorilla arm") | Medium | Medium | Support seated gestures; small wrist movements, not full arm sweeps |
| MediaPipe Hands not detecting in low light | Medium | Low | CSI-only fallback; works in complete darkness |
## Implementation Plan
| Phase | Task | Effort | Dependencies |
|-------|------|--------|-------------|
| P1 | `gesture-server.js` WebSocket server with camera hand tracking | 3 hrs | MediaPipe Hands model |
| P2 | Camera gesture classifier (rule-based from hand landmarks) | 2 hrs | P1 |
| P3 | CSI gesture classifier (WiFlow keypoints → DTW templates) | 3 hrs | WiFlow model (ADR-079) |
| P4 | Fusion engine (confidence-weighted merge) | 2 hrs | P2 + P3 |
| P5 | `GestureOverlay.js` cursor + feedback UI component | 2 hrs | P1 |
| P6 | `GestureChart.js` gesture-controlled D3 chart wrapper | 4 hrs | P4 + P5 |
| P7 | Gesture training data collection + model training | 2 hrs | P3 |
| P8 | Integration with existing sensing UI | 2 hrs | P6 |
| **Total** | | **~20 hrs** | |
## References
- MediaPipe Hands: Google's 21-landmark hand tracking (30fps, CPU)
- ADR-029: RuvSense DTW gesture recognition
- ADR-079: Camera ground-truth training pipeline (92.9% PCK@20)
- Leap Motion: commercial gesture controller (comparison point)
- SolidJS/D3 gesture interaction patterns
- "GestureWiFi" (IEEE 2023): WiFi gesture recognition survey