feat(pointcloud): use MediaPipe Face Mesh for the live demo (ADR-094)

The previous synthetic procedural demo did not represent what the local
fusion pipeline produces — a real depth-backprojected point cloud of
the user's face and surroundings. This commit ports the closest browser
equivalent: MediaPipe Face Mesh runs in-browser at ~30 fps and emits
478 3D landmarks per frame. Each visitor now sees the outline of their
own face rendered as a point cloud, with a small floor + back wall for
spatial context.

- Adds MediaPipe Face Mesh + Camera Utils via jsdelivr CDN.
- Adds an "▶ Enable camera" CTA so getUserMedia is gated on a user
  gesture (required by some browsers and good UX regardless).
- New face-mesh frame generator uses the same splat shape as the live
  /api/splats payload, so a single render path drives both modes.
- Mirrors x to match selfie convention; maps lm.z (relative depth) to
  the world-coord range used by the live pipeline.
- Falls back automatically to the procedural floor + walls + figure
  when the camera is denied, dismissed, or unavailable.
- Badge surfaces the new state: '● DEMO Your Face (MediaPipe)'.
- Bumps poll cadence to 4 Hz so face mesh updates feel live.
- ADR-094 updated to reflect the new default behavior.
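
The landmark-to-splat mapping described in the bullets above can be sketched in isolation. This is a minimal illustration, not the viewer's code: `landmarkToSplat` and `faceFrame` are hypothetical names, while the scale factors (4.0, 3.0), the z offset (2.0), and the splat fields are copied from the diff in this commit.

```javascript
// Sketch only: landmarkToSplat and faceFrame are hypothetical helper names.
// The constants below are taken from the face-mesh frame generator in the diff.
function landmarkToSplat(lm) {
  return {
    // lm.x and lm.y are normalized to [0, 1]. Subtracting from 0.5 centers
    // them on the origin and flips the axis (x is mirrored, y points up).
    // lm.z is a relative depth, scaled and pushed out to z = 2.0.
    center: [(0.5 - lm.x) * 4.0, (0.5 - lm.y) * 3.0, 2.0 + lm.z * 4.0],
    color: [0.95, 0.65, 0.20],
    opacity: 1.0,
    scale: [0.012, 0.012, 0.012]
  };
}

// Wrap the splats in the { splats, count, frame, ... } shape that the
// existing render path reads.
function faceFrame(landmarks, frameNum) {
  var splats = landmarks.map(landmarkToSplat);
  return {
    splats: splats,
    count: splats.length,
    frame: frameNum,
    live: false,
    source: "face-mesh"
  };
}

var frame = faceFrame(
  [{ x: 0.5, y: 0.5, z: 0 }, { x: 0.25, y: 0.75, z: -0.125 }],
  1
);
console.log(frame.splats[0].center); // [ 0, 0, 2 ]
console.log(frame.splats[1].center); // [ 1, -0.75, 1.5 ]
```

A landmark at the image center lands at (0, 0, 2), directly in front of the viewer camera; a landmark left of center in the image ends up at positive x, which is the mirroring step.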

Co-Authored-By: claude-flow <ruv@ruv.net>
ruv 2026-04-29 19:42:51 -04:00
parent 7343bdc4dd
commit cbedbce9e3
2 changed files with 177 additions and 17 deletions

@ -45,10 +45,22 @@ Ship **one** viewer that auto-selects its transport from URL parameters,
and publish it to `gh-pages/pointcloud/` alongside the other demos:
1. **Default mode** — when the viewer is opened with no query parameters
on `https://ruvnet.github.io/RuView/pointcloud/`, render a synthetic
in-browser scene (floor grid, walls, breathing/swaying figure, animated
17-keypoint skeleton) and label the badge `● DEMO Synthetic`. No
network calls are made. Renders forever, deterministic, ~200 splats.
on `https://ruvnet.github.io/RuView/pointcloud/`, present a "▶ Enable
camera" CTA. On click the viewer requests webcam access, runs
**MediaPipe Face Mesh** in-browser (~30 fps, 478 refined landmarks),
and renders the visitor's own face as a point cloud — the closest
browser equivalent of the local pipeline's depth-backprojected face
geometry that motivated this ADR (`I could see the outline of my face
in points`). The viewer mirrors x to match selfie convention and
maps Face Mesh's relative-z to the same world-coordinate range the
live `/api/splats` payload uses, so a single render path drives both.
Badge reads `● DEMO Your Face (MediaPipe)`. If the user denies
camera permission, dismisses the prompt, or visits on a device
without a webcam, the viewer falls back automatically to a
procedural scaffold (floor grid, walls, breathing figure, 17-keypoint
skeleton). All processing is client-side; no frames leave the
browser. 478 splats from the face (refined landmarks) plus 92
floor/wall context splats (72 floor, 20 back wall).
2. **Auto mode** (`?backend=auto`) — fetch from `/api/splats` on the same
origin. This is the local-development case (`ruview-pointcloud serve`
serves the viewer and the API together). On any failure (404, network
@ -99,11 +111,14 @@ and nvsim deployments.
### Negative / tradeoffs
- **Synthetic ≠ real.** The demo figure is procedural, not recorded from
hardware, so visitors cannot see *real* CSI-derived poses without
supplying `?backend=`. We accept this — the alternatives (pre-recorded
JSON, on-page WASM inference) add maintenance cost and diverge the
render path.
- **Face mesh ≠ CSI.** Browser webcam + MediaPipe gives real face
geometry but does not produce CSI-derived pose. Visitors who want to
see the *WiFi-driven* path still need `?backend=<their-host>`. The
procedural fallback is not WiFi-driven either; it is purely visual
scaffolding. We accept this — the goal of the hosted demo is to
convey the *shape* of what the local pipeline produces (a point
cloud of the user) rather than reproduce the WiFi physics in the
browser. The latter is a future ADR (WASM port of the fusion crate).
- **CORS burden on remote mode.** Users who want to share their backend
must add `Access-Control-Allow-Origin: https://ruvnet.github.io` (or
`*`) to their `ruview-pointcloud serve` config. We document this in the
@ -139,9 +154,11 @@ This ADR is **Implemented** when all of the following hold:
1. Pushing to `main` with a viewer change triggers
`pointcloud-pages.yml`, which deploys to `gh-pages/pointcloud/` in
under 60 seconds.
2. `https://ruvnet.github.io/RuView/pointcloud/` loads, renders the
synthetic scene, displays `● DEMO Synthetic` badge, and shows
non-zero splat + frame counts.
2. `https://ruvnet.github.io/RuView/pointcloud/` loads, shows the
"Enable camera" CTA, and on accept renders the visitor's face as a
point cloud with badge `● DEMO Your Face (MediaPipe)` and non-zero
splat + frame counts. On camera denial, falls back to the
procedural scene with badge `● DEMO Synthetic`.
3. Existing demos at `https://ruvnet.github.io/RuView/` and
`…/pose-fusion.html` and `…/nvsim/` are still reachable after the
first deploy (smoke-tested manually).
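
Criterion 2 reduces to a small selection rule that can be checked without a browser. The sketch below is one way such a check might look, under stated assumptions: `expectedDemoBadge` is a hypothetical helper that does not exist in the viewer, and only the state names and badge strings are taken from the viewer source.

```javascript
// Hypothetical helper for a smoke test; not part of the viewer.
// faceMeshState is one of:
//   "idle" | "starting" | "running" | "denied" | "unavailable"
function expectedDemoBadge(faceMeshState, hasLandmarks) {
  // The face badge needs both a running camera and at least one detected
  // face. Every other combination falls back to the procedural scene.
  if (faceMeshState === "running" && hasLandmarks) {
    return "● DEMO Your Face (MediaPipe)";
  }
  return "● DEMO Synthetic";
}

console.log(expectedDemoBadge("running", true));  // ● DEMO Your Face (MediaPipe)
console.log(expectedDemoBadge("denied", false));  // ● DEMO Synthetic
console.log(expectedDemoBadge("running", false)); // ● DEMO Synthetic
```

The third case matters when testing by hand: between granting camera permission and the first detected face, the viewer has no landmarks yet, so the badge is expected to read `● DEMO Synthetic` for a moment before switching.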

@ -5,24 +5,32 @@
<style>
body { margin: 0; background: #0a0a0a; color: #e8a634; font-family: monospace; }
canvas { display: block; }
#info { position: absolute; top: 10px; left: 10px; padding: 12px; background: rgba(0,0,0,0.85); border: 1px solid #e8a634; border-radius: 6px; min-width: 240px; font-size: 13px; line-height: 1.5; }
#info { position: absolute; top: 10px; left: 10px; padding: 12px; background: rgba(0,0,0,0.85); border: 1px solid #e8a634; border-radius: 6px; min-width: 240px; font-size: 13px; line-height: 1.5; z-index: 10; }
#cam-cta { position: absolute; bottom: 16px; left: 50%; transform: translateX(-50%); padding: 10px 18px; background: #e8a634; color: #0a0a0a; border: none; border-radius: 4px; font-family: monospace; font-size: 14px; font-weight: bold; cursor: pointer; z-index: 10; }
#cam-cta:hover { background: #ffc04d; }
#cam-cta.hidden { display: none; }
.live { color: #4f4; } .demo { color: #f44; }
.face { color: #4cf; }
.section { margin-top: 6px; padding-top: 6px; border-top: 1px solid #333; }
.label { color: #888; }
</style>
<script src="https://cdnjs.cloudflare.com/ajax/libs/three.js/r128/three.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/three@0.128.0/examples/js/controls/OrbitControls.js"></script>
<!-- MediaPipe Face Mesh — runs in demo mode so each visitor sees their own face as a point cloud -->
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh@0.4/face_mesh.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/camera_utils@0.3/camera_utils.js"></script>
</head>
<body>
<div id="info">
<h3 style="margin:0 0 8px 0">RuView Point Cloud</h3>
<div id="stats">Loading...</div>
</div>
<button id="cam-cta">▶ Enable camera — render your face as a point cloud</button>
<script>
var scene = new THREE.Scene();
scene.background = new THREE.Color(0x0a0a0a);
var camera = new THREE.PerspectiveCamera(75, window.innerWidth/window.innerHeight, 0.1, 100);
camera.position.set(0, 2, -4);
camera.position.set(0, 0.5, -1.5);
camera.lookAt(0, 0, 2);
var renderer = new THREE.WebGLRenderer({ antialias: true });
@ -115,6 +123,118 @@
var transportMode = "demo"; // resolved at first fetch: "live" | "remote" | "demo"
var demoStartMs = Date.now();
var demoFrameNum = 0;
var latestFaceLandmarks = null; // populated by MediaPipe when camera enabled
var faceMeshState = "idle"; // "idle" | "starting" | "running" | "denied" | "unavailable"
// ----- MediaPipe Face Mesh (browser equivalent of camera-depth backprojection) -----
// Locally, ruview-pointcloud serve fuses real camera depth + WiFi CSI. In the
// browser we don't have depth from a webcam, but Face Mesh produces 478
// 3D landmarks with refineLandmarks on (468 mesh + 10 iris; x,y in [0,1],
// z roughly in [-0.5,0.5]) at ~30 fps, enough to
// reproduce the "I can see the outline of my face in points" experience. The
// landmarks feed into the same splat render path as live /api/splats data.
async function startFaceMesh() {
if (faceMeshState !== "idle") return;
if (!window.FaceMesh || !window.Camera) {
faceMeshState = "unavailable";
return;
}
faceMeshState = "starting";
try {
var videoEl = document.createElement("video");
videoEl.style.display = "none";
videoEl.autoplay = true;
videoEl.playsInline = true;
videoEl.muted = true;
document.body.appendChild(videoEl);
var fm = new FaceMesh({
locateFile: function(file) {
return "https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh@0.4/" + file;
}
});
fm.setOptions({
maxNumFaces: 1,
refineLandmarks: true,
minDetectionConfidence: 0.5,
minTrackingConfidence: 0.5
});
fm.onResults(function(results) {
if (results.multiFaceLandmarks && results.multiFaceLandmarks[0]) {
latestFaceLandmarks = results.multiFaceLandmarks[0];
}
});
var mpCamera = new Camera(videoEl, {
onFrame: async function() { await fm.send({ image: videoEl }); },
width: 640,
height: 480
});
await mpCamera.start();
faceMeshState = "running";
var btn = document.getElementById("cam-cta");
if (btn) btn.classList.add("hidden");
} catch (err) {
faceMeshState = "denied";
console.warn("Face mesh unavailable:", err);
}
}
function faceMeshFrame() {
if (faceMeshState !== "running" || !latestFaceLandmarks) return null;
var lms = latestFaceLandmarks;
var splats = [];
var i, lm, x, y, z;
// 468 (or 478 with refined landmarks) face points → splats. MediaPipe's
// selfie convention has x mirrored; we mirror back so left-of-screen = your
// left side. z is depth-relative-to-face-center, ~[-0.1,+0.1] in practice.
for (i = 0; i < lms.length; i++) {
lm = lms[i];
x = (0.5 - lm.x) * 4.0;
y = (0.5 - lm.y) * 3.0;
z = 2.0 + lm.z * 4.0;
splats.push({
center: [x, y, z],
color: [0.95, 0.65, 0.20],
opacity: 1.0,
scale: [0.012, 0.012, 0.012]
});
}
// Procedural floor + back wall for spatial context — same density as the
// local demo's room scaffold.
var gx, gz;
for (gx = -4; gx <= 4; gx++) {
for (gz = 1; gz <= 8; gz++) {
splats.push({
center: [gx * 0.4, -1.4, gz * 0.4],
color: [0.15, 0.18, 0.22],
opacity: 1.0,
scale: [0.05, 0.05, 0.05]
});
}
}
for (gx = -4; gx <= 4; gx += 2) {
for (var wy = -1; wy <= 2; wy++) {
splats.push({
center: [gx * 0.4, wy * 0.5, 4.0],
color: [0.12, 0.20, 0.28],
opacity: 1.0,
scale: [0.05, 0.05, 0.05]
});
}
}
demoFrameNum += 1;
return {
splats: splats,
count: splats.length,
frame: demoFrameNum,
live: false,
source: "face-mesh",
pipeline: {
skeleton: null,
vitals: { breathing_rate: 14, motion_score: 0.15 }
}
};
}
function buildSplatsUrl() {
if (backendArg === "demo") return null;
@ -211,11 +331,16 @@
};
}
function pickDemoFrame() {
// Prefer real face-mesh data when the camera is running; else procedural.
return faceMeshFrame() || syntheticFrame();
}
async function fetchCloud() {
// Demo-only mode: never hit the network.
if (backendArg === "demo") {
transportMode = "demo";
handleData(syntheticFrame());
handleData(pickDemoFrame());
return;
}
try {
@ -231,7 +356,7 @@
return;
}
transportMode = "demo";
handleData(syntheticFrame());
handleData(pickDemoFrame());
}
}
@ -262,6 +387,8 @@
mode = '<span class="live">&#9679; LIVE</span> Local Backend';
} else if (transportMode === "remote") {
mode = '<span class="live">&#9679; REMOTE</span> ' + backendArg;
} else if (data.source === "face-mesh") {
mode = '<span class="face">&#9679; DEMO</span> Your Face (MediaPipe)';
} else {
mode = '<span class="demo">&#9679; DEMO</span> Synthetic';
}
@ -321,8 +448,24 @@
}
} catch(e) {}
}
// Wire the camera CTA: shown only when we'll be rendering the demo path
// (auto-with-no-backend or explicit ?backend=demo). Hidden in live/remote.
(function wireCamCta() {
var btn = document.getElementById("cam-cta");
if (!btn) return;
// Hide CTA when user explicitly required live data.
if (requireLive || backendArg.startsWith("http")) {
btn.classList.add("hidden");
return;
}
btn.addEventListener("click", function() {
btn.textContent = "Starting camera…";
startFaceMesh();
});
})();
fetchCloud();
setInterval(fetchCloud, 500);
setInterval(fetchCloud, 250); // 4 Hz — enough for face mesh, light on the network
function updateSplats(splats) {
if (pointsMesh) scene.remove(pointsMesh);