feat(pointcloud): use MediaPipe Face Mesh for the live demo (ADR-094)
The previous synthetic procedural demo did not represent what the local fusion pipeline produces, which is a real depth-backprojected point cloud of the user's face and surroundings. This commit ports the closest browser equivalent: MediaPipe Face Mesh runs in-browser at ~30 fps and emits 478 3D landmarks per frame. Each visitor now sees the outline of their own face rendered as a point cloud, with a small floor + back wall for spatial context.

- Adds MediaPipe Face Mesh + Camera Utils via jsdelivr CDN.
- Adds an "▶ Enable camera" CTA so getUserMedia is gated on a user gesture (required by some browsers and good UX regardless).
- New face-mesh frame generator uses the same splat shape as the live /api/splats payload, so a single render path drives both modes.
- Mirrors x to match selfie convention; maps lm.z (relative depth) to the world-coord range used by the live pipeline.
- Falls back automatically to the procedural floor + walls + figure when the camera is denied, dismissed, or unavailable.
- Badge surfaces the new state: '● DEMO Your Face (MediaPipe)'.
- Bumps poll cadence to 4 Hz so face mesh updates feel live.
- ADR-094 updated to reflect the new default behavior.

Co-Authored-By: claude-flow <ruv@ruv.net>
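The landmark-to-splat mapping described above can be sketched as a small pure function. This is a minimal sketch, not the viewer's actual code path: the names `landmarkToSplat` and `landmarksToSplats` are hypothetical, and the scale factors (4.0, 3.0), depth offset (2.0), color, and splat size are the values used by the face-mesh frame generator in this commit. It assumes each landmark is an object with normalized `x`, `y` in [0,1] and a relative `z`.

```javascript
// Hypothetical helper: convert one normalized Face Mesh landmark into a
// splat with the same shape as the /api/splats payload entries.
function landmarkToSplat(lm) {
  return {
    center: [
      (0.5 - lm.x) * 4.0, // mirror x around the image center, scale to world units
      (0.5 - lm.y) * 3.0, // flip y so that up in the image is +y in the scene
      2.0 + lm.z * 4.0    // place the face near z = 2 and stretch relative depth
    ],
    color: [0.95, 0.65, 0.20],
    opacity: 1.0,
    scale: [0.012, 0.012, 0.012]
  };
}

// Hypothetical helper: map a whole landmark list (for example, one entry of
// results.multiFaceLandmarks) to an array of splats.
function landmarksToSplats(landmarks) {
  return landmarks.map(landmarkToSplat);
}
```

A landmark at the image center with zero relative depth lands at `[0, 0, 2]`, which is the point the viewer's camera looks at.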
parent 7343bdc4dd
commit cbedbce9e3
@ -45,10 +45,22 @@ Ship **one** viewer that auto-selects its transport from URL parameters,
|
|||
and publish it to `gh-pages/pointcloud/` alongside the other demos:
|
||||
|
||||
1. **Default mode** — when the viewer is opened with no query parameters
|
||||
on `https://ruvnet.github.io/RuView/pointcloud/`, render a synthetic
|
||||
in-browser scene (floor grid, walls, breathing/swaying figure, animated
|
||||
17-keypoint skeleton) and label the badge `● DEMO Synthetic`. No
|
||||
network calls are made. Renders forever, deterministic, ~200 splats.
|
||||
on `https://ruvnet.github.io/RuView/pointcloud/`, present a "▶ Enable
|
||||
camera" CTA. On click the viewer requests webcam access, runs
|
||||
**MediaPipe Face Mesh** in-browser (~30 fps, 478 refined landmarks),
|
||||
and renders the visitor's own face as a point cloud — the closest
|
||||
browser equivalent of the local pipeline's depth-backprojected face
|
||||
geometry that motivated this ADR (`I could see the outline of my face
|
||||
in points`). The viewer mirrors x to match selfie convention and
|
||||
maps Face Mesh's relative-z to the same world-coordinate range the
|
||||
live `/api/splats` payload uses, so a single render path drives both.
|
||||
Badge reads `● DEMO Your Face (MediaPipe)`. If the user denies
|
||||
camera permission, dismisses the prompt, or visits on a device
|
||||
without a webcam, the viewer falls back automatically to a
|
||||
procedural scaffold (floor grid, walls, breathing figure, 17-keypoint
|
||||
skeleton). All processing is client-side; no frames leave the
|
||||
browser. 478 splats from the face plus 92 floor/wall context
|
||||
splats.
|
||||
2. **Auto mode** (`?backend=auto`) — fetch from `/api/splats` on the same
|
||||
origin. This is the local-development case (`ruview-pointcloud serve`
|
||||
serves the viewer and the API together). On any failure (404, network
|
||||
|
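The fallback rule in the default mode (use the face mesh when the camera is running, otherwise the procedural scene) could be sketched as below. This is an illustration under assumptions: `classifyCameraError` and `pickFrame` are hypothetical names, and the mapping relies on the standard `getUserMedia` rejection names (`NotAllowedError` for a denied or dismissed prompt, `NotFoundError` when no camera is present). Whether the MediaPipe camera helper passes those errors through unchanged is an assumption.

```javascript
// Hypothetical helper: map a camera start-up error to one of the viewer's
// face-mesh states ("denied" or "unavailable").
function classifyCameraError(err) {
  var name = err && err.name;
  if (name === "NotAllowedError" || name === "SecurityError") {
    return "denied"; // user refused or dismissed the permission prompt
  }
  return "unavailable"; // no camera, camera busy, or any other failure
}

// Hypothetical helper: prefer the face-mesh frame only while the camera is
// running and a frame exists; otherwise use the procedural frame.
function pickFrame(state, faceFrame, syntheticFrame) {
  if (state === "running" && faceFrame) {
    return faceFrame;
  }
  return syntheticFrame;
}
```

Keeping the choice in one function means the render path never needs to know why the face mesh is missing; it only ever receives a frame of the same shape.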
|
@ -99,11 +111,14 @@ and nvsim deployments.
|
|||
|
||||
### Negative / tradeoffs
|
||||
|
||||
- **Synthetic ≠ real.** The demo figure is procedural, not recorded from
|
||||
hardware, so visitors cannot see *real* CSI-derived poses without
|
||||
supplying `?backend=`. We accept this — the alternatives (pre-recorded
|
||||
JSON, on-page WASM inference) add maintenance cost and diverge the
|
||||
render path.
|
||||
- **Face mesh ≠ CSI.** Browser webcam + MediaPipe gives real face
|
||||
geometry but does not produce CSI-derived pose. Visitors who want to
|
||||
see the *WiFi-driven* path still need `?backend=<their-host>`. The
|
||||
procedural fallback is not WiFi-driven either; it is purely visual
|
||||
scaffolding. We accept this — the goal of the hosted demo is to
|
||||
convey the *shape* of what the local pipeline produces (a point
|
||||
cloud of the user) rather than reproduce the WiFi physics in the
|
||||
browser. The latter is a future ADR (WASM port of the fusion crate).
|
||||
- **CORS burden on remote mode.** Users who want to share their backend
|
||||
must add `Access-Control-Allow-Origin: https://ruvnet.github.io` (or
|
||||
`*`) to their `ruview-pointcloud serve` config. We document this in the
|
||||
|
|
@ -139,9 +154,11 @@ This ADR is **Implemented** when all of the following hold:
|
|||
1. Pushing to `main` with a viewer change triggers
|
||||
`pointcloud-pages.yml`, which deploys to `gh-pages/pointcloud/` in
|
||||
under 60 seconds.
|
||||
2. `https://ruvnet.github.io/RuView/pointcloud/` loads, renders the
|
||||
synthetic scene, displays `● DEMO Synthetic` badge, and shows
|
||||
non-zero splat + frame counts.
|
||||
2. `https://ruvnet.github.io/RuView/pointcloud/` loads, shows the
|
||||
"Enable camera" CTA, and on accept renders the visitor's face as a
|
||||
point cloud with badge `● DEMO Your Face (MediaPipe)` and non-zero
|
||||
splat + frame counts. On camera denial, falls back to the
|
||||
procedural scene with badge `● DEMO Synthetic`.
|
||||
3. Existing demos at `https://ruvnet.github.io/RuView/` and
|
||||
`…/pose-fusion.html` and `…/nvsim/` are still reachable after the
|
||||
first deploy (smoke-tested manually).
|
||||
|
|
|
|||
|
|
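As a rough check of the splat counts the hosted demo should report, the arithmetic can be written out as below. This is a sketch under assumptions: `countRange` and `expectedSplatCount` are hypothetical names, the loop bounds are the ones used by the face-mesh frame generator in this commit (floor grid `gx` in -4..4 and `gz` in 1..8, back wall `gx` in -4..4 step 2 and `wy` in -1..2), and 478 is the landmark count when `refineLandmarks` is enabled.

```javascript
// Hypothetical helper: number of iterations of `for (v = start; v <= end; v += step)`.
function countRange(start, end, step) {
  var n = 0;
  for (var v = start; v <= end; v += step) {
    n += 1;
  }
  return n;
}

// Hypothetical helper: expected splat totals for one face-mesh demo frame.
function expectedSplatCount(faceLandmarks) {
  var floor = countRange(-4, 4, 1) * countRange(1, 8, 1);  // floor grid
  var wall = countRange(-4, 4, 2) * countRange(-1, 2, 1);  // back wall
  return {
    face: faceLandmarks,
    context: floor + wall,
    total: faceLandmarks + floor + wall
  };
}
```

For example, `expectedSplatCount(478)` gives 92 context splats and 570 in total, so a splat count near that value in the info panel suggests the face-mesh path is active.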
@ -5,24 +5,32 @@
|
|||
<style>
|
||||
body { margin: 0; background: #0a0a0a; color: #e8a634; font-family: monospace; }
|
||||
canvas { display: block; }
|
||||
#info { position: absolute; top: 10px; left: 10px; padding: 12px; background: rgba(0,0,0,0.85); border: 1px solid #e8a634; border-radius: 6px; min-width: 240px; font-size: 13px; line-height: 1.5; }
|
||||
#info { position: absolute; top: 10px; left: 10px; padding: 12px; background: rgba(0,0,0,0.85); border: 1px solid #e8a634; border-radius: 6px; min-width: 240px; font-size: 13px; line-height: 1.5; z-index: 10; }
|
||||
#cam-cta { position: absolute; bottom: 16px; left: 50%; transform: translateX(-50%); padding: 10px 18px; background: #e8a634; color: #0a0a0a; border: none; border-radius: 4px; font-family: monospace; font-size: 14px; font-weight: bold; cursor: pointer; z-index: 10; }
|
||||
#cam-cta:hover { background: #ffc04d; }
|
||||
#cam-cta.hidden { display: none; }
|
||||
.live { color: #4f4; } .demo { color: #f44; }
|
||||
.face { color: #4cf; }
|
||||
.section { margin-top: 6px; padding-top: 6px; border-top: 1px solid #333; }
|
||||
.label { color: #888; }
|
||||
</style>
|
||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/three.js/r128/three.min.js"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/three@0.128.0/examples/js/controls/OrbitControls.js"></script>
|
||||
<!-- MediaPipe Face Mesh — runs in demo mode so each visitor sees their own face as a point cloud -->
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh@0.4/face_mesh.js"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/camera_utils@0.3/camera_utils.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<div id="info">
|
||||
<h3 style="margin:0 0 8px 0">RuView Point Cloud</h3>
|
||||
<div id="stats">Loading...</div>
|
||||
</div>
|
||||
<button id="cam-cta">▶ Enable camera — render your face as a point cloud</button>
|
||||
<script>
|
||||
var scene = new THREE.Scene();
|
||||
scene.background = new THREE.Color(0x0a0a0a);
|
||||
var camera = new THREE.PerspectiveCamera(75, window.innerWidth/window.innerHeight, 0.1, 100);
|
||||
camera.position.set(0, 2, -4);
|
||||
camera.position.set(0, 0.5, -1.5);
|
||||
camera.lookAt(0, 0, 2);
|
||||
|
||||
var renderer = new THREE.WebGLRenderer({ antialias: true });
|
||||
|
|
@ -115,6 +123,118 @@
|
|||
var transportMode = "demo"; // resolved at first fetch: "live" | "remote" | "demo"
|
||||
var demoStartMs = Date.now();
|
||||
var demoFrameNum = 0;
|
||||
var latestFaceLandmarks = null; // populated by MediaPipe when camera enabled
|
||||
var faceMeshState = "idle"; // "idle" | "starting" | "running" | "denied" | "unavailable"
|
||||
|
||||
// ----- MediaPipe Face Mesh (browser equivalent of camera-depth backprojection) -----
|
||||
// Locally, ruview-pointcloud serve fuses real camera depth + WiFi CSI. In the
|
||||
// browser we don't have depth from a webcam, but Face Mesh produces 478
|
||||
// 3D landmarks (x,y in [0,1], z roughly in [-0.5,0.5]) at ~30 fps — enough to
|
||||
// reproduce the "I can see the outline of my face in points" experience. The
|
||||
// landmarks feed into the same splat render path as live /api/splats data.
|
||||
async function startFaceMesh() {
|
||||
if (faceMeshState !== "idle") return;
|
||||
if (!window.FaceMesh || !window.Camera) {
|
||||
faceMeshState = "unavailable";
|
||||
return;
|
||||
}
|
||||
faceMeshState = "starting";
|
||||
try {
|
||||
var videoEl = document.createElement("video");
|
||||
videoEl.style.display = "none";
|
||||
videoEl.autoplay = true;
|
||||
videoEl.playsInline = true;
|
||||
videoEl.muted = true;
|
||||
document.body.appendChild(videoEl);
|
||||
|
||||
var fm = new FaceMesh({
|
||||
locateFile: function(file) {
|
||||
return "https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh@0.4/" + file;
|
||||
}
|
||||
});
|
||||
fm.setOptions({
|
||||
maxNumFaces: 1,
|
||||
refineLandmarks: true,
|
||||
minDetectionConfidence: 0.5,
|
||||
minTrackingConfidence: 0.5
|
||||
});
|
||||
fm.onResults(function(results) {
|
||||
if (results.multiFaceLandmarks && results.multiFaceLandmarks[0]) {
|
||||
latestFaceLandmarks = results.multiFaceLandmarks[0];
|
||||
}
|
||||
});
|
||||
|
||||
var mpCamera = new Camera(videoEl, {
|
||||
onFrame: async function() { await fm.send({ image: videoEl }); },
|
||||
width: 640,
|
||||
height: 480
|
||||
});
|
||||
await mpCamera.start();
|
||||
faceMeshState = "running";
|
||||
var btn = document.getElementById("cam-cta");
|
||||
if (btn) btn.classList.add("hidden");
|
||||
} catch (err) {
|
||||
faceMeshState = (err && err.name === "NotAllowedError") ? "denied" : "unavailable";
|
||||
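console.warn("Face mesh unavailable:", err);
// Hide the CTA so it is not left stuck on "Starting camera…" while the
// procedural fallback renders.
var cta = document.getElementById("cam-cta");
if (cta) cta.classList.add("hidden");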
|
||||
}
|
||||
}
|
||||
|
||||
function faceMeshFrame() {
|
||||
if (faceMeshState !== "running" || !latestFaceLandmarks) return null;
|
||||
var lms = latestFaceLandmarks;
|
||||
var splats = [];
|
||||
var i, lm, x, y, z;
|
||||
// 468 (or 478 with refined landmarks) face points → splats. MediaPipe's
|
||||
// selfie convention has x mirrored; we mirror back so left-of-screen = your
|
||||
// left side. z is depth-relative-to-face-center, ~[-0.1,+0.1] in practice.
|
||||
for (i = 0; i < lms.length; i++) {
|
||||
lm = lms[i];
|
||||
x = (0.5 - lm.x) * 4.0;
|
||||
y = (0.5 - lm.y) * 3.0;
|
||||
z = 2.0 + lm.z * 4.0;
|
||||
splats.push({
|
||||
center: [x, y, z],
|
||||
color: [0.95, 0.65, 0.20],
|
||||
opacity: 1.0,
|
||||
scale: [0.012, 0.012, 0.012]
|
||||
});
|
||||
}
|
||||
// Procedural floor + back wall for spatial context — same density as the
|
||||
// local demo's room scaffold.
|
||||
var gx, gz;
|
||||
for (gx = -4; gx <= 4; gx++) {
|
||||
for (gz = 1; gz <= 8; gz++) {
|
||||
splats.push({
|
||||
center: [gx * 0.4, -1.4, gz * 0.4],
|
||||
color: [0.15, 0.18, 0.22],
|
||||
opacity: 1.0,
|
||||
scale: [0.05, 0.05, 0.05]
|
||||
});
|
||||
}
|
||||
}
|
||||
for (gx = -4; gx <= 4; gx += 2) {
|
||||
for (var wy = -1; wy <= 2; wy++) {
|
||||
splats.push({
|
||||
center: [gx * 0.4, wy * 0.5, 4.0],
|
||||
color: [0.12, 0.20, 0.28],
|
||||
opacity: 1.0,
|
||||
scale: [0.05, 0.05, 0.05]
|
||||
});
|
||||
}
|
||||
}
|
||||
demoFrameNum += 1;
|
||||
return {
|
||||
splats: splats,
|
||||
count: splats.length,
|
||||
frame: demoFrameNum,
|
||||
live: false,
|
||||
source: "face-mesh",
|
||||
pipeline: {
|
||||
skeleton: null,
|
||||
vitals: { breathing_rate: 14, motion_score: 0.15 } // fixed placeholder values; face-mesh mode does not measure vitals
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
function buildSplatsUrl() {
|
||||
if (backendArg === "demo") return null;
|
||||
|
|
@ -211,11 +331,16 @@
|
|||
};
|
||||
}
|
||||
|
||||
function pickDemoFrame() {
|
||||
// Prefer real face-mesh data when the camera is running; else procedural.
|
||||
return faceMeshFrame() || syntheticFrame();
|
||||
}
|
||||
|
||||
async function fetchCloud() {
|
||||
// Demo-only mode: never hit the network.
|
||||
if (backendArg === "demo") {
|
||||
transportMode = "demo";
|
||||
handleData(syntheticFrame());
|
||||
handleData(pickDemoFrame());
|
||||
return;
|
||||
}
|
||||
try {
|
||||
|
|
@ -231,7 +356,7 @@
|
|||
return;
|
||||
}
|
||||
transportMode = "demo";
|
||||
handleData(syntheticFrame());
|
||||
handleData(pickDemoFrame());
|
||||
}
|
||||
}
|
||||
|
||||
|
|
@ -262,6 +387,8 @@
|
|||
mode = '<span class="live">● LIVE</span> Local Backend';
|
||||
} else if (transportMode === "remote") {
|
||||
mode = '<span class="live">● REMOTE</span> ' + backendArg;
|
||||
} else if (data.source === "face-mesh") {
|
||||
mode = '<span class="face">● DEMO</span> Your Face (MediaPipe)';
|
||||
} else {
|
||||
mode = '<span class="demo">● DEMO</span> Synthetic';
|
||||
}
|
||||
|
|
@ -321,8 +448,24 @@
|
|||
}
|
||||
} catch(e) {}
|
||||
}
|
||||
// Wire the camera CTA: shown only when we'll be rendering the demo path
|
||||
// (auto-with-no-backend or explicit ?backend=demo). Hidden in live/remote.
|
||||
(function wireCamCta() {
|
||||
var btn = document.getElementById("cam-cta");
|
||||
if (!btn) return;
|
||||
// Hide CTA when user explicitly required live data.
|
||||
if (requireLive || backendArg.startsWith("http")) {
|
||||
btn.classList.add("hidden");
|
||||
return;
|
||||
}
|
||||
btn.addEventListener("click", function() {
|
||||
btn.textContent = "Starting camera…";
|
||||
startFaceMesh();
|
||||
});
|
||||
})();
|
||||
|
||||
fetchCloud();
|
||||
setInterval(fetchCloud, 500);
|
||||
setInterval(fetchCloud, 250); // 4 Hz — enough for face mesh, light on the network
|
||||
|
||||
function updateSplats(splats) {
|
||||
if (pointsMesh) scene.remove(pointsMesh);
|
||||
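The CTA gating above (show the button only when the demo path can render, hide it when live or remote data was requested) can be expressed as a pure predicate, which makes the rule easy to test outside the browser. This is a sketch: `shouldShowCameraCta` is a hypothetical name, while `requireLive` and `backendArg` are the viewer's existing variables, assumed here to be a boolean and a string.

```javascript
// Hypothetical helper: decide whether the "Enable camera" CTA should be visible.
function shouldShowCameraCta(requireLive, backendArg) {
  if (requireLive) {
    return false; // the user explicitly asked for live data
  }
  if (typeof backendArg === "string" && backendArg.startsWith("http")) {
    return false; // remote backend URL supplied, so the demo path is not used
  }
  return true; // auto or demo mode: the face-mesh demo may run
}
```

Usage would look like `if (!shouldShowCameraCta(requireLive, backendArg)) btn.classList.add("hidden");`.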