wifi-densepose/v2/crates/homecore-assist
rUv 9b126e927e
harden(assist security): bound untrusted utterance (DoS); cmd-injection/ReDoS/NaN/fail-open all proven clean with evidence (#1086)
* fix(homecore-assist): bound untrusted utterance length, fail closed (ADR-133 security)

The intent recognizers accept utterances from untrusted callers (voice
transcripts, the WebSocket `assist` command). Neither the regex nor the
semantic path bounded utterance length, so a pathological multi-megabyte
utterance forced an unbounded `to_lowercase()` clone plus a per-registered-
pattern scan (and, in the semantic path, full tokenisation + feature-hash
embedding) — an allocation/CPU amplification on attacker-controlled input.
The `regex` crate is linear-time (no catastrophic backtracking), so this was
a throughput/memory DoS rather than a hang, but it was still unbounded.

Fix: introduce MAX_UTTERANCE_BYTES (4 KiB — far above any real spoken
command) and check it at both recognizer boundaries BEFORE any allocation or
scan. An over-length utterance fails closed: Ok(None) (no intent, no action),
identical to an unrecognised phrase. No legitimate command is affected.
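
A minimal sketch of the boundary check described above, using only the standard library. `Intent`, `RecognizeError`, and the substring match are hypothetical stand-ins for illustration; only `MAX_UTTERANCE_BYTES`, the 4 KiB value, and the `Ok(None)` fail-closed result come from this commit message.

```rust
/// Upper bound on accepted utterance size in bytes (4 KiB, per the commit text).
pub const MAX_UTTERANCE_BYTES: usize = 4 * 1024;

/// Hypothetical stand-in for the crate's recognised-intent type.
#[derive(Debug, PartialEq)]
pub struct Intent {
    pub name: &'static str,
}

/// Hypothetical error type; the real recognizers may use a different one.
#[derive(Debug, PartialEq)]
pub enum RecognizeError {
    Internal,
}

/// Recognise an utterance, rejecting over-length input before doing any work.
pub fn recognize(utterance: &str) -> Result<Option<Intent>, RecognizeError> {
    // `str::len` is the byte length and is O(1): no allocation, no scan.
    if utterance.len() > MAX_UTTERANCE_BYTES {
        // Fail closed: the same result as an unrecognised phrase.
        return Ok(None);
    }
    // Only bounded input reaches the allocating lowercase step.
    let normalized = utterance.to_lowercase();
    // Toy matcher standing in for the per-pattern regex scan.
    if normalized.contains("turn on") {
        return Ok(Some(Intent { name: "HassTurnOn" }));
    }
    Ok(None)
}
```

Because the length check runs first, an over-length utterance that embeds a valid command still resolves to `Ok(None)`, which is the property the fails-on-old tests pin.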

Pinned by fails-on-old tests:
  - recognizer::over_length_utterance_fails_closed — an over-length utterance
    that contains a valid command resolves to None (would have matched before)
  - semantic_recognizer::over_length_utterance_fails_closed_semantic

Co-Authored-By: claude-flow <ruv@ruv.net>

* test(homecore-assist): pin clean security dimensions with evidence (ADR-133)

Adds regression tests documenting the dimensions reviewed and found clean,
so the properties cannot silently regress:

  - runner: no subprocess surface exists. RufloRunnerOpts.{script_path,env}
    are inert and never executed; even a hostile script_path/env spawns
    nothing. And the entity_id capture class [a-z0-9_ .] strips every shell
    metacharacter, so a resolved slot can never carry ; | & $ ` / etc into a
    (future) argv — sanitisation by construction.
    (shell_metachars_never_survive_into_a_resolved_slot,
     runner_opts_are_inert_no_process_spawned)
  - recognizer: the regex crate is a linear-time finite automaton; a classic
    catastrophic-backtracking shape (a+)+$ on adversarial input completes in
    bounded time — no ReDoS.
    (pathological_backtracking_pattern_completes_in_bounded_time)
  - embedding: embeddings are structurally finite (FNV feature-hash + guarded
    L2 normalise, no external float input, no unguarded division), so a crafted
    utterance cannot inject NaN/Inf to poison cosine k-NN; cosine against the
    zero vector is a finite 0.0, never NaN.
    (embeddings_are_structurally_finite, cosine_with_zero_vector_is_finite_not_nan,
     empty_utterance_against_empty_index_no_panic_no_match)
  - pipeline: injection-shaped utterances never deliver a metacharacter into a
    service call; the worst case resolves to a clean entity token, and an
    unrecognised utterance fails closed to not_understood (no action).
    (pipeline_injection_shaped_utterance_carries_no_metachars_to_service)
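
The embedding bullet above can be illustrated with a small standard-library sketch. It assumes whitespace tokenisation, FNV-1a as the feature hash, and an epsilon guard on the norm; the function names and these details are illustrative assumptions, not the crate's actual code.

```rust
/// FNV-1a 64-bit hash (standard offset basis and prime).
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// L2-normalise in place; a zero-norm vector is left untouched
/// rather than divided by its norm.
pub fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > f32::EPSILON {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Feature-hash embedding: each token increments one bucket, so every
/// component is built from finite counts and one guarded division.
pub fn embed(text: &str, dim: usize) -> Vec<f32> {
    let mut v = vec![0.0_f32; dim];
    if dim == 0 {
        return v;
    }
    for token in text.split_whitespace() {
        let bucket = (fnv1a(token.as_bytes()) % dim as u64) as usize;
        v[bucket] += 1.0;
    }
    l2_normalize(&mut v);
    v
}

/// Cosine similarity that returns 0.0 when either vector has zero norm.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    let denom = na * nb;
    if denom > f32::EPSILON {
        dot / denom
    } else {
        0.0
    }
}
```

In this sketch no floating-point value comes from the caller, and both divisions are guarded, so an empty utterance yields the zero vector and its cosine against any other vector is a finite 0.0.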

Co-Authored-By: claude-flow <ruv@ruv.net>

* docs(homecore-assist): record ADR-133 security review (HC-ASSIST-01 + clean dims)

CHANGELOG [Unreleased] Security entry + ADR-133 section 6 review notes for the
homecore-assist voice/intent pipeline review.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-14 21:34:38 -04:00
src harden(assist security): bound untrusted utterance (DoS); cmd-injection/ReDoS/NaN/fail-open all proven clean with evidence (#1086) 2026-06-14 21:34:38 -04:00
Cargo.toml fix(homecore-assist): exact in-memory cosine k-NN, drop fragile :memory: HNSW 2026-06-11 22:13:04 -04:00
README.md docs(homecore-assist): comprehensive README — intent recognition + Ruflo agent bridge 2026-05-25 23:13:20 -04:00

README.md

homecore-assist

Voice-activated intent recognition and execution pipeline for HOMECORE with Ruflo agent bridge (P2).

Badges: Crates.io · License · MSRV: 1.89+ · Tests · ADR-133

P1 scaffold: intent recognition via regex patterns, 5 built-in intent handlers (turn on, turn off, set brightness, nevermind, cancel all), and the Ruflo runner trait surface. Real tokio::process subprocess integration is planned for P2; it will allow orchestration with Ruflo agents for complex multi-step actions.

What this crate does

homecore-assist is the voice/NLU gateway for HOMECORE. It takes natural language utterances, recognizes which intent they represent, and executes the appropriate action. It provides:

  • IntentRecognizer trait — abstraction for matching utterances to intents
  • RegexIntentRecognizer — P1 built-in; uses regex patterns (HA classic style)
  • IntentHandler trait — abstraction for handling recognized intents
  • 5 built-in handlers: HassTurnOn, HassTurnOff, HassLightSet, HassNevermind, HassCancelAll (mirrors HA's classic intents)
  • RufloRunner trait — abstraction for delegating complex actions to Ruflo agents
  • NoopRunner — P1 stub; real tokio::process subprocess integration in P2
  • AssistPipeline — wires utterance → recognizer → handler → response

Each component is trait-based so recognizers can be swapped (regex in P1, semantic embeddings in P2) without changing the pipeline.
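
As a rough illustration of that design, the sketch below wires a recognizer and a handler through trait objects. It is deliberately synchronous and standard-library only; the trait signatures, `PrefixRecognizer`, `TurnOnHandler`, and `Pipeline` are simplified assumptions and do not reproduce the crate's async API.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the crate's intent type.
#[derive(Debug, Clone, PartialEq)]
pub struct Intent {
    pub name: String,
    pub entities: HashMap<String, String>,
}

/// Anything that can map an utterance to an intent.
pub trait IntentRecognizer {
    fn recognize(&self, utterance: &str) -> Option<Intent>;
}

/// Anything that can act on a recognised intent and produce reply text.
pub trait IntentHandler {
    fn handle(&self, intent: &Intent) -> String;
}

/// Toy recognizer: matches "turn on <entity>" by prefix instead of regex.
pub struct PrefixRecognizer;

impl IntentRecognizer for PrefixRecognizer {
    fn recognize(&self, utterance: &str) -> Option<Intent> {
        let lower = utterance.to_lowercase();
        let target = lower.strip_prefix("turn on ")?;
        let mut entities = HashMap::new();
        entities.insert("entity_id".to_string(), target.trim().to_string());
        Some(Intent { name: "HassTurnOn".to_string(), entities })
    }
}

/// Toy handler: reports what it would have switched on.
pub struct TurnOnHandler;

impl IntentHandler for TurnOnHandler {
    fn handle(&self, intent: &Intent) -> String {
        match intent.entities.get("entity_id") {
            Some(id) => format!("Turned on {id}"),
            None => "Nothing to turn on".to_string(),
        }
    }
}

/// The pipeline depends only on the traits, so either side can be swapped.
pub struct Pipeline {
    recognizer: Box<dyn IntentRecognizer>,
    handlers: HashMap<String, Box<dyn IntentHandler>>,
}

impl Pipeline {
    pub fn new(recognizer: Box<dyn IntentRecognizer>) -> Self {
        Self { recognizer, handlers: HashMap::new() }
    }

    pub fn register(&mut self, name: &str, handler: Box<dyn IntentHandler>) {
        self.handlers.insert(name.to_string(), handler);
    }

    /// utterance -> recognizer -> handler -> response text.
    pub fn process(&self, utterance: &str) -> String {
        match self.recognizer.recognize(utterance) {
            Some(intent) => match self.handlers.get(&intent.name) {
                Some(handler) => handler.handle(&intent),
                None => "not_understood".to_string(),
            },
            None => "not_understood".to_string(),
        }
    }
}
```

Replacing `PrefixRecognizer` with an embedding-based recognizer would only require another `IntentRecognizer` implementation; `Pipeline::process` stays unchanged.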

Features

  • Regex pattern recognition — utterance matching via compiled regex (P1)
  • 5 built-in intents — Turn On, Turn Off, Set Brightness, Nevermind, Cancel All
  • Intent entities + slots — recognized patterns capture entity names and parameters (e.g., "turn on light.kitchen" → entity: light.kitchen)
  • Intent responses — structured response with optional text, card (tile data), and conversation context
  • Ruflo agent bridge — submit complex intents to Ruflo agents for multi-step workflows (P2 subprocess)
  • Trait-based recognizers — pluggable: RegexIntentRecognizer (P1), SemanticIntentRecognizer (P2, ruvector embeddings)
  • Trait-based handlers — extensible: built-in HA-mirroring handlers + custom handlers
  • No external STT/TTS — this module handles NLU only; STT/TTS via homecore-api or external service
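
The entity/slot capture in the list above can be sketched without the `regex` crate. The example assumes a single "turn on <entity>" phrase and an allowed character class of `[a-z0-9_ .]`; `extract_turn_on_entity` and `is_entity_char` are hypothetical helpers, and the real recognizer uses compiled regex patterns instead.

```rust
/// True for characters in the class `[a-z0-9_ .]`.
fn is_entity_char(c: char) -> bool {
    c.is_ascii_lowercase() || c.is_ascii_digit() || matches!(c, '_' | ' ' | '.')
}

/// Extract the entity slot from a "turn on <entity>" utterance.
/// Returns `None` when the phrase does not match or the slot is empty.
pub fn extract_turn_on_entity(utterance: &str) -> Option<String> {
    let lower = utterance.to_lowercase();
    let rest = lower.strip_prefix("turn on ")?;
    // Keep the longest leading run of allowed characters, much as a
    // capture group over that character class would.
    let slot: String = rest.chars().take_while(|&c| is_entity_char(c)).collect();
    let slot = slot.trim();
    if slot.is_empty() {
        None
    } else {
        Some(slot.to_string())
    }
}
```

Restricting the capture to a narrow character class means anything outside it, such as `;` or `$`, simply ends the slot, so the extracted value is a plain entity token.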

Capabilities

| Capability | Type | Method | Notes |
|---|---|---|---|
| Recognize intent | Recognizer | RegexIntentRecognizer::recognize(utterance) | Returns the recognized Intent or an error |
| Handle intent | Handler | IntentHandler::handle(intent, context) → service call | Execute service, set state, or defer to Ruflo |
| Call Ruflo agent | Runner | RufloRunner::run(intent, opts) (P2) | Subprocess with JSON request/response |
| Build response | Response | IntentResponse::new(text, entities, card) | Conversational response + optional card data |
| Run pipeline | Pipeline | AssistPipeline::process(utterance) | Full utterance → recognizer → handler → response |

Comparison to Home Assistant

| Aspect | Home Assistant | homecore-assist |
|---|---|---|
| Intent framework | HA Assist pipeline (Python) | Rust async trait-based pipeline |
| Recognizer type | Regex (classic) + ML sentence transformer (2024+) | Regex (P1); semantic embeddings (P2) |
| Built-in intents | HassTurnOn, HassTurnOff, HassLight*, etc. | 5 core intents mirroring HA classic |
| Custom intents | YAML + Python script integration | Trait + handler registration |
| Agent orchestration | N/A (HA has no agent framework) | RufloRunner + subprocess bridge (P2) |
| STT/TTS | Via conversation integration + webhooks | Separate; homecore-assist handles NLU only |
| Slot extraction | Regex groups + sentence-transformers | Regex groups (P1); ruvector embeddings (P2) |
| Response format | Text + TTS synthesis | Structured IntentResponse with card data |

Performance

  • Intent recognition latency — < 10 ms per utterance (regex compilation cached)
  • Handler execution — < 20 ms per intent (service call latency dominates)
  • Ruflo agent subprocess (P2) — ~500 ms per agent call (process spawn + IPC overhead)
  • Memory overhead per intent — ~500 bytes (Intent struct + handler state)
  • Concurrent utterances — 100+ per second on single machine (tokio task per utterance)
  • No per-crate benchmarks yet — the figures above are unmeasured estimates; a follow-up issue tracks baseline measurements

Usage

Regex intent recognition (P1):

use homecore_assist::{RegexIntentRecognizer, IntentName, IntentRecognizer};

#[tokio::main]
async fn main() {
    let mut recognizer = RegexIntentRecognizer::new();
    
    // Register patterns
    recognizer.register(IntentName::HassTurnOn, r"turn (?:on|up) (?:the )?(\w+)").unwrap();
    
    // Recognize utterance
    let intent = recognizer.recognize("turn on the kitchen light").await.unwrap();
    println!("Intent: {:?}", intent.intent_name);
    println!("Entities: {:?}", intent.entities);
}

Built-in handler (P1):

use homecore_assist::{HassTurnOn, Intent, IntentHandler, IntentName, IntentResponse};
use homecore::HomeCore;

#[tokio::main]
async fn main() {
    let homecore = HomeCore::new();
    let handler = HassTurnOn::new(homecore);
    
    let intent = Intent {
        intent_name: IntentName::HassTurnOn,
        entities: vec![("entity_id".to_string(), "light.kitchen".to_string())].into_iter().collect(),
        slots: Default::default(),
        ..Default::default()
    };
    
    let response = handler.handle(&intent).await.unwrap();
    println!("Response: {}", response.text.unwrap_or_default());
}

Full pipeline (P1):

use homecore_assist::AssistPipeline;
use homecore::HomeCore;

#[tokio::main]
async fn main() {
    let homecore = HomeCore::new();
    let pipeline = AssistPipeline::new(homecore);
    
    let response = pipeline.process("turn on the kitchen light").await.unwrap();
    println!("Assistant: {}", response.text.unwrap_or_default());
}

Relation to other HOMECORE crates

homecore-assist (intent pipeline + Ruflo bridge)
├─ homecore (state machine; handlers call services)
├─ homecore-api (exposes intent endpoints via REST/WS, P2)
├─ homecore-automation (complex intents can trigger automations)
├─ homecore-server (registers AssistPipeline at startup)
└─ ruflo (Ruflo agent subprocess for multi-step workflows, P2)

References