15 KiB
ADR-127: HOMECORE-CORE — Rust state machine, entity registry, event bus, service registry
| Field | Value |
|---|---|
| Status | Proposed |
| Date | 2026-05-25 |
| Deciders | ruv |
| Codename | HOMECORE-CORE |
| Relates to | ADR-126 (HOMECORE master), ADR-028 (witness chain), ADR-124 (RUVIEW-POLICY) |
| Tracking issue | TBD |
1. Context
homeassistant/core.py is the 3,200-line heart of Python Home Assistant. It defines five objects that every other HA component depends on:
HomeAssistant— the runtime coordinator, event loop holder, and service locator. Containsbus(EventBus),states(StateMachine),services(ServiceRegistry),config(Config),components(loaded component set).EventBus— publish/subscribe event dispatch.async_fire(event_type, event_data)dispatches to all registered listeners. Listener registration isasync_listen(event_type, callback). Wildcard listener isMATCH_ALL. Event data is a plain Python dict.StateMachine— an in-memory dictionary fromentity_id(str) toState.async_set(entity_id, new_state, attributes)writes and firesstate_changed.get(entity_id)reads.async_remove(entity_id)firesstate_removed. States are immutable snapshots withlast_changed,last_updated,context.ServiceRegistry— maps(domain, service_name)→ async handler function.async_call(domain, service, data)fires acall_serviceevent, waits for the registered handler.async_register(domain, service, handler, schema)registers a handler with optional voluptuous schema validation.EntityRegistry(homeassistant/helpers/entity_registry.py) — persists metadata (enabled/disabled, name override, area assignment, device ID, unique ID, entity category) across restarts. Stored in.storage/core.entity_registry. Loaded at startup; written on every change.
The DeviceRegistry (homeassistant/helpers/device_registry.py, stored in .storage/core.device_registry) tracks physical devices that entities belong to. Entities link to devices via device_id; devices link to config entries via config_entry_id.
1.1 Why these specific files matter
Python HA's core.py is a single-process Python 3.12 module that:
- Holds the asyncio event loop directly
- Serialises all state-changed writes through
asyncio.Lock - Fires event listeners in the same event loop iteration that fired the event (listeners cannot block)
- Is single-threaded by design — concurrent writes to the state machine are impossible without explicit async primitives
For HOMECORE the same semantic requirements apply, but the implementation must support:
- Concurrent reads from dozens of integration WASM sandboxes polling current state
- High-frequency writes from the RuView sensing stack (CSI at 100 Hz; state updates at up to 20 Hz per entity)
- Ordered delivery of state_changed events to automation triggers (ADR-129) and recorder (ADR-132) subscribers
- Zero-copy reads where possible for the REST API (ADR-130) path
2. Decision
Implement the homecore Rust crate at v2/crates/homecore/ with the following design.
2.1 State machine: DashMap + Tokio broadcast
The primary state store is a DashMap<EntityId, Arc<State>> where:
EntityIdis a validated newtype aroundString(validated format:domain.name)Stateis a frozen struct:entity_id,state(String),attributes(serde_json::Value),last_changed(DateTime),last_updated(DateTime),context(Context)Arc<State>allows zero-copy cloning for readers while the writer atomically replaces the map entry
State changes are published to a tokio::sync::broadcast::Sender<StateChangedEvent> channel (capacity: 4,096 events). Any number of receivers subscribe — the recorder, automation engine, WebSocket subscriber handler, and ruvector dual-write task all hold independent receivers. Slow receivers that fall behind by 4,096 events receive a RecvError::Lagged and must re-sync from the current state map.
2.2 Event bus: typed + untyped channels
HOMECORE distinguishes two event categories:
- System events (typed):
StateChanged,ServiceCall,ComponentLoaded,PlatformDiscovered,HomeAssistantStart,HomeAssistantStop. These use Tokio typed broadcast channels with zero allocation on the read path. - Integration events (untyped): integrations fire arbitrary event types (
event_type: String,event_data: serde_json::Value). These use a singlebroadcast::Sender<DomainEvent>whereDomainEventcarries the type string and data blob. This mirrors HA'sEventBus.async_fire().
2.3 Service registry: HashMap + mpsc dispatch
Services are registered as (Domain, ServiceName) → ServiceHandler where ServiceHandler is a Box<dyn Fn(ServiceCall) -> BoxFuture<ServiceResponse> + Send + Sync>. The registry lives in a tokio::sync::RwLock<HashMap<(Domain, ServiceName), ServiceHandler>>. Service calls go through the event bus (fire call_service) and are dispatched to the handler by an internal router task. This matches HA's indirection: hass.services.async_call(domain, service, data) does not call the handler directly; it fires an event.
2.4 Entity registry: persisted metadata sidecar
The entity registry is a RwLock<HashMap<EntityId, EntityEntry>> backed by an async JSON writer that flushes to .homecore/storage/core.entity_registry on every write. The schema matches HA's core.entity_registry schema (version 13 as of HA 2025.1) so ADR-134 migration can read both formats interchangeably.
EntityEntry fields mirrored from HA:
entity_id: EntityIdunique_id: Option<String>platform: Stringname: Option<String>(user override)disabled_by: Option<DisabledBy>(user, integration, config_entry)area_id: Option<AreaId>device_id: Option<DeviceId>entity_category: Option<EntityCategory>(config, diagnostic)config_entry_id: Option<ConfigEntryId>
2.5 Device registry: parallel sidecar
DeviceRegistry mirrors HA's core.device_registry schema (version 13). Devices are identified by a set of (id_type, id_value) tuples (the identifiers field), which matches HA's pattern of accepting multiple identifier types per device (MAC address, serial number, integration-specific ID).
3. HA-side reference table
| HA module / file | What it does | HOMECORE preserves | Changes | Drops |
|---|---|---|---|---|
homeassistant/core.py StateMachine |
In-memory state store, fire state_changed |
Same semantics: immutable snapshots, last_changed, last_updated, context |
DashMap instead of asyncio-locked dict; broadcast::Sender instead of asyncio callbacks |
Python asyncio coupling |
homeassistant/core.py EventBus |
Pub/sub event dispatch | MATCH_ALL listener; per-type listener; event data dict |
Typed system events + untyped domain events; no Python dict — use serde_json::Value |
@callback decorator, HassJob abstraction |
homeassistant/core.py ServiceRegistry |
Register/call services | Same (domain, service) key structure; schema validation |
Schema validation via serde Deserialize trait instead of voluptuous |
voluptuous, Python type coercions |
homeassistant/core.py HomeAssistant |
Runtime coordinator / service locator | State machine + event bus + services accessible on one struct | Struct with Arc<HomeCoreInner> for cheap cloning across tasks |
asyncio event loop holder, Python executor |
homeassistant/helpers/entity_registry.py |
Persist entity metadata | All fields listed in §2.4; file format compatible | Async tokio I/O; no Python pickle | Python-specific persistence helpers |
homeassistant/helpers/device_registry.py |
Persist device metadata | identifiers, connections, manufacturer, model, name, via_device_id |
Async tokio I/O | — |
homeassistant/helpers/entity.py |
Entity base class | entity_id, state, attributes, unique_id, device_info, async_write_ha_state semantics |
Trait HomeCoreEntity instead of class |
Python MRO, @property decorators |
homeassistant/helpers/event.py |
Convenience event helpers | async_track_state_change, async_track_time_interval (as Rust timer tasks) |
Rust closures / async tasks | Python asyncio task wrappers |
4. Public API parity table
| HA Python surface | HOMECORE Rust equivalent |
|---|---|
hass.states.get(entity_id) |
hass.states.get(&entity_id) -> Option<Arc<State>> |
hass.states.async_set(entity_id, state, attributes) |
hass.states.set(entity_id, state, attributes).await |
hass.states.async_remove(entity_id) |
hass.states.remove(&entity_id).await |
hass.states.async_all(domain_filter) |
hass.states.all(domain_filter) -> Vec<Arc<State>> |
hass.bus.async_fire(event_type, data) |
hass.bus.fire(event_type, data).await |
hass.bus.async_listen(event_type, callback) |
hass.bus.subscribe(event_type) -> broadcast::Receiver<DomainEvent> |
hass.services.async_call(domain, service, data) |
hass.services.call(domain, service, data).await -> ServiceResponse |
hass.services.async_register(domain, service, handler, schema) |
hass.services.register(domain, service, handler) |
hass.services.has_service(domain, service) |
hass.services.has(domain, service) -> bool |
entity_registry.async_get(entity_id) |
entity_registry.get(&entity_id) -> Option<&EntityEntry> |
entity_registry.async_update_entity(entity_id, **kwargs) |
entity_registry.update(entity_id, patch).await |
device_registry.async_get_device(identifiers) |
device_registry.get_by_identifiers(identifiers) -> Option<&DeviceEntry> |
Context(user_id, parent_id) |
Context { id: Uuid, parent_id: Option<Uuid>, user_id: Option<UserId> } |
5. Phased implementation plan
P1 — Skeleton (2 weeks)
- Create
v2/crates/homecore/workspace member withCargo.toml. - Define
State,EntityId,Domain,ServiceName,Context,DomainEventtypes. StateMachine:DashMap+ broadcast channel;set(),get(),remove(),all().EventBus: typed broadcast for system events + untyped broadcast for domain events.- Unit tests: 50 state writes/reads with concurrent readers; verify broadcast delivery.
P2 — Service registry + entity registry (2 weeks)
ServiceRegistry:RwLock<HashMap>+ mpsc dispatch task.EntityRegistry: in-memory + JSON async writer to.homecore/storage/core.entity_registry.DeviceRegistry: in-memory + JSON async writer to.homecore/storage/core.device_registry.- Serialization:
serdewith#[serde(rename_all = "snake_case")]; schema version 13 header written to match HA format. - Unit tests: register service, call service, verify handler invoked; persist and reload entity registry.
P3 — Trait surface for integrations (1 week)
HomeCoreEntitytrait:entity_id(),unique_id(),name(),device_info(),state(),attributes(),async_write_ha_state(&hass).Platformtrait:async_setup_entry(hass, config_entry) -> Result<()>.ConfigEntrystruct mirroring HA'sConfigEntryfields.- Integration test: a minimal test integration registers an entity, writes a state, reads it back from the state machine.
P4 — Performance validation (1 week)
- Benchmark: 1,000 state writes/s on Pi 5; measure latency at p50/p95/p99.
- Benchmark: 100 concurrent WS subscribers each receiving all state_changed events; measure delivery lag.
- Benchmark: broadcast channel saturation test at 4,096 capacity; verify
RecvError::Laggedhandling. - Acceptance criterion: p99 state write latency < 1 ms on Pi 5 (8 GB, 4 cores).
6. Risks
| Risk | Likelihood | Severity | Mitigation | Cross-ADR impact |
|---|---|---|---|---|
| Broadcast channel lag — a slow subscriber (e.g. ruvector recorder write) lags behind and drops events | Medium | High | Give recorder its own channel separate from WS subscribers; recorder is the hot path, give it highest priority | ADR-132: recorder write path must be designed to keep up with 100 Hz state writes |
| DashMap contention — shard count default (16) may be too low for 100 Hz writes on a single entity | Low | Medium | Increase DashMap shard count to 64; benchmark before ADR-130 integration | ADR-130: REST API reads state directly from DashMap — must be lock-free |
Entity registry format drift — HA updates .storage/core.entity_registry schema; HOMECORE falls behind |
Medium | Medium | Pin to schema version 13; version-check on load; fail loudly on unknown version | ADR-134: migration tool reads HA entity registry — must support the same schema version |
Context propagation — HA's Context is used for audit trails (which automation triggered which service call). HOMECORE must propagate it correctly or automation audits break |
High | Low | Derive Context from source event at every service call; thread through ServiceCall.context field |
ADR-129: automation engine must supply context when calling services |
7. Open questions
Q1: Should EntityId validation be strict (reject anything that doesn't match [a-z0-9_]+\.[a-z0-9_]+) or lenient (accept any UTF-8 string)? HA itself accepts unicode entity IDs since 2024.3. Strict validation simplifies routing; lenient matches HA's actual behaviour.
Q2: The broadcast::Sender capacity of 4,096 is chosen based on a worst-case of 100 state writes/s × 40 s of acceptable lag before a slow receiver is declared dead. Is 40 s the right threshold, or should it be configurable per receiver?
Q3: Should the HomeCoreEntity trait be object-safe (enabling Vec<Box<dyn HomeCoreEntity>>) or use associated types (enabling monomorphisation)? Object safety is required for the WASM plugin boundary (ADR-128); monomorphisation is faster for built-in integrations.
Q4: HA's State.context carries a user_id that traces which user or automation initiated a state change. HOMECORE uses UserId from the auth layer (ADR-130). Is the auth layer a dependency of the core state machine, or should user_id be an optional opaque string to avoid circular deps?
8. References
HA upstream
homeassistant/core.py—HomeAssistant,StateMachine(lines 1–800),EventBus(lines 800–1100),ServiceRegistry(lines 1100–1500),Config(lines 1500–2000)homeassistant/helpers/entity_registry.py—EntityRegistry,RegistryEntry(all ~1,900 lines); schema version constantSTORAGE_VERSIONhomeassistant/helpers/device_registry.py—DeviceRegistry,DeviceEntry; schema versionhomeassistant/helpers/entity.py—Entitybase class;async_write_ha_state; entity lifecycle hookshomeassistant/helpers/event.py—async_track_state_change,async_track_time_interval
This repo
v2/crates/wifi-densepose-sensing-server/src/main.rs— Axum + Tokio architecture pattern used throughout the existing server stackdocs/adr/ADR-126-ruview-native-ha-port-master.md— HOMECORE master; §5.5 crate naming; §6 compatibility contract; §5.1 RUVIEW-POLICYdocs/adr/ADR-028-esp32-capability-audit.md— witness chain pattern (Ed25519 per state transition)