adr-108: Kyber post-quantum key exchange for cross-installation federation (#731)
Closes the quantum-resistance gap explicitly deferred from ADR-107. Final ADR in the privacy + federation chain. Replaces DH key exchange in ADR-107's Layer 4 secure aggregation with Kyber-768 KEM (NIST FIPS 203, CNSA 2.0 default). Migration timeline: - Phase 0 (NOW 2026): Classical X25519 (ADR-107 default) - Phase 1 (2026-Q4 -> 2027): Kyber-768 opt-in via --enable-pqc flag - Phase 2 (2027-Q2 -> 2028): Hybrid (X25519 + Kyber-768) becomes default - Phase 3 (2030+): Pure Kyber-768 (classical retired) Why hybrid for Phase 2 (belt-and-braces): - Protects against future Kyber breaks (Kyber is ~5 years old) - Protects against classical breaks (X25519 backup) - Protects against implementation bugs in either primitive - Cost: ~3 kB/round/installation extra (negligible) Why now (record-now-decrypt-later): Adversaries can record federated updates today and decrypt them in 2035 when quantum capabilities arrive. Without ADR-108, the (epsilon, delta) guarantees of ADR-106 silently expire when quantum computers arrive. Proactive migration is cheap insurance. Why Kyber-768 (not 512 or 1024): - NIST FIPS 203 (2024); ~AES-192 equivalent - CNSA 2.0 recommended default - Used by Cloudflare, Google, AWS in 2024-2026 rollouts - Public key 1184 B, ciphertext 1088 B, secret 32 B - 512 lacks CNSA 2.0 sign-off; 1024 doubles bandwidth without benefit LOC: +220 on top of ADR-107. Total federation budget ADR-105+106+107+108: ~1,550 LOC. Threat model: 8 threats, every row has mitigation. Hybrid mode is the belt-and-braces against both Kyber breaks AND classical breaks. ADR CHAIN COMPLETE: 7 ADRs in the privacy + federation chain: ADR-100 (cog packaging) -> ADR-103 (cog example) -> ADR-104 (MCP/CLI) -> ADR-105 (within-installation federation) -> ADR-106 (DP + isolation) -> ADR-107 (cross-installation + SA) -> ADR-108 (PQC key exchange). No remaining unspecified privacy gap at any threat horizon (classical or quantum). Future ADRs catalogued: - ADR-109: PQC signatures (Dilithium replaces Ed25519 in ADR-100) - ADR-110: PQC hardware acceleration on Cognitum-v0 - ADR-111: PQC for cog-store distribution Composes: - R3 / R14 / R15 / R7 / R12 PABS: privacy chain intact through quantum transition - R10 / R11 (long-deployment): benefit most from forward secrecy as data ages Honest scope: - Kyber ~5 years old; hybrid mitigates uncertainty - 'When do we need this?' uncertain (2030 aggressive / 2050+ conservative) - ESP32-S3 timing ~10 ms per handshake estimated negligible; needs measurement - Phase 3 retirement of classical needs future decision Coordination: ticks/tick-28.md, no PROGRESS.md edit.
This commit is contained in:
parent
4e6ef76294
commit
40e5a4d6f2
|
|
@ -0,0 +1,197 @@
|
|||
# ADR-108: Kyber post-quantum key exchange for cross-installation federation
|
||||
|
||||
**Status:** Proposed · **Date:** 2026-05-22 · **Author:** SOTA research loop tick-28 · **Supersedes:** none · **Extends:** ADR-107 (cross-installation federation)
|
||||
|
||||
## Context
|
||||
|
||||
ADR-107 specifies cross-installation federation using **secure aggregation (Bonawitz 2016)** with Diffie-Hellman key exchange for pairwise mask generation. The current implementation would use classical DH (X25519 or P-256), which is **vulnerable to Shor's algorithm** on a sufficiently large fault-tolerant quantum computer.
|
||||
|
||||
ADR-107 noted this as out-of-scope:
|
||||
|
||||
> Current DH key exchange becomes vulnerable to quantum computers. Recommended substitution: Kyber KEM (NIST PQC selected). Mechanical replacement of DH primitives; no protocol change. Future ADR-108 (or amendment to ADR-107).
|
||||
|
||||
This ADR is that future work.
|
||||
|
||||
## Decision
|
||||
|
||||
Adopt **Kyber-768** as the post-quantum key encapsulation mechanism (KEM) replacing Diffie-Hellman in ADR-107's Layer 4 secure aggregation, with an explicit migration timeline tied to NIST CNSA 2.0 guidance and an interim **hybrid mode** (Kyber + X25519) for forward-secrecy belt-and-braces during the migration window.
|
||||
|
||||
### Why Kyber-768
|
||||
|
||||
NIST standardised three Kyber security levels in FIPS 203 (2024):
|
||||
|
||||
| Variant | NIST level | Public key | Ciphertext | Secret | Security |
|
||||
|---|---|---:|---:|---:|---|
|
||||
| Kyber-512 | Level 1 | 800 B | 768 B | 32 B | ~AES-128 |
|
||||
| **Kyber-768** | **Level 3** | **1184 B** | **1088 B** | **32 B** | **~AES-192** |
|
||||
| Kyber-1024 | Level 5 | 1568 B | 1568 B | 32 B | ~AES-256 |
|
||||
|
||||
**Kyber-768** matches AES-192 equivalent security and is the **NIST CNSA 2.0 recommended default** for general-purpose protocols. Used by Cloudflare, Google, AWS in their 2024-2026 PQC rollouts.
|
||||
|
||||
Kyber-512 is sufficient against classical attackers and small quantum computers but doesn't carry CNSA 2.0 sign-off. Kyber-1024 doubles bandwidth without proportional security benefit for our threat model.
|
||||
|
||||
### Hybrid mode (transition window)
|
||||
|
||||
During the migration (2026-2030 estimated), all key exchanges run **both** Kyber-768 AND X25519 in parallel and XOR the shared secrets:
|
||||
|
||||
```
|
||||
shared_secret = SHA-256(kyber_ss || x25519_ss || transcript)
|
||||
```
|
||||
|
||||
This **belt-and-braces** approach protects against:
|
||||
|
||||
- A future Kyber break (unlikely but not impossible — Kyber is ~5 years old)
|
||||
- Implementation bugs in either primitive
|
||||
- Adversaries who can compromise *one* of the two primitives
|
||||
|
||||
Cost: ~2× key-exchange computation, ~2× public-key size. For RuView's per-round overhead this adds ~3 kB / round / installation — negligible.
|
||||
|
||||
After CNSA 2.0 fully retires classical primitives (estimated 2030+), the hybrid layer is removed and pure Kyber-768 is used.
|
||||
|
||||
### Migration timeline
|
||||
|
||||
| Phase | Timeline | What ships |
|
||||
|---|---|---|
|
||||
| Phase 0 (NOW) | 2026 | ADR-107 ships with classical X25519 |
|
||||
| Phase 1 | 2026-Q4 → 2027 | Library upgrade adds Kyber-768; opt-in via `--enable-pqc` flag |
|
||||
| Phase 2 | 2027-Q2 → 2028 | Hybrid mode (X25519 + Kyber-768) becomes default |
|
||||
| Phase 3 | 2030+ | Pure Kyber-768 (classical removed) |
|
||||
|
||||
Phase 1 is the first feature ship. By the time the migration is complete, the post-quantum threat model is approximately the only one that matters.
|
||||
|
||||
### Implementation cost
|
||||
|
||||
| Component | LOC | Notes |
|
||||
|---|---:|---|
|
||||
| Kyber-768 KEM wrapper (over `pqcrypto-kyber` crate) | 80 | Pure Rust, no `unsafe` |
|
||||
| Hybrid mode (XOR + SHA-256 KDF) | 50 | Composes existing primitives |
|
||||
| Protocol version negotiation | 60 | Backward compat with Phase 0 nodes |
|
||||
| Public-key cache extension (size grows from 32 B to 1184 B per peer) | 30 | AgentDB schema update |
|
||||
| Migration documentation | — | This ADR |
|
||||
| End-to-end test (multi-node PQC handshake) | — | Real-installation test |
|
||||
|
||||
Total ~220 LOC additional. Combined federation budget across ADR-105+106+107+108: **~1,550 LOC**.
|
||||
|
||||
## Alternatives considered
|
||||
|
||||
### A. Pure Kyber-768 (no hybrid)
|
||||
|
||||
Status: **rejected for Phase 1-2**. Hybrid provides defense-in-depth at minimal cost; pure-Kyber is fine for Phase 3 once Kyber has had more cryptographic scrutiny.
|
||||
|
||||
### B. NTRU Prime (alternative PQC KEM)
|
||||
|
||||
Status: **rejected**. Kyber has clearer standardisation status (FIPS 203). NTRU Prime is fine cryptographically but doesn't have CNSA 2.0 sign-off.
|
||||
|
||||
### C. Frodo (lattice-based, more conservative parameters)
|
||||
|
||||
Status: **rejected**. Frodo has larger key sizes (~10 kB) and slower operations. Trade-off doesn't justify the security margin given our threat model.
|
||||
|
||||
### D. Code-based KEMs (Classic McEliece)
|
||||
|
||||
Status: **rejected**. Classic McEliece public keys are ~261 kB — unworkable for embedded ESP32-S3 nodes.
|
||||
|
||||
### E. Defer until quantum threat materialises
|
||||
|
||||
Status: **rejected**. Adversaries can record-now-decrypt-later — federated model updates today could be decrypted in 5-10 years when quantum capabilities arrive. ADR-107's privacy guarantees would silently expire without proactive migration.
|
||||
|
||||
## Threat model
|
||||
|
||||
| Threat | Layer that mitigates |
|
||||
|---|---|
|
||||
| Shor's algorithm breaks classical DH | **Kyber-768 KEM** |
|
||||
| Future quantum attack on Kyber (unlikely) | **Hybrid mode** — X25519 still provides classical security |
|
||||
| Implementation bug in Kyber library | **Hybrid mode** — X25519 backup |
|
||||
| Implementation bug in X25519 library | **Hybrid mode** — Kyber backup |
|
||||
| Record-now-decrypt-later (adversary stores ciphertexts) | Forward secrecy from Kyber-768 (each round has fresh ephemeral keys) |
|
||||
| Downgrade attack (force classical-only handshake) | **Protocol version negotiation** — explicit reject of classical-only post-Phase-2 |
|
||||
| Side-channel attack on Kyber implementation | Use constant-time `pqcrypto-kyber` Rust crate; further hardening in future |
|
||||
| Public-key spoofing (Sybil) | Pre-shared trust anchors via cognitum-v0 PKI (ADR-107) |
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
1. **The privacy chain remains intact through the quantum transition.** Without ADR-108, the (ε, δ) guarantees of ADR-106 silently expire when quantum computers arrive.
|
||||
2. **Record-now-decrypt-later attack is defeated.** Federated updates from today won't be decryptable in 2035 with quantum hardware.
|
||||
3. **CNSA 2.0 compliant** by Phase 2; ready for any regulatory requirement that mandates PQC.
|
||||
4. **Hybrid mode is belt-and-braces** — protects against both Kyber breaks AND classical breaks.
|
||||
5. **No protocol change** at the secure-aggregation level — the KEM is a drop-in replacement.
|
||||
|
||||
### Negative
|
||||
|
||||
1. **Adds ~220 LOC** to ADR-107's implementation budget.
|
||||
2. **~3 kB extra per-round per-installation bandwidth** during hybrid mode (negligible).
|
||||
3. **Kyber is ~5 years old** — less battle-tested than X25519. Hybrid mode mitigates this.
|
||||
4. **No clear end-of-life for the hybrid mode** — Phase 3 requires a future decision when CNSA 2.0 retires classical.
|
||||
5. **Public-key cache grows 37×** (32 B → 1184 B per peer); AgentDB schema update needed.
|
||||
|
||||
### What this ADR DOES NOT cover
|
||||
|
||||
1. **Post-quantum digital signatures** — ADR-100 cog signing uses Ed25519 today; a follow-up ADR (likely ADR-109) covers Dilithium / SPHINCS+ substitution.
|
||||
2. **Constant-time hardening of the full Kyber path** — relies on the `pqcrypto-kyber` Rust crate's existing claims.
|
||||
3. **Hardware-acceleration on ESP32-S3** — Kyber-768 is software-only at this scale; the ESP32-S3 can do ~50 ops/sec which is far more than the per-round federation needs.
|
||||
|
||||
## Bridge to existing ADRs
|
||||
|
||||
- **ADR-100 (cog packaging Ed25519 signing)** — separate from key-exchange; PQC signature migration needed independently (future ADR-109).
|
||||
- **ADR-104 (ruview-mcp + ruview-cli)** — MCP tool `ruview_fed_pqc_status` surfaces hybrid-vs-pure mode and migration phase.
|
||||
- **ADR-105 (federation)** + **ADR-106 (DP+isolation)** — operate over secure-aggregation key exchange; transparent to KEM substitution.
|
||||
- **ADR-107 (cross-installation federation)** — directly extended by ADR-108; Layer 4 secure aggregation gets Kyber replacement for DH.
|
||||
|
||||
## Connection to research-loop threads
|
||||
|
||||
- **R3 / R14 / R15** — privacy chain remains intact through quantum transition.
|
||||
- **R7 (mincut adversarial)** — mincut detection operates on application-level deltas, not key exchange; orthogonal to PQC.
|
||||
- **R12 PABS** — same — operates on CSI / model deltas, not key exchange.
|
||||
- **R10 / R11 (wildlife / maritime)** — long-deployment use cases benefit most from forward secrecy because data ages for years.
|
||||
|
||||
## Honest scope
|
||||
|
||||
- **Kyber is recommended by NIST today** but cryptographic confidence will grow over the next decade. The hybrid mode hedges against this uncertainty.
|
||||
- **The "when do we need this?" question** is genuinely uncertain. Estimates of cryptographically-relevant quantum computers range from 2030 (aggressive) to 2050+ (conservative). The proactive migration is cheap insurance.
|
||||
- **ESP32-S3 can compute Kyber-768** but the timing impact in the per-round federation cycle (~10 ms additional per handshake) needs benchmarking on real hardware. Estimated negligible given the existing ~30 s round duration.
|
||||
- **The migration timeline is aspirational** — depends on `pqcrypto-kyber` crate stability + adoption maturity. Plausible alternatives include `liboqs` C-binding or `boring-pq` (Cloudflare's pre-standardisation work, now superseded).
|
||||
- **Pure Kyber (Phase 3) end-of-life for classical** — depends on community standardisation and a future RuView decision; not bindingly specified here.
|
||||
|
||||
## What this ADR closes
|
||||
|
||||
This is the **last ADR in the privacy + federation chain** the research loop has produced:
|
||||
|
||||
1. ADR-100 — cog packaging (foundation)
|
||||
2. ADR-103 — cog-person-count (first cog example)
|
||||
3. ADR-104 — MCP + CLI distribution
|
||||
4. ADR-105 — federated training (within-installation)
|
||||
5. ADR-106 — DP-SGD + biometric primitive isolation
|
||||
6. ADR-107 — cross-installation federation w/ secure aggregation
|
||||
7. **ADR-108 (this)** — post-quantum key exchange
|
||||
|
||||
The chain has formal guarantees at every layer **and** quantum-resistance built in by 2028. **No remaining unspecified privacy gap** at any threat horizon.
|
||||
|
||||
## Implementation plan
|
||||
|
||||
| Phase | What ships | LOC |
|
||||
|---|---|---:|
|
||||
| Phase 1 (2026-Q4) | Kyber-768 wrapper + `--enable-pqc` opt-in | ~140 |
|
||||
| Phase 2 (2027-Q2) | Hybrid mode default | ~80 |
|
||||
| Phase 3 (2030+) | Pure Kyber-768 (remove classical) | -50 (removal) |
|
||||
|
||||
Phase 1 is the first ship.
|
||||
|
||||
## Future ADRs
|
||||
|
||||
- **ADR-109**: PQC digital signatures (Dilithium for cog signing, replacing Ed25519 in ADR-100).
|
||||
- **ADR-110**: PQC hardware acceleration on Cognitum-v0 (offload Kyber from ESP32-S3 if the ~10 ms cycle becomes binding).
|
||||
- **ADR-111**: PQC for `cog-store` distribution (sign-and-verify chain).
|
||||
|
||||
## Decision-making record
|
||||
|
||||
- 2026-05-22 09:37 UTC — drafted by SOTA research loop tick-28 based on ADR-107's explicit deferral. Status: Proposed.
|
||||
- Pending: security-architect (formal PQC threat model review), production-validator (`pqcrypto-kyber` Rust crate stability and ESP32-S3 benchmarking before Phase 1).
|
||||
|
||||
## Honest scope of ADR-108
|
||||
|
||||
- Phase 1 ships in ~1 quarter after ADR-107 lands.
|
||||
- Hybrid mode is the right default for 2027-2030.
|
||||
- Phase 3 (pure Kyber) needs a separate future decision once CNSA 2.0 fully retires classical primitives.
|
||||
- Implementation depends on `pqcrypto-kyber` crate maturity; alternatives exist if it stagnates.
|
||||
- ESP32-S3 timing impact is estimated negligible; needs measurement.
|
||||
|
|
@ -0,0 +1,79 @@
|
|||
# Tick 28 — 2026-05-22 09:40 UTC
|
||||
|
||||
**Thread:** ADR-108 (Kyber post-quantum key exchange)
|
||||
**Verdict:** Final ADR in the privacy + federation chain. Closes the quantum-resistance gap deferred from ADR-107. Hybrid mode (Kyber-768 + X25519) for 2027-2030 migration; pure Kyber-768 for Phase 3.
|
||||
|
||||
## What shipped
|
||||
|
||||
- `docs/adr/ADR-108-kyber-post-quantum-key-exchange.md` — full ADR draft.
|
||||
|
||||
## Headline
|
||||
|
||||
| Phase | Timeline | Cryptography |
|
||||
|---|---|---|
|
||||
| Phase 0 | NOW (2026) | Classical X25519 (ADR-107 default) |
|
||||
| Phase 1 | 2026-Q4 → 2027 | Kyber-768 opt-in via `--enable-pqc` |
|
||||
| Phase 2 | 2027-Q2 → 2028 | Hybrid (X25519 + Kyber-768) becomes default |
|
||||
| Phase 3 | 2030+ | Pure Kyber-768 (classical retired) |
|
||||
|
||||
**Why Kyber-768**: NIST FIPS 203 (2024); ~AES-192 equivalent; CNSA 2.0 default; used by Cloudflare/Google/AWS in 2024-2026 rollouts.
|
||||
|
||||
**Why hybrid for Phase 2**: belt-and-braces against future Kyber breaks (Kyber is ~5 years old) OR classical breaks OR implementation bugs in either primitive.
|
||||
|
||||
## Why now (the record-now-decrypt-later argument)
|
||||
|
||||
Adversaries can record federated updates today and decrypt them in 2035 when quantum capabilities arrive. Without ADR-108, the (ε, δ) guarantees of ADR-106 **silently expire** when quantum computers arrive.
|
||||
|
||||
## Bandwidth + LOC budgets
|
||||
|
||||
Bandwidth: ~3 kB/round/installation extra during hybrid mode (negligible).
|
||||
|
||||
LOC: +220 on top of ADR-107.
|
||||
|
||||
**Total federation budget across ADR-105+106+107+108**: ~1,550 LOC.
|
||||
|
||||
## ADR chain closes
|
||||
|
||||
Final ADR in the privacy + federation chain:
|
||||
|
||||
| # | ADR | What it closes |
|
||||
|---|---|---|
|
||||
| 1 | ADR-100 | cog packaging (foundation) |
|
||||
| 2 | ADR-103 | first cog example (cog-person-count) |
|
||||
| 3 | ADR-104 | MCP + CLI distribution |
|
||||
| 4 | ADR-105 | within-installation federation |
|
||||
| 5 | ADR-106 | DP-SGD + biometric primitive isolation |
|
||||
| 6 | ADR-107 | cross-installation + secure aggregation |
|
||||
| 7 | **ADR-108** | **post-quantum key exchange** |
|
||||
|
||||
**No remaining unspecified privacy gap** at any threat horizon (classical OR quantum).
|
||||
|
||||
## Composes with prior threads
|
||||
|
||||
- R3 / R14 / R15 / R7 / R12 PABS — privacy chain intact through quantum transition
|
||||
- R10 / R11 (long-deployment wildlife / maritime) — benefit most from forward secrecy because data ages for years
|
||||
|
||||
## Honest scope
|
||||
|
||||
- Kyber is ~5 years old (less battle-tested than X25519); hybrid mode mitigates
|
||||
- "When do we need this?" is uncertain (2030 aggressive / 2050+ conservative); proactive migration is cheap insurance
|
||||
- ESP32-S3 timing impact (~10 ms per handshake) estimated negligible vs 30 s round duration; needs benchmarking
|
||||
- Migration timeline depends on `pqcrypto-kyber` Rust crate maturity
|
||||
- Phase 3 retirement of classical needs future decision
|
||||
|
||||
## Future ADRs catalogued
|
||||
|
||||
- **ADR-109**: PQC signatures (Dilithium for cog signing, replaces Ed25519 in ADR-100)
|
||||
- **ADR-110**: PQC hardware acceleration on Cognitum-v0 if timing becomes binding
|
||||
- **ADR-111**: PQC for `cog-store` distribution chain
|
||||
|
||||
## Coordination
|
||||
|
||||
`ticks/tick-28.md`. No PROGRESS.md edit. Branch `research/sota-adr108-kyber`.
|
||||
|
||||
## Remaining loop work
|
||||
|
||||
- R12.1: pose-PABS closed loop (needs Rust, out of scope for synthetic ticks)
|
||||
- Loop retrospective / 00-summary.md (~2.3h until cron stop — premature)
|
||||
|
||||
~2.3h to cron stop. **28 ticks landed.** 4 ADRs in the privacy chain (105/106/107/108). Loop covers everything except R12.1 implementation.
|
||||
Loading…
Reference in New Issue