wifi-densepose/vendor/midstream/AIMDS/docs/guides/ARCHITECTURE.md

7.3 KiB

AIMDS System Architecture

Overview

AIMDS is a production-ready AI Model Defense System designed to detect and mitigate threats against AI models including prompt injection, jailbreaks, and model manipulation attacks.

System Components

1. Detection Layer (Rust)

Location: crates/aimds-detection/

Responsibilities:

  • Real-time pattern matching using Aho-Corasick and Regex
  • Input sanitization and threat neutralization
  • Nanosecond-precision task scheduling

Key Modules:

  • pattern_matcher.rs: Multi-strategy threat detection
  • sanitizer.rs: Input cleaning and normalization
  • scheduler.rs: High-performance task scheduling using Midstream's nanosecond-scheduler

Performance Targets:

  • Pattern matching: <10ms p99
  • Sanitization: <5ms p99
  • Scheduling overhead: <1ms p99

2. Analysis Layer (Rust)

Location: crates/aimds-analysis/

Responsibilities:

  • Behavioral analysis using temporal attractors
  • Policy verification with LTL checking
  • Strange-loop pattern detection

Key Modules:

  • behavioral.rs: Temporal attractor-based anomaly detection
  • policy_verifier.rs: LTL-based policy enforcement
  • ltl_checker.rs: Linear Temporal Logic verification

Performance Targets:

  • Behavioral analysis: <100ms p99
  • Policy verification: <500ms p99
  • LTL checking: <200ms p99

3. Response Layer (Rust)

Location: crates/aimds-response/

Responsibilities:

  • Meta-learning from attack patterns
  • Adaptive mitigation strategy generation
  • Automated threat response

Key Modules:

  • meta_learning.rs: Strange-loop powered adaptive learning
  • adaptive.rs: Dynamic response strategy adjustment
  • mitigations.rs: Threat neutralization actions

Performance Targets:

  • Response generation: <50ms p99
  • Mitigation application: <30ms p99
  • Learning update: <100ms p99

4. API Gateway (TypeScript)

Location: src/

Responsibilities:

  • HTTP/REST API exposure
  • AgentDB vector search integration
  • Lean theorem proving integration
  • Metrics and telemetry

Key Modules:

  • gateway/server.ts: Express server and routing
  • agentdb/client.ts: Vector database integration (150x faster)
  • lean-agentic/verifier.ts: Formal verification
  • monitoring/metrics.ts: Prometheus metrics

Performance Targets:

  • API response: <200ms p99
  • Vector search: <5ms p99
  • Theorem proving: <1s p99

Data Flow

1. Request arrives at TypeScript Gateway
   ↓
2. Input validation and rate limiting
   ↓
3. Detection Layer (Rust)
   - Pattern matching
   - Sanitization
   - Scheduling
   ↓
4. Analysis Layer (Rust)
   - Behavioral analysis
   - Policy verification
   - LTL checking
   ↓
5. Response Layer (Rust)
   - Meta-learning
   - Strategy generation
   - Mitigation application
   ↓
6. Response returned via Gateway

Integration Points

Midstream Platform

  • temporal-compare: High-performance temporal comparison
  • nanosecond-scheduler: Sub-microsecond task scheduling
  • temporal-attractor-studio: Behavioral pattern analysis
  • temporal-neural-solver: Neural network-based threat solving
  • strange-loop: Self-referential pattern detection

External Services

  • AgentDB: 150x faster vector database for pattern caching
  • Lean-Agentic: Formal verification and theorem proving
  • Redis: Caching and rate limiting
  • Prometheus: Metrics collection
  • Grafana: Visualization

Deployment Architecture

Docker Compose (Development)

┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Gateway   │───▶│   Backend   │───▶│   AgentDB   │
│  (Node.js)  │    │   (Rust)    │    │  (Vector)   │
└─────────────┘    └─────────────┘    └─────────────┘
       │                   │                   │
       └───────────────────┴───────────────────┘
                           │
                    ┌──────▼──────┐
                    │    Redis    │
                    └─────────────┘

Kubernetes (Production)

┌───────────────────────────────────────────┐
│          Load Balancer (80/443)           │
└────────────────┬──────────────────────────┘
                 │
    ┌────────────┴────────────┐
    │                         │
┌───▼────┐              ┌────▼────┐
│Gateway │ (Replicas=3) │Backend  │ (Replicas=3)
│  Pod   │              │   Pod   │
└───┬────┘              └────┬────┘
    │                        │
    └────────┬───────────────┘
             │
    ┌────────▼─────────┐
    │   Services:      │
    │   - Redis        │
    │   - AgentDB      │
    │   - Prometheus   │
    └──────────────────┘

Security Considerations

Input Validation

  • All inputs sanitized before processing
  • Pattern matching on multiple layers
  • Rate limiting per user/IP

Authentication

  • API key authentication
  • Role-based access control (RBAC)
  • Session management

Data Protection

  • Encryption at rest (Redis)
  • Encryption in transit (TLS)
  • Secure secret management (Kubernetes Secrets)

Threat Mitigation

  • Multiple detection strategies
  • Adaptive learning from attacks
  • Automated response workflows
  • Human-in-the-loop for critical decisions

Scalability

Horizontal Scaling

  • Stateless gateway (scales with load)
  • Stateless backend (scales with CPU)
  • Distributed caching (Redis Cluster)
  • Vector search sharding (AgentDB)

Performance Optimization

  • Request batching
  • Connection pooling
  • Cache-first architecture
  • Async/await throughout

Resource Management

  • CPU: 500m-2000m per gateway pod
  • Memory: 512Mi-2Gi per gateway pod
  • CPU: 1000m-4000m per backend pod
  • Memory: 1Gi-4Gi per backend pod

Monitoring & Observability

Metrics (Prometheus)

  • Request rate, latency, errors
  • Detection accuracy and false positives
  • Analysis performance
  • Resource utilization

Tracing (OpenTelemetry)

  • End-to-end request tracing
  • Distributed context propagation
  • Performance bottleneck identification

Logging (Winston/Tracing)

  • Structured JSON logs
  • Log aggregation (ELK/Loki)
  • Alert triggers

Future Enhancements

  1. Multi-model support: Extend beyond Claude to other LLMs
  2. Advanced learning: Reinforcement learning for response strategies
  3. Federated detection: Share threat intelligence across deployments
  4. GPU acceleration: CUDA support for neural analysis
  5. Edge deployment: Lightweight version for edge computing

References