manni07
895b759756
Merge 3575766982 into 4a6f3e40a9
2026-03-04 09:11:56 +01:00
maderix
4c14ed0e25
CLI fixes + --no-ane-extras flag + README benchmark table
- Fix positional arg parsing (model_path, steps, lr were silently ignored)
- Add --model, --ckpt flags; forward ckpt_path across exec() restarts
- Add --no-ane-extras to disable ANE classifier/softmax/rmsnorm_bwd
- CPU fallback for softmax/classifier/rmsnorm_bwd when extras disabled
- Update README with 4-way benchmark comparison table (20 steps)
2026-03-03 04:34:55 -08:00
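The positional-argument fix described in this commit can be sketched roughly as follows. This is a minimal illustration and not the project's actual parser: `DemoOpts`, `demo_parse_args`, and the default values are hypothetical. Only the flag names (`--model`, `--ckpt`, `--no-ane-extras`) and the positional order (model_path, steps, lr) come from the commit message.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical option struct; field names follow the commit message. */
typedef struct {
    const char *model_path;
    const char *ckpt_path;
    int steps;
    float lr;
    int ane_extras;   /* 1 = ANE extras enabled, 0 = CPU fallback */
} DemoOpts;

/* Parses flags and up to three positionals (model_path, steps, lr).
 * Returns 0 on success, -1 if a flag lacks its value or there are
 * too many positionals. Defaults here are assumptions. */
static int demo_parse_args(int argc, char **argv, DemoOpts *o) {
    int pos = 0;
    o->model_path = NULL;
    o->ckpt_path = NULL;
    o->steps = 20;
    o->lr = 1e-4f;
    o->ane_extras = 1;
    for (int i = 1; i < argc; i++) {
        const char *a = argv[i];
        if (strcmp(a, "--no-ane-extras") == 0) {
            o->ane_extras = 0;
        } else if (strcmp(a, "--model") == 0) {
            if (i + 1 >= argc) return -1;
            o->model_path = argv[++i];
        } else if (strcmp(a, "--ckpt") == 0) {
            if (i + 1 >= argc) return -1;
            o->ckpt_path = argv[++i];
        } else if (pos == 0) {
            o->model_path = a; pos++;
        } else if (pos == 1) {
            o->steps = atoi(a); pos++;
        } else if (pos == 2) {
            o->lr = strtof(a, NULL); pos++;
        } else {
            return -1;
        }
    }
    return 0;
}
```

Keeping the parsed `ckpt_path` in one struct makes it straightforward to forward the same value when the process re-executes itself.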
manni07
7c67e78306
fix: address MED security findings (MED-01 to MED-06)
- MED-01: IOSurfaceLock() return checked in all 6 I/O functions; early return
on failure prevents data race (stories_io.h, ane_runtime.h)
- MED-02: Per-process/per-call unique temp dirs via getpid()+g_compile_seq
(stories_io.h, ane_runtime.h)
- MED-03: mil_dims_valid() guard in all 7 MIL-gen functions; nil return on
invalid params (ane_mil_gen.h)
- MED-04: CkptHdr.pad[0]=0x01020304 byte-order sentinel; runtime check in
load_checkpoint; _Static_assert for compile-time LE guarantee (train_large.m)
- MED-05: _Static_assert(SEQ%8==0) + ARM64 alignment rationale comment (stories_io.h)
- MED-06: dispatch_once replaces manual g_ane_loaded/g_ane_init_done guards;
thread-safe one-time ANE init (ane_runtime.h, stories_config.h)
ref: docs/reports/security-audit-2026-03-02.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 22:45:19 +01:00
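The MED-04 byte-order sentinel can be illustrated with a small sketch. The real `CkptHdr` layout is not shown in this log, so `DemoCkptHdr` and the helper names below are hypothetical. Only the sentinel value `0x01020304` stored in `pad[0]` and the use of `_Static_assert` come from the commit message. The `__BYTE_ORDER__` macros are GCC/Clang predefined macros and are assumed to be available.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_BYTE_ORDER_SENTINEL 0x01020304u

/* Hypothetical checkpoint header; pad[0] carries the sentinel. */
typedef struct {
    uint32_t version;
    uint32_t step;
    uint32_t pad[2];
} DemoCkptHdr;

_Static_assert(sizeof(DemoCkptHdr) == 16, "header must have no implicit padding");

/* Compile-time little-endian guarantee where the compiler exposes it. */
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
_Static_assert(__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__, "little-endian host required");
#endif

/* Fills a header and stamps the sentinel in host byte order. */
static void demo_hdr_init(DemoCkptHdr *h, uint32_t step) {
    memset(h, 0, sizeof *h);
    h->version = 1;
    h->step = step;
    h->pad[0] = DEMO_BYTE_ORDER_SENTINEL;
}

/* Returns 1 if the sentinel reads back in host byte order, else 0. */
static int demo_hdr_byte_order_ok(const DemoCkptHdr *h) {
    return h->pad[0] == DEMO_BYTE_ORDER_SENTINEL;
}

/* Byte swap, used only to simulate a header from an opposite-endian host. */
static uint32_t demo_bswap32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}
```

A loader would call the check right after reading the header and refuse the file on mismatch, since a swapped sentinel means every other multi-byte field is swapped too.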
manni07
aa5a6ddd86
fix: address CRIT security findings (CRIT-01 to CRIT-04)
- CRIT-01: dlopen() return check + NSClassFromString validation in ane_init()
(ane_runtime.h + stories_config.h); g_ane_ok / g_ane_ok_large flag
only set when all private classes load successfully; stories_config.h
gets re-entry guard (g_ane_init_done) that was previously missing
- CRIT-02: g_ane_ok guard in ane_compile() and compile_kern_mil_w(); NULL check
for inMemoryModel after inMemoryModelWithDescriptor: — prevents crash
when API call returns nil (ane_runtime.h, stories_io.h)
- CRIT-03: Validate fread() return for critical config/header reads to prevent
garbage malloc() sizes; fopen() NULL check in save_checkpoint();
design decision documented (model.h, train_large.m)
- CRIT-04: int -> size_t in build_blob*/build_blob_t/build_blob_fp16; calloc()
NULL checks added; (size_t) cast in malloc() size calculations to
prevent signed integer overflow UB (stories_io.h, model.h)
Simulation: 3 iterations, overall score 96.15% (all criteria >= 95%)
ref: docs/reports/security-audit-2026-03-02.md
2026-03-02 22:38:12 +01:00
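The CRIT-03 and CRIT-04 pattern (validate `fread()` before trusting sizes, then compute allocation sizes in `size_t` with an overflow check) might look roughly like this. It is a sketch under assumptions: `DemoConfig` and the `demo_*` helpers are hypothetical and do not reproduce the code in model.h or stories_io.h.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical config header, for illustration only. */
typedef struct {
    int32_t dim;
    int32_t n_layers;
} DemoConfig;

/* Returns 0 on success, -1 if the header is not read in full or
 * contains non-positive values that would yield garbage sizes. */
static int demo_read_config(FILE *f, DemoConfig *cfg) {
    if (f == NULL || cfg == NULL) return -1;
    if (fread(cfg, sizeof *cfg, 1, f) != 1) return -1;
    if (cfg->dim <= 0 || cfg->n_layers <= 0) return -1;
    return 0;
}

/* Allocates count * elem_size zeroed bytes.
 * Returns NULL on multiplication overflow or allocation failure. */
static void *demo_alloc_array(size_t count, size_t elem_size) {
    if (elem_size != 0 && count > SIZE_MAX / elem_size) return NULL;
    return calloc(count, elem_size);
}

/* Allocates a dim x dim float matrix; sizes are widened to size_t
 * before multiplying so no signed int arithmetic can overflow. */
static float *demo_alloc_weights(const DemoConfig *cfg) {
    if (cfg == NULL || cfg->dim <= 0) return NULL;
    size_t rows = (size_t)cfg->dim;
    size_t cols = (size_t)cfg->dim;
    if (rows > SIZE_MAX / cols) return NULL;
    return demo_alloc_array(rows * cols, sizeof(float));
}
```

Callers check both return values before use, so a truncated file or an oversized request fails early instead of reaching `malloc()` with a bogus size.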
m0at
40d3f45631
Add ANE probe tests and training telemetry for M5 optimization
Four standalone probe tests to characterize the M5 ANE:
- test_weight_reload: Can weights be hot-swapped via unload+load without recompilation?
- test_perf_stats: Enumerate _ANEPerformanceStats methods/properties and hardware counters
- test_qos_sweep: Measure compile/load/eval latency across QoS 0-63
- test_ane_advanced: Probe SharedEvents, weightsBuffer IOSurface, procedureIndex, VirtualClient
Training telemetry (train_large.m):
- JSON lines to stderr with per-step timing breakdown and per-batch TFLOPS metrics
- Enables external monitoring tools to visualize ANE utilization in real time
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 22:54:58 -08:00
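The JSON-lines telemetry described in this commit could be emitted along these lines. The field names (`step`, `ane_ms`, `cpu_ms`, `tflops`) and the `demo_*` functions are assumptions for illustration; the actual schema in train_large.m is not shown in this log.

```c
#include <stdio.h>

/* Formats one telemetry record as a single JSON line.
 * Returns the number of bytes written, or -1 if buf is too small. */
static int demo_format_step_json(char *buf, size_t cap, int step,
                                 double ane_ms, double cpu_ms, double tflops) {
    int n = snprintf(buf, cap,
                     "{\"step\":%d,\"ane_ms\":%.3f,\"cpu_ms\":%.3f,\"tflops\":%.3f}\n",
                     step, ane_ms, cpu_ms, tflops);
    if (n < 0 || (size_t)n >= cap) return -1;
    return n;
}

/* Writes the record to stderr so stdout stays free for normal output. */
static void demo_emit_step(int step, double ane_ms, double cpu_ms, double tflops) {
    char line[256];
    if (demo_format_step_json(line, sizeof line, step, ane_ms, cpu_ms, tflops) > 0) {
        fputs(line, stderr);
    }
}
```

One record per line lets an external monitor parse the stream incrementally without waiting for the run to finish.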
maderix
4d67db1bdb
stories110M: 12-layer ANE training with dashboard, 107ms/step
- Scale to full stories110M (109M params, 12 layers) with real TinyStories data
- vDSP-vectorized cross-entropy (110ms→14ms), NEON fp16 IO, async dW
- TUI dashboard: loss curve, ANE/CPU power, CPU/memory graphs, text generation
- Split into modular headers: config, io, mil, cpu_ops
2026-03-01 03:14:39 -08:00
maderix
f213c8db68
Initial release
2026-02-28 00:22:06 -08:00