berkus/ANE - ANE

Author	SHA1	Message	Date
log-wade	9fbd4dff5b	fix: guard short token datasets in train_large_ane and dynamic pipeline. Add n_tokens <= SEQ+1 check in train_large_ane.m and training_dynamic/train.m; prevents underflow in max_pos and possible OOB reads (aligns with train_large.m); add M5 MacBook Pro benchmark result and full output for Issue #3. Made-with: Cursor	2026-03-07 14:12:31 -06:00
maderix	efcf193075	Add model config to benchmark report, update README with current results. Benchmark report now includes full Stories110M model configuration (arch, layers, dims, kernels). README updated: 12-layer results replace stale single-layer numbers, limitations reflect current state.	2026-03-04 06:13:21 -08:00
maderix	1a7d8846b2	Add NE core counts, clarify FP16 vs rated TOPS methodology. All chips have 16 NE cores except Ultra (32 via UltraFusion). M4 38 TOPS is INT8/mixed-precision, not comparable to M3 FP16 spec.	2026-03-04 06:11:29 -08:00
maderix	050bc4fdf0	Add cross-generation ANE benchmark report from issue #3. Community-submitted results for M1 Pro/Max, M3 Pro, M4 Pro/Max, M5. Includes training performance, peak throughput, MIL compatibility matrix, and structured JSON data.	2026-03-04 05:30:00 -08:00
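
Commit 9fbd4dff5b describes guarding against datasets with n_tokens <= SEQ+1 so that max_pos cannot underflow. The following is a minimal sketch of that kind of guard in plain C, not the repository's actual code: the helper name compute_max_pos, its signature, and the formula n_tokens - seq - 1 are assumptions made for illustration, chosen only to show why the check has to come before the subtraction when the counts are unsigned.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not taken from the repository): computes the largest
 * start offset for a window of `seq` input tokens plus one target token.
 * Returns false and leaves *max_pos untouched when the dataset is too short. */
static bool compute_max_pos(size_t n_tokens, size_t seq, size_t *max_pos)
{
    /* Guard first: size_t is unsigned, so n_tokens - seq - 1 would wrap
     * around to a very large value whenever n_tokens < seq + 1, and any
     * offset drawn from that range would read past the end of the buffer. */
    if (n_tokens <= seq + 1) {
        return false;
    }

    /* Safe here: n_tokens is at least seq + 2, so the result is >= 1. */
    *max_pos = n_tokens - seq - 1;
    return true;
}
```

A caller would typically report an error and exit when the helper returns false, rather than continuing with an unusable sampling range.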