berkus/ANE - ANE

Author	SHA1	Message	Date
lixiang	c3263bd618	Add M5 Max benchmark: first H17 ANE on record. _ANEDeviceInfo.aneSubType returns "h17" on M5 Max (M4 / base M5 = "h16"), but peak FP16 (19.27 TFLOPS) and INT8 W8A8 (35.61 TOPS) match M4 within 4%. Stories110M static 90.0 ms/step, dynamic 73.5 ms/step; Qwen3-0.6B dynamic 320.0 ms/step (1.29× M4 baseline). Training gains over base M5 are CPU-driven (12 P-cores + Accelerate), not ANE-driven. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 17:18:52 +08:00
maderix	efcf193075	Add model config to benchmark report, update README with current results. Benchmark report now includes full Stories110M model configuration (arch, layers, dims, kernels). README updated: 12-layer results replace stale single-layer numbers, limitations reflect current state.	2026-03-04 06:13:21 -08:00
maderix	1a7d8846b2	Add NE core counts, clarify FP16 vs rated TOPS methodology. All chips have 16 NE cores except Ultra (32 via UltraFusion). M4 38 TOPS is INT8/mixed-precision, not comparable to M3 FP16 spec.	2026-03-04 06:11:29 -08:00
maderix	050bc4fdf0	Add cross-generation ANE benchmark report from issue #3. Community-submitted results for M1 Pro/Max, M3 Pro, M4 Pro/Max, M5. Includes training performance, peak throughput, MIL compatibility matrix, and structured JSON data.	2026-03-04 05:30:00 -08:00