Commit Graph

4 Commits

Author SHA1 Message Date
lixiang c3263bd618 Add M5 Max benchmark — first H17 ANE on record
_ANEDeviceInfo.aneSubType returns "h17" on M5 Max (M4 / base M5 = "h16"), but
peak FP16 (19.27 TFLOPS) and INT8 W8A8 (35.61 TOPS) match M4 within 4%.
Stories110M static 90.0 ms/step, dynamic 73.5 ms/step; Qwen3-0.6B dynamic
320.0 ms/step (1.29× M4 baseline). Training gains over base M5 are CPU-driven
(12 P-cores + Accelerate), not ANE-driven.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 17:18:52 +08:00
maderix efcf193075 Add model config to benchmark report, update README with current results
Benchmark report now includes full Stories110M model configuration
(arch, layers, dims, kernels). README updated: 12-layer results
replace stale single-layer numbers, and limitations reflect the current state.
2026-03-04 06:13:21 -08:00
maderix 1a7d8846b2 Add NE core counts, clarify FP16 vs rated TOPS methodology
All chips have 16 NE cores except Ultra (32 via UltraFusion).
The M4's rated 38 TOPS is an INT8/mixed-precision figure, so it is not comparable to the M3 FP16 spec.
2026-03-04 06:11:29 -08:00
maderix 050bc4fdf0 Add cross-generation ANE benchmark report from issue #3
Community-submitted results for M1 Pro/Max, M3 Pro, M4 Pro/Max, M5.
Includes training performance, peak throughput, MIL compatibility
matrix, and structured JSON data.
2026-03-04 05:30:00 -08:00