berkus/ANE - ANE

Commit Graph

Author	SHA1	Message	Date
Claude	7b6a18a059	Add ANE int8/int4 quantization probe Probe whether Apple Neural Engine executes quantized ops natively (faster int8-int8 compute path) or just dequantizes to fp16 at load time. Tests 5 approaches at transformer-representative dimensions: 1. FP16 baseline conv (baked weights) 2. INT8 via constexpr_affine_dequantize (per-channel scale+zp) 3. UINT4 via constexpr_affine_dequantize (per-channel) 4. UINT4 via constexpr_blockwise_shift_scale (block_size=32) 5. 4-bit palettized via constexpr_lut_to_dense (16-entry LUT) Each test compiles MIL → ANE kernel, benchmarks 100 evals, reports TFLOPS. If int8 shows ~2x fp16 TFLOPS, ANE has native int8 compute. If same TFLOPS, it's dequant-only (still useful for memory savings). Build: xcrun clang -O2 -fobjc-arc -o quant_probe quant_probe.m \ -framework Foundation -framework IOSurface -ldl https://claude.ai/code/session_01U5HLjsm4iUzL9iDaHbxeRB	2026-03-03 01:02:05 +00:00
gigantic	8c9161add6	Update README.md	2026-03-02 14:36:28 -08:00
m0at	5271c00281	Update README to reflect GPT-2 inference, M5 findings, and upstream attribution Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-02 14:26:48 -08:00
gigantic	70bfa4e54e	Revise README for clarity and additional details Updated README to clarify implementation details and added NEON CPU decode.	2026-03-02 13:39:26 -08:00
gigantic	40a5384074	Revise README for project fork and updates Forked the project and updated the README to reflect changes.	2026-03-02 13:31:31 -08:00
gigantic	6b8d69b93d	Update README with project details and structure	2026-03-02 13:03:27 -08:00
gigantic	a102d27fa2	Update README.md	2026-03-02 13:02:53 -08:00
maderix	4d67db1bdb	stories110M: 12-layer ANE training with dashboard, 107ms/step - Scale to full stories110M (109M params, 12 layers) with real TinyStories data - vDSP-vectorized cross-entropy (110ms→14ms), NEON fp16 IO, async dW - TUI dashboard: loss curve, ANE/CPU power, CPU/memory graphs, text generation - Split into modular headers: config, io, mil, cpu_ops	2026-03-01 03:14:39 -08:00
maderix	f213c8db68	Initial release	2026-02-28 00:22:06 -08:00

9 Commits (All Branches)