Training on Apple Neural Engine
gigantic 70bfa4e54e
Revise README for clarity and additional details
Updated README to clarify implementation details and added NEON CPU decode.
2026-03-02 13:39:26 -08:00
training stories110M: 12-layer ANE training with dashboard, 107ms/step 2026-03-01 03:14:39 -08:00
LICENSE Initial release 2026-02-28 00:22:06 -08:00
README.md Revise README for clarity and additional details 2026-03-02 13:39:26 -08:00
api_exploration.m Initial release 2026-02-28 00:22:06 -08:00
inmem_basic.m Initial release 2026-02-28 00:22:06 -08:00
inmem_bench.m Initial release 2026-02-28 00:22:06 -08:00
inmem_peak.m Initial release 2026-02-28 00:22:06 -08:00
sram_bench.m Initial release 2026-02-28 00:22:06 -08:00
sram_probe.m Initial release 2026-02-28 00:22:06 -08:00

README.md

ANE Training — Backpropagation on Apple Neural Engine

You might be asking, "why the FUCK would you pick GPT2?"

Have you read the art bro? Have you? Nah. I doubt it.

GPT2 had more soul in its theoretical pinky finger than all of us combined.

But I digress.

What This Is

A from-scratch implementation of transformer training (forward + backward pass) running on the ANE in Apple Silicon, with NEON CPU decode.
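If "forward + backward pass" is fuzzy, here is a rough sketch of the idea for a single linear layer with a squared-error loss. It is plain C on the CPU, not the ANE, and it touches none of the undocumented APIs this repo uses. The names (`linear_forward`, `linear_backward`, `train_step`) are made up for illustration and are not taken from the code in this repo.

```c
#include <assert.h>
#include <stddef.h>

/* Forward pass: y = W x, with W stored row-major as out_dim x in_dim. */
void linear_forward(const float *W, const float *x, float *y,
                    size_t out_dim, size_t in_dim) {
    assert(out_dim > 0 && in_dim > 0);
    for (size_t i = 0; i < out_dim; i++) {
        float acc = 0.0f;
        for (size_t j = 0; j < in_dim; j++)
            acc += W[i * in_dim + j] * x[j];
        y[i] = acc;
    }
}

/* Backward pass: given dy = dL/dy, compute dW = dy x^T and dx = W^T dy. */
void linear_backward(const float *W, const float *x, const float *dy,
                     float *dW, float *dx, size_t out_dim, size_t in_dim) {
    assert(out_dim > 0 && in_dim > 0);
    for (size_t j = 0; j < in_dim; j++)
        dx[j] = 0.0f;
    for (size_t i = 0; i < out_dim; i++) {
        for (size_t j = 0; j < in_dim; j++) {
            dW[i * in_dim + j] = dy[i] * x[j];   /* gradient w.r.t. weights */
            dx[j] += W[i * in_dim + j] * dy[i];  /* gradient w.r.t. input */
        }
    }
}

/* One SGD step on L = 0.5 * ||y - target||^2.
 * y, dy, dW, dx are caller-provided scratch buffers.
 * Returns the loss measured before the weight update. */
float train_step(float *W, const float *x, const float *target,
                 size_t out_dim, size_t in_dim, float lr,
                 float *y, float *dy, float *dW, float *dx) {
    linear_forward(W, x, y, out_dim, in_dim);

    float loss = 0.0f;
    for (size_t i = 0; i < out_dim; i++) {
        dy[i] = y[i] - target[i];          /* dL/dy for squared error */
        loss += 0.5f * dy[i] * dy[i];
    }

    linear_backward(W, x, dy, dW, dx, out_dim, in_dim);

    for (size_t k = 0; k < out_dim * in_dim; k++)
        W[k] -= lr * dW[k];                /* plain SGD update */
    return loss;
}
```

Calling `train_step` repeatedly on the same input and target with a small enough `lr` should make the returned loss shrink. A transformer stacks many layers like this one (plus attention, norms, and so on) and chains their `dx` outputs backward through the network.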

I forked this shit and still need to write up everything that's different from upstream, so stay tuned.

Disclaimer

This project is independent research into Apple Neural Engine architecture. It uses undocumented APIs discovered through runtime introspection for research and educational purposes under fair use and interoperability provisions (see Sega v. Accolade, 1992; DMCA §1201(f)). No Apple proprietary code or binaries are included in this repository. This project is not affiliated with or endorsed by Apple Inc. Use at your own risk.

License

MIT — see LICENSE