Training on Apple Neural Engine

gigantic 70bfa4e54e Revise README for clarity and additional details Updated README to clarify implementation details and added NEON CPU decode.		2026-03-02 13:39:26 -08:00
training	stories110M: 12-layer ANE training with dashboard, 107ms/step	2026-03-01 03:14:39 -08:00
LICENSE	Initial release	2026-02-28 00:22:06 -08:00
README.md	Revise README for clarity and additional details	2026-03-02 13:39:26 -08:00
api_exploration.m	Initial release	2026-02-28 00:22:06 -08:00
inmem_basic.m	Initial release	2026-02-28 00:22:06 -08:00
inmem_bench.m	Initial release	2026-02-28 00:22:06 -08:00
inmem_peak.m	Initial release	2026-02-28 00:22:06 -08:00
sram_bench.m	Initial release	2026-02-28 00:22:06 -08:00
sram_probe.m	Initial release	2026-02-28 00:22:06 -08:00

README.md

ANE Training — Backpropagation on Apple Neural Engine

You might be asking, "why the FUCK would you pick GPT2?"

Have you read the art bro? Have you? Nah. I doubt it.

GPT2 had more soul in its theoretical pinky finger than all of us combined.

But I digress...

What This Is

A from-scratch implementation of transformer training (forward + backward pass) running on the ANE on Apple Silicon, with NEON CPU decode.

I forked diz shit and still need to write up everything that's different, so stay tuned.

Disclaimer

This project is independent research into Apple Neural Engine architecture. It uses undocumented APIs discovered through runtime introspection for research and educational purposes under fair use and interoperability provisions (see Sega v. Accolade, 1992; DMCA §1201(f)). No Apple proprietary code or binaries are included in this repository. This project is not affiliated with or endorsed by Apple Inc. Use at your own risk.

License

MIT — see LICENSE