ANE Training — Backpropagation on Apple Neural Engine
You might be asking, "why the FUCK would you pick GPT2?"
Have you read the art bro? Have you? Nah. I doubt it.
GPT2 had more soul in its theoretical pinky finger than all of us combined.
But I digress...
What This Is
A from-scratch implementation of transformer training (forward + backward pass) running on the ANE in Apple Silicon, with NEON CPU decode.
I forked diz shit and still need to write up everything that's different, so stay tuned.
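If "forward + backward pass" is fuzzy, here is roughly what one training step means, shrunk down to a single linear layer in plain Python. This is a toy sketch, not code from this repo: the names `forward`, `loss`, `backward`, and `sgd_step` are made up for illustration, the loss is a simple half squared error, and nothing here touches the ANE or NEON.

```python
# Toy illustration only: one linear layer y = W x with a hand-written backward pass.
# All names here are hypothetical and do not come from the repository.

def forward(W, x):
    """Forward pass: y[i] = sum_j W[i][j] * x[j]."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def loss(y, target):
    """Half squared error between output and target."""
    return 0.5 * sum((y_i - t_i) ** 2 for y_i, t_i in zip(y, target))

def backward(W, x, y, target):
    """Backward pass: gradients of the loss w.r.t. W and x."""
    dy = [y_i - t_i for y_i, t_i in zip(y, target)]       # dL/dy
    dW = [[dy_i * x_j for x_j in x] for dy_i in dy]       # dL/dW = outer(dy, x)
    dx = [sum(W[i][j] * dy[i] for i in range(len(W)))     # dL/dx = W^T dy
          for j in range(len(x))]
    return dW, dx

def sgd_step(W, dW, lr):
    """Plain gradient descent update on the weights."""
    return [[w - lr * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(W, dW)]

W = [[0.5, -0.2], [0.1, 0.3]]
x = [1.0, 2.0]
target = [0.0, 1.0]

y = forward(W, x)
loss_before = loss(y, target)
dW, dx = backward(W, x, y, target)
W_new = sgd_step(W, dW, lr=0.1)
loss_after = loss(forward(W_new, x), target)
```

A transformer repeats the same pattern across many layers (attention, MLP, layer norm), with each layer's backward pass handing its input gradient to the layer before it.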
Disclaimer
This project is independent research into Apple Neural Engine architecture. It uses undocumented APIs discovered through runtime introspection for research and educational purposes under fair use and interoperability provisions (see Sega v. Accolade, 1992; DMCA §1201(f)). No Apple proprietary code or binaries are included in this repository. This project is not affiliated with or endorsed by Apple Inc. Use at your own risk.
License
MIT — see LICENSE