Commit Graph

2 Commits

Author SHA1 Message Date
tom 0a1d841a10 Fix model path: accept argv[1] like train_large does
train_opt had a hardcoded MODEL_PATH that didn't match the working
directory, so loading failed and it fell back to random init. It now
accepts a positional model path argument (e.g., ./train_opt stories110M.bin).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 09:33:58 -04:00
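
The fix described in 0a1d841a10 can be sketched roughly as below. This is a minimal illustration, not the code from the commit: the helper name resolve_model_path and the DEFAULT_MODEL_PATH value are hypothetical, and only the behavior (use argv[1] when given, otherwise keep the compiled-in path) comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical default; the real MODEL_PATH value is not shown in the commit. */
#define DEFAULT_MODEL_PATH "model.bin"

/*
 * Return the model path to load: the first positional argument if one
 * was supplied and is non-empty, otherwise the fallback path.
 */
const char *resolve_model_path(int argc, char **argv, const char *fallback)
{
    assert(fallback != NULL);
    if (argc > 1 && argv[1] != NULL && strlen(argv[1]) > 0)
        return argv[1];
    return fallback;
}
```

Called from main as resolve_model_path(argc, argv, DEFAULT_MODEL_PATH), this would make ./train_opt stories110M.bin load the given file. Failing loudly when the resolved file cannot be opened, rather than silently using random init, is a separate design choice the commit message does not address.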
tom 09e9c996bb Add optimized training variant: 14% speedup (107→92 ms/step)
New train_opt target with NEON-vectorized Adam, fp16 activation/gradient
caching, concurrent dW dispatch, pre-allocated buffers, and optional
Metal GPU support. Tested on M3 Max with stories110M.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 09:08:12 -04:00