ANE/bridge at 98ddd2d1901f451aa00850fc52b1622975a09684 - ANE

fspecii 98ddd2d190 bridge: add compile_dyn + write_weight — function parameter IOSurfaces	2026-03-03 15:00:51 +02:00

Adds a second dynamic weight approach to the bridge alongside the existing
BLOBFILE compile path. Instead of packing weights into the spatial dimension
of a single large input tensor and slicing them inside MIL (the
training_dynamic/ approach), weights are declared as native MIL function
parameters backed by persistent IOSurfaces:

    // training_dynamic/ approach: spatial packing
    func main<ios18>(tensor<fp32, [1, DIM, 1, SEQ + 4*DIM]> x) {
        Wq = slice_by_size(x=x, begin=..., size=...);  // overhead
        ...

    // this PR: native function parameters
    func main<ios18>(tensor<fp16,[1,K,1,M]> x, tensor<fp16,[1,N,K]> W) { ... }

New API:
    ane_bridge_compile_dyn()          compile with n_weights IOSurface parameters
    ane_bridge_write_weight()         write fp16 to weight IOSurface (~0.001ms)
    ane_bridge_write_weight_f32()     write fp32 with NEON conversion
    ane_bridge_copy_io()              direct output→input copy, no CPU round-trip
    ane_bridge_begin/end_realtime()   90.6% p99 jitter reduction

Compile cache fix: ANE only writes net.plist for parameter-based models (no
data file). try_cache_restore now checks net.plist only; data is
saved/restored conditionally for BLOBFILE models that do produce it.

Also removes the pre-built libane_bridge.dylib binary from version control.

Performance vs spatial packing (Stories110M, 12 layers, M-series):
    training_dynamic/ (slice approach):   110ms/step
    function parameter approach:          76.9ms/step (-30%)

The slice/reshape/transpose overhead per weight matrix explains the gap.
Both compile once at startup; weight updates are IOSurface writes in both
cases.

Tested: test_bridge.m, 15/15 assertions across all new API functions.
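
The commit message says ane_bridge_write_weight_f32() narrows fp32 weights to fp16 with NEON before they land in the weight IOSurface. As a rough illustration of what that narrowing involves, here is a portable scalar sketch in C. It is not the bridge's code: the names f32_to_f16_bits and weights_f32_to_f16 are hypothetical, the destination is an ordinary buffer rather than an IOSurface, and round-to-nearest-even is an assumption about the rounding mode.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the bridge API): convert one IEEE 754
 * binary32 value to binary16 bits, assuming round-to-nearest-even. */
static uint16_t f32_to_f16_bits(float value) {
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);           /* safe type-pun */

    uint16_t sign = (uint16_t)((bits >> 16) & 0x8000u);
    uint32_t mag  = bits & 0x7FFFFFFFu;           /* magnitude bits */
    uint32_t mant = mag & 0x007FFFFFu;            /* 23-bit fraction */
    int32_t  e    = (int32_t)(mag >> 23) - 112;   /* rebias 127 -> 15 */

    if (mag > 0x7F800000u) return (uint16_t)(sign | 0x7E00u); /* NaN */
    if (e >= 31)           return (uint16_t)(sign | 0x7C00u); /* Inf or overflow */

    uint32_t half, rem, halfway;
    if (e >= 1) {
        /* Normal fp16 result: drop the low 13 fraction bits. */
        half    = ((uint32_t)e << 10) | (mant >> 13);
        rem     = mant & 0x1FFFu;
        halfway = 0x1000u;
    } else if (e >= -10) {
        /* Subnormal fp16 result: restore the implicit 1, shift right. */
        uint32_t m     = mant | 0x00800000u;
        uint32_t shift = (uint32_t)(14 - e);      /* 14..24 */
        half    = m >> shift;
        rem     = m & ((1u << shift) - 1u);
        halfway = 1u << (shift - 1u);
    } else {
        return sign;                              /* underflow to signed zero */
    }

    /* Round to nearest, ties to even; a carry may bump the exponent,
     * which correctly yields the next binade or Inf. */
    if (rem > halfway || (rem == halfway && (half & 1u))) half++;
    return (uint16_t)(sign | half);
}

/* Hypothetical bulk wrapper: narrow n fp32 weights into an fp16 buffer. */
static void weights_f32_to_f16(const float *src, uint16_t *dst, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = f32_to_f16_bits(src[i]);
    }
}
```

A vectorized version would process several lanes per instruction, but the per-element result should match this scalar reference under the same rounding assumption, which makes a routine like this handy as a test oracle.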
Makefile	Python Bridge+Memory leak fix+More functions	2026-03-03 02:04:36 -05:00
ane_bridge.h	bridge: add compile_dyn + write_weight — function parameter IOSurfaces	2026-03-03 15:00:51 +02:00
ane_bridge.m	bridge: add compile_dyn + write_weight — function parameter IOSurfaces	2026-03-03 15:00:51 +02:00
test_bridge.m	bridge: add compile_dyn + write_weight — function parameter IOSurfaces	2026-03-03 15:00:51 +02:00