ANE/CONTRIBUTION_SUBMISSION.md

3.4 KiB
Raw Blame History

Contribution submission guide

This file summarizes what was done on branch contribution/benchmark-m5-and-fixes and how to submit it.


1. Benchmark (submit to Issue #3)

Link: https://github.com/maderix/ANE/issues/3

Post this as a new comment:

## M5 MacBook Pro benchmark (static pipeline, 20 steps)

- **Chip:** Apple M5, 10-core (4P+6E)
- **RAM:** 24 GB
- **macOS:** 26.3 (Build 25D125)
- **Run:** `./train_large --data ./tinystories_data00.bin --steps 20 --lr 1e-4`

### Efficiency report
- Total steps: 20
- Wall time: 10423 ms (10.4 s)
- Compile time: 7187 ms (69.0%)
- Train time: 2542 ms (24.4%)
- **Avg train: 127.1 ms/step**
- ANE TFLOPS: 0.73 sustained
- ANE utilization: 4.6% of 15.8 TFLOPS

Full output with JSON lines is in `benchmarks/my_m5_benchmark_output.txt` (or paste the contents below).

Then paste the contents of benchmarks/my_m5_benchmark_output.txt in the same comment, or attach it.


2. Bug fix (PR)

Fix: Guard short token datasets in train_large_ane.m and training/training_dynamic/train.m.

Why: When n_tokens <= SEQ + 1, the expression max_pos = n_tokens - SEQ - 1 underflows (unsigned), leading to a huge random range and possible out-of-bounds reads. train_large.m already had this guard; the other two pipelines did not.

Changes:

  • training/train_large_ane.m: After n_tokens = data_len / 2, add a check that fails early with a clear error, munmap and close the fd, and return 1.
  • training/training_dynamic/train.m: Same guard added.

Suggested PR title: fix: guard short token datasets in train_large_ane and dynamic pipeline

Suggested PR description:

## Summary
- Add a token dataset length guard in `training/train_large_ane.m`
- Add the same guard in `training/training_dynamic/train.m`
- Fail early with a clear error when the dataset is too short for one (input, target) window

## Why
Both paths use `max_pos = n_tokens - SEQ - 1`. When `n_tokens <= SEQ + 1`, this unsigned subtraction underflows, producing a huge range and potentially out-of-bounds reads. `train_large.m` already had this guard (lines 299304); this PR aligns the other two pipelines.

## Validation
- `make -C training train_large_ane` — builds
- `make -C training/training_dynamic train` — builds
- With a too-short data file, both binaries exit with the new error message.

3. Optional: benchmark data in repo

Branch also adds:

  • benchmarks/my_m5_benchmark_output.txt — full benchmark log
  • One new entry in benchmarks/community_results.json for this M5 run (contributor: log-wade)

You can either:

  • Include the community_results.json update in the same PR as the bug fix, or
  • Omit it and only post the benchmark to Issue #3 (maintainer may update the report from the issue).

4. Before opening the PR

  1. Fork the repo on GitHub (if you havent): https://github.com/maderix/ANE → Fork.
  2. Add your fork as a remote and push:
    git remote add myfork git@github.com:YOUR_USERNAME/ANE.git
    git push myfork contribution/benchmark-m5-and-fixes
    
  3. Open a PR from myfork/contribution/benchmark-m5-and-fixes to maderix/ANE main.
  4. Post the benchmark comment to Issue #3 (link above).

5. Replace contributor name

In benchmarks/community_results.json, the new entry uses "contributor": "log-wade". Change that to your GitHub username if different.