ANE/CONTRIBUTION_SUBMISSION.md

96 lines
3.4 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Contribution submission guide
This file summarizes what was done on branch `contribution/benchmark-m5-and-fixes` and how to submit it.
---
## 1. Benchmark (submit to Issue #3)
**Link:** https://github.com/maderix/ANE/issues/3
**Post this as a new comment:**
```
## M5 MacBook Pro benchmark (static pipeline, 20 steps)
- **Chip:** Apple M5, 10-core (4P+6E)
- **RAM:** 24 GB
- **macOS:** 26.3 (Build 25D125)
- **Run:** `./train_large --data ./tinystories_data00.bin --steps 20 --lr 1e-4`
### Efficiency report
- Total steps: 20
- Wall time: 10423 ms (10.4 s)
- Compile time: 7187 ms (69.0%)
- Train time: 2542 ms (24.4%)
- **Avg train: 127.1 ms/step**
- ANE TFLOPS: 0.73 sustained
- ANE utilization: 4.6% of 15.8 TFLOPS
Full output with JSON lines is in `benchmarks/my_m5_benchmark_output.txt` (or paste the contents below).
```
Then paste the contents of `benchmarks/my_m5_benchmark_output.txt` in the same comment, or attach it.
---
## 2. Bug fix (PR)
**Fix:** Guard short token datasets in `train_large_ane.m` and `training/training_dynamic/train.m`.
**Why:** When `n_tokens <= SEQ + 1`, the expression `max_pos = n_tokens - SEQ - 1` underflows (unsigned), leading to a huge random range and possible out-of-bounds reads. `train_large.m` already had this guard; the other two pipelines did not.
**Changes:**
- `training/train_large_ane.m`: After `n_tokens = data_len / 2`, add a check that fails early with a clear error, munmap and close the fd, and return 1.
- `training/training_dynamic/train.m`: Same guard added.
**Suggested PR title:** `fix: guard short token datasets in train_large_ane and dynamic pipeline`
**Suggested PR description:**
```markdown
## Summary
- Add a token dataset length guard in `training/train_large_ane.m`
- Add the same guard in `training/training_dynamic/train.m`
- Fail early with a clear error when the dataset is too short for one (input, target) window
## Why
Both paths use `max_pos = n_tokens - SEQ - 1`. When `n_tokens <= SEQ + 1`, this unsigned subtraction underflows, producing a huge range and potentially out-of-bounds reads. `train_large.m` already had this guard (lines 299304); this PR aligns the other two pipelines.
## Validation
- `make -C training train_large_ane` — builds
- `make -C training/training_dynamic train` — builds
- With a too-short data file, both binaries exit with the new error message.
```
---
## 3. Optional: benchmark data in repo
Branch also adds:
- `benchmarks/my_m5_benchmark_output.txt` — full benchmark log
- One new entry in `benchmarks/community_results.json` for this M5 run (contributor: `log-wade`)
You can either:
- Include the `community_results.json` update in the same PR as the bug fix, or
- Omit it and only post the benchmark to Issue #3 (maintainer may update the report from the issue).
---
## 4. Before opening the PR
1. **Fork the repo** on GitHub (if you havent): https://github.com/maderix/ANE → Fork.
2. **Add your fork as a remote and push:**
```bash
git remote add myfork git@github.com:YOUR_USERNAME/ANE.git
git push myfork contribution/benchmark-m5-and-fixes
```
3. Open a PR from `myfork/contribution/benchmark-m5-and-fixes` to `maderix/ANE` main.
4. Post the benchmark comment to Issue #3 (link above).
---
## 5. Replace contributor name
In `benchmarks/community_results.json`, the new entry uses `"contributor": "log-wade"`. Change that to your GitHub username if different.