nanogpt-training by benchflow-ai
Train GPT-2 scale models (~124M parameters) efficiently on a single GPU. Covers GPT-124M architecture, FineWeb dataset loading, modern optimizers (Muon, AdamW), mixed precision training, and training loop implementation.
Data & Analytics
231 Stars
165 Forks
Updated May 1, 2026, 03:15 AM
Why Use This
This skill packages benchflow-ai's guidance for training GPT-2 scale (~124M parameter) models on a single GPU.
Use Cases
- Developing new features in the benchflow-ai repository
- Refactoring existing code to follow benchflow-ai standards
- Understanding and working with benchflow-ai's codebase structure
Install Guide
2 steps
- Step 1: Skip this step if Ananke is already installed.
- Step 2
Skill Snapshot
Auto scan of skill assets. Informational only.
Valid SKILL.md
Checks against SKILL.md specification
Skill Stats
- SKILL.md: 0 lines
- Total files: 1
- Total size: 0 B
- License: NOASSERTION