nanogpt-training by benchflow-ai

Train GPT-2-scale models (~124M parameters) efficiently on a single GPU. Covers the GPT-124M architecture, FineWeb dataset loading, modern optimizers (Muon, AdamW), mixed-precision training, and training loop implementation.
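
To make the "~124M parameters" figure concrete, here is a rough sketch of the arithmetic. It assumes the standard GPT-2 small configuration (12 layers, 768-dimensional embeddings, 50257-token vocabulary, 1024-token context, biases on all linear layers, tied input/output embeddings). The skill's own configuration is not shown on this page and may differ, for example through a padded vocabulary or bias-free layers. The function name `gpt2_param_count` is hypothetical and used only for illustration.

```python
def gpt2_param_count(n_layer=12, n_embd=768, vocab_size=50257, block_size=1024):
    """Count parameters of a GPT-2 style decoder with tied embeddings.

    Defaults are the standard GPT-2 small values (an assumption here,
    not taken from the skill itself).
    """
    token_emb = vocab_size * n_embd   # shared with the output head
    pos_emb = block_size * n_embd     # learned position embeddings
    layer_norm = 2 * n_embd           # scale and bias

    # Attention: fused QKV projection plus output projection, both with bias.
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to 4x width, then project back, both with bias.
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)

    block = 2 * layer_norm + attn + mlp
    # Final term is the layer norm applied after the last block.
    return token_emb + pos_emb + n_layer * block + layer_norm


print(f"{gpt2_param_count():,}")  # 124,439,808
```

Under these assumptions the embeddings account for roughly 39M parameters and the 12 transformer blocks for roughly 85M.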

Data & Analytics

231 Stars

165 Forks

Updated May 1, 2026, 03:15 AM

Why Use This

This skill provides specialized capabilities for benchflow-ai's codebase.

Use Cases

Developing new features in the benchflow-ai repository
Refactoring existing code to follow benchflow-ai standards
Understanding and working with benchflow-ai's codebase structure

Install Guide

2 steps

1. Download Ananke. Skip this step if Ananke is already installed.
2. Install inside Ananke. Click Install Skill, paste the link below, then press Install.

https://github.com/benchflow-ai/skillsbench/tree/main/tasks/mhc-layer-impl/environment/skills/nanogpt-training

Skill Snapshot

Auto scan of skill assets. Informational only.

Valid SKILL.md

Checks against SKILL.md specification

Source & Community

Repository: skillsbench

Skill Version: main


Skill Stats

SKILL.md: 0 lines

Total Files: 1

Total Size: 0 B

License: NOASSERTION
