nanogpt-training by benchflow-ai

Train GPT-2-scale models (~124M parameters) efficiently on a single GPU. Covers the GPT-2 124M architecture, tokenized dataset loading (e.g., HuggingFace Hub shards), modern optimizers (Muon, AdamW), mixed-precision training, and training-loop implementation.
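
Single-GPU training at this scale typically reaches a large effective batch size via gradient accumulation. A minimal sketch of the token-budget arithmetic behind such a training loop (the concrete numbers here — a 524,288-token batch, micro-batch 16, sequence length 1024 — are assumed nanoGPT-style defaults, not values taken from this skill):

```python
def grad_accum_steps(total_batch_tokens: int, micro_batch: int, seq_len: int) -> int:
    """How many micro-batches to accumulate before one optimizer step."""
    tokens_per_micro = micro_batch * seq_len
    # The total token budget should divide evenly into micro-batches.
    assert total_batch_tokens % tokens_per_micro == 0, "batch sizes must divide evenly"
    return total_batch_tokens // tokens_per_micro

# Assumed example values: 2**19 tokens per step, micro-batch 16, context 1024.
steps = grad_accum_steps(524288, 16, 1024)
print(steps)  # 32
```

In a real loop, the loss of each micro-batch would be scaled by `1 / steps` before `backward()` so that the accumulated gradient matches a single large batch.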

Data & Analytics · 231 stars · 165 forks · Updated Jan 19, 2026, 03:59 AM

Why Use This

This skill provides specialized guidance for working with benchflow-ai's nanogpt-training codebase.

Use Cases

  • Developing new features in the benchflow-ai repository
  • Refactoring existing code to follow benchflow-ai standards
  • Understanding and working with benchflow-ai's codebase structure

Install Guide

2 steps

  1. Download Ananke

     Skip this step if Ananke is already installed.

  2. Install inside Ananke

     Click Install Skill, paste the link below, then press Install.

     https://github.com/benchflow-ai/skillsbench/tree/main/tasks/mhc-layer-impl/environment/skills/nanogpt-training

Skill Snapshot

Auto scan of skill assets; informational only.

  • Valid SKILL.md (checked against the SKILL.md specification)

Source & Community

  • Repository: skillsbench
  • Skill Version: main
  • Community: 231 stars, 165 forks
  • Updated At: Jan 19, 2026, 03:59 AM

Skill Stats

  • SKILL.md: 113 lines
  • Total Files: 1
  • Total Size: 0 B
  • License: NOASSERTION