llm-evaluation by wshobson

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.

Coding
25.1K Stars
2.8K Forks
Updated Jan 9, 2026, 03:41 PM

Why Use This

This skill, from wshobson's agents repository, provides specialized capabilities for evaluating LLM applications.

Use Cases

  • Developing new features in wshobson's agents repository
  • Refactoring existing code to follow the repository's standards
  • Understanding and working with the repository's codebase structure

Skill Snapshot

Auto scan of skill assets. Informational only.

Valid SKILL.md (checked against the SKILL.md specification)

Source & Community

Repository: agents
Skill Version: main
Community: 25.1K stars, 2.8K forks
Updated At: Jan 9, 2026, 03:41 PM

Skill Stats

SKILL.md: 472 lines
Total Files: 1
Total Size: 0 B
License: NOASSERTION