NousResearch/hermes-agent-self-evolution official

⚒ Evolutionary self-improvement for Hermes Agent — optimize skills, prompts, and code using DSPy + GEPA

★ 3.7K · Language: Python · Maintainer: Nous Research · Updated: 2026-03-29

Hermes Agent Self-Evolution is an automated optimization framework designed to improve the performance of Hermes Agent without requiring GPU training. It utilizes DSPy and Genetic-Pareto Prompt Evolution (GEPA) to analyze execution traces and mutate skill files, tool descriptions, and system prompts based on reflective feedback. The system validates candidate variants through constraint gates, including size limits and full test suite passes, before proposing improvements via pull requests. It supports evaluation using both synthetic data and real session history from tools like Claude Code and Copilot.

  • Optimizes agent skills and prompts using DSPy and GEPA engines
  • Operates via API calls without requiring local GPU resources
  • Enforces strict guardrails including size limits and 100% test pass rates

🧬 Hermes Agent Self-Evolution

Evolutionary self-improvement for Hermes Agent.

Hermes Agent Self-Evolution uses DSPy + GEPA (Genetic-Pareto Prompt Evolution) to automatically evolve and optimize Hermes Agent's skills, tool descriptions, system prompts, and code — producing measurably better versions through reflective evolutionary search.

No GPU training required. Everything operates via API calls — mutating text, evaluating results, and selecting the best variants. ~$2-10 per optimization run.

How It Works

Read current skill/prompt/tool ──► Generate eval dataset
                                        │
                                        ▼
                                   GEPA Optimizer ◄── Execution traces
                                        │                    ▲
                                        ▼                    │
                                   Candidate variants ──► Evaluate
                                        │
                                   Constraint gates (tests, size limits, benchmarks)
                                        │
                                        ▼
                                   Best variant ──► PR against hermes-agent

GEPA reads execution traces to understand why things fail (not just that they failed), then proposes targeted improvements. The GEPA paper is an ICLR 2026 Oral, and GEPA is MIT licensed.
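The loop in the diagram can be sketched in a few lines of plain Python. This is a simplified illustration, not the project's implementation: `evolve`, `toy_mutate`, `toy_score`, and `toy_gates` are hypothetical names, the mutation step is a random stand-in for what would really be an LLM call informed by execution traces, and the selection here is simple greedy hill-climbing rather than GEPA's search.

```python
import random
from typing import Callable, Tuple


def evolve(
    seed: str,
    mutate: Callable[[str, random.Random], str],
    score: Callable[[str], float],
    passes_gates: Callable[[str], bool],
    iterations: int = 10,
    population: int = 4,
    rng_seed: int = 0,
) -> Tuple[str, float]:
    """Return the best gated variant found and its score (greedy search)."""
    rng = random.Random(rng_seed)
    best, best_score = seed, score(seed)
    for _ in range(iterations):
        # Propose several variants of the current best text.
        candidates = [mutate(best, rng) for _ in range(population)]
        for cand in candidates:
            # Constraint gates run before a candidate can be selected.
            if not passes_gates(cand):
                continue
            cand_score = score(cand)
            if cand_score > best_score:
                best, best_score = cand, cand_score
    return best, best_score


# Toy stand-ins (hypothetical). A real mutation would ask an LLM to rewrite
# the skill using feedback from execution traces.
HINTS = ["Check tests.", "Cite line numbers.", "Be concise."]


def toy_mutate(text: str, rng: random.Random) -> str:
    return text + " " + rng.choice(HINTS)


def toy_score(text: str) -> float:
    # Reward each distinct hint that appears in the text.
    return float(sum(hint in text for hint in HINTS))


def toy_gates(text: str) -> bool:
    # Stand-in size limit, far smaller than the real 15KB skill limit.
    return len(text.encode("utf-8")) <= 200


best, best_score = evolve("Review the pull request.", toy_mutate, toy_score, toy_gates)
```

The point of the sketch is the ordering: candidates are gated first, scored second, and only a variant that clears both replaces the current best.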

Quick Start

# Install
git clone https://github.com/NousResearch/hermes-agent-self-evolution.git
cd hermes-agent-self-evolution
pip install -e ".[dev]"

# Point at your hermes-agent repo
export HERMES_AGENT_REPO=~/.hermes/hermes-agent

# Evolve a skill (synthetic eval data)
python -m evolution.skills.evolve_skill \
    --skill github-code-review \
    --iterations 10 \
    --eval-source synthetic

# Or use real session history from Claude Code, Copilot, and Hermes
python -m evolution.skills.evolve_skill \
    --skill github-code-review \
    --iterations 10 \
    --eval-source sessiondb
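If you want to launch runs from a script rather than the shell, a thin wrapper can assemble the same command. This is a minimal sketch: `build_evolve_command` is a hypothetical helper, and it only knows the flags and the two `--eval-source` values shown above (the real CLI may accept more).

```python
import sys
from typing import List

# Only the values that appear in the Quick Start; the CLI may support others.
KNOWN_EVAL_SOURCES = ("synthetic", "sessiondb")


def build_evolve_command(
    skill: str, iterations: int = 10, eval_source: str = "synthetic"
) -> List[str]:
    """Build the argv list for `python -m evolution.skills.evolve_skill`."""
    if eval_source not in KNOWN_EVAL_SOURCES:
        raise ValueError(f"unknown eval source: {eval_source!r}")
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    return [
        sys.executable, "-m", "evolution.skills.evolve_skill",
        "--skill", skill,
        "--iterations", str(iterations),
        "--eval-source", eval_source,
    ]


cmd = build_evolve_command("github-code-review", iterations=10, eval_source="sessiondb")

# To actually run it (requires the package installed and HERMES_AGENT_REPO set):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting issues when skill names come from user input.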

What It Optimizes

| Phase | Target | Engine | Status |
|-------|--------|--------|--------|
| Phase 1 | Skill files (SKILL.md) | DSPy + GEPA | ✅ Implemented |
| Phase 2 | Tool descriptions | DSPy + GEPA | 🔲 Planned |
| Phase 3 | System prompt sections | DSPy + GEPA | 🔲 Planned |
| Phase 4 | Tool implementation code | Darwinian Evolver | 🔲 Planned |
| Phase 5 | Continuous improvement loop | Automated pipeline | 🔲 Planned |

Engines

| Engine | What It Does | License |
|--------|--------------|---------|
| DSPy + GEPA | Reflective prompt evolution: reads execution traces and proposes targeted mutations | MIT |
| Darwinian Evolver | Code evolution with Git-based organisms | AGPL v3 (external CLI only) |
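The "Pareto" in Genetic-Pareto refers to keeping candidates that are not strictly worse than another candidate across the evaluation examples. The sketch below shows a generic Pareto-dominance filter over per-example scores. It illustrates the general idea only and is not GEPA's actual selection rule; the function name and the score values are made up.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, List[float]]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, own in scores.items()
        if not any(
            dominates(other, own)
            for other_name, other in scores.items()
            if other_name != name
        )
    ]


# One score per eval example for each candidate (illustrative numbers).
scores = {
    "seed":      [0.5, 0.5, 0.5],
    "variant_a": [0.9, 0.4, 0.5],
    "variant_b": [0.5, 0.8, 0.6],
    "variant_c": [0.4, 0.4, 0.5],
}
front = pareto_front(scores)  # ["variant_a", "variant_b"]
```

Here `variant_a` and `variant_b` both survive because each wins on a different example, while `seed` and `variant_c` are dominated. Keeping such complementary candidates preserves diversity that a single average score would hide.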

Guardrails

Every evolved variant must pass:

  1. Full test suite: pytest tests/ -q must pass 100%
  2. Size limits — Skills ≤15KB, tool descriptions ≤500 chars
  3. Caching compatibility — No mid-conversation changes
  4. Semantic preservation — Must not drift from original purpose
  5. PR review — All changes go through human review, never direct commit

Full Plan

See PLAN.md for the complete architecture, evaluation data strategy, constraints, benchmarks integration, and phased timeline.

License

MIT — © 2026 Nous Research