NousResearch/hermes-agent-self-evolution
⚒ Evolutionary self-improvement for Hermes Agent — optimize skills, prompts, and code using DSPy + GEPA
Hermes Agent Self-Evolution is an automated optimization framework designed to improve the performance of Hermes Agent without requiring GPU training. It utilizes DSPy and Genetic-Pareto Prompt Evolution (GEPA) to analyze execution traces and mutate skill files, tool descriptions, and system prompts based on reflective feedback. The system validates candidate variants through constraint gates, including size limits and full test suite passes, before proposing improvements via pull requests. It supports evaluation using both synthetic data and real session history from tools like Claude Code and Copilot.
- Optimizes agent skills and prompts using DSPy and GEPA engines
- Operates via API calls without requiring local GPU resources
- Enforces strict guardrails including size limits and 100% test pass rates
🧬 Hermes Agent Self-Evolution
Evolutionary self-improvement for Hermes Agent.
Hermes Agent Self-Evolution uses DSPy + GEPA (Genetic-Pareto Prompt Evolution) to automatically evolve and optimize Hermes Agent's skills, tool descriptions, system prompts, and code — producing measurably better versions through reflective evolutionary search.
No GPU training required. Everything operates via API calls: mutating text, evaluating results, and selecting the best variants. Each optimization run costs roughly $2-10.
How It Works
Read current skill/prompt/tool ──► Generate eval dataset
                                         │
                                         ▼
                                   GEPA Optimizer ◄── Execution traces
                                         │                    ▲
                                         ▼                    │
                                   Candidate variants ──► Evaluate
                                         │
                                   Constraint gates (tests, size limits, benchmarks)
                                         │
                                         ▼
                                   Best variant ──► PR against hermes-agent
GEPA reads execution traces to understand why things fail (not just that they failed), then proposes targeted improvements. GEPA was accepted as an ICLR 2026 Oral and is MIT licensed.
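The loop in the diagram can be illustrated with a toy sketch. This is not the GEPA or DSPy implementation: the LLM reflection step is replaced by a trivial rule, and every name here (`Trace`, `evaluate`, `reflect_and_mutate`, `passes_gates`, `evolve`) is hypothetical. It only shows the shape of the cycle: evaluate, read the failure feedback, mutate, gate, and keep the better variant.

```python
import random
from dataclasses import dataclass


@dataclass
class Trace:
    """One evaluation record: the task, its score, and why it scored that way."""
    task: str
    score: float
    feedback: str


def evaluate(skill_text: str, tasks: list[str]) -> list[Trace]:
    """Toy evaluator (assumption): a task passes if the skill text mentions it."""
    traces = []
    for task in tasks:
        hit = task in skill_text
        feedback = "ok" if hit else f"skill never mentions '{task}'"
        traces.append(Trace(task, 1.0 if hit else 0.0, feedback))
    return traces


def reflect_and_mutate(skill_text: str, traces: list[Trace]) -> str:
    """Stand-in for LLM reflection: patch the skill to address one failing trace."""
    failures = [t for t in traces if t.score == 0.0]
    if not failures:
        return skill_text
    target = random.choice(failures)
    return skill_text + f"\n- Always check {target.task}."


def passes_gates(skill_text: str, max_bytes: int = 15 * 1024) -> bool:
    """Only the size gate is modeled here; tests and benchmarks are omitted."""
    return len(skill_text.encode("utf-8")) <= max_bytes


def mean_score(traces: list[Trace]) -> float:
    return sum(t.score for t in traces) / len(traces)


def evolve(seed: str, tasks: list[str], iterations: int = 10) -> str:
    best, best_traces = seed, evaluate(seed, tasks)
    for _ in range(iterations):
        candidate = reflect_and_mutate(best, best_traces)
        if not passes_gates(candidate):
            continue  # rejected by a constraint gate
        candidate_traces = evaluate(candidate, tasks)
        if mean_score(candidate_traces) > mean_score(best_traces):
            best, best_traces = candidate, candidate_traces
    return best


if __name__ == "__main__":
    tasks = ["security", "tests", "style"]
    print(evolve("Review the diff for style.", tasks, iterations=5))
```

The key design point the sketch preserves is that the mutation step receives the traces, not just a scalar score, so each change targets a specific observed failure.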
Quick Start
# Install
git clone https://github.com/NousResearch/hermes-agent-self-evolution.git
cd hermes-agent-self-evolution
pip install -e ".[dev]"
# Point at your hermes-agent repo
export HERMES_AGENT_REPO=~/.hermes/hermes-agent
# Evolve a skill (synthetic eval data)
python -m evolution.skills.evolve_skill \
    --skill github-code-review \
    --iterations 10 \
    --eval-source synthetic
# Or use real session history from Claude Code, Copilot, and Hermes
python -m evolution.skills.evolve_skill \
    --skill github-code-review \
    --iterations 10 \
    --eval-source sessiondb
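For scripting several runs, the same command line can be assembled from Python. The sketch below only builds the argument list using the flags shown above; the helper name `build_evolve_command` is hypothetical, and restricting `--eval-source` to `synthetic` and `sessiondb` is an assumption based on the two values that appear in this README.

```python
import shlex

# Assumption: only the two eval sources shown in the Quick Start are accepted.
KNOWN_EVAL_SOURCES = {"synthetic", "sessiondb"}


def build_evolve_command(
    skill: str, iterations: int = 10, eval_source: str = "synthetic"
) -> list[str]:
    """Return the argv list for one evolve_skill run (hypothetical helper)."""
    if eval_source not in KNOWN_EVAL_SOURCES:
        raise ValueError(f"unexpected eval source: {eval_source!r}")
    return [
        "python", "-m", "evolution.skills.evolve_skill",
        "--skill", skill,
        "--iterations", str(iterations),
        "--eval-source", eval_source,
    ]


if __name__ == "__main__":
    for source in ("synthetic", "sessiondb"):
        # shlex.join quotes arguments so the line can be pasted into a shell.
        print(shlex.join(build_evolve_command("github-code-review", eval_source=source)))
```

The returned list can be passed to `subprocess.run(..., check=True)` from inside the cloned repository, with `HERMES_AGENT_REPO` set as in the Quick Start.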
What It Optimizes
| Phase | Target | Engine | Status |
|---|---|---|---|
| Phase 1 | Skill files (SKILL.md) | DSPy + GEPA | ✅ Implemented |
| Phase 2 | Tool descriptions | DSPy + GEPA | 🔲 Planned |
| Phase 3 | System prompt sections | DSPy + GEPA | 🔲 Planned |
| Phase 4 | Tool implementation code | Darwinian Evolver | 🔲 Planned |
| Phase 5 | Continuous improvement loop | Automated pipeline | 🔲 Planned |
Engines
| Engine | What It Does | License |
|---|---|---|
| DSPy + GEPA | Reflective prompt evolution — reads execution traces, proposes targeted mutations | MIT |
| Darwinian Evolver | Code evolution with Git-based organisms | AGPL v3 (external CLI only) |
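The "Pareto" in Genetic-Pareto refers to keeping a frontier of candidates rather than a single best-on-average variant. A minimal sketch of one simplified reading of that idea follows: a candidate stays on the frontier if it achieves the top score on at least one evaluation example. The function name, the score layout, and the variant names are all hypothetical, and this is an illustration rather than the selection code used by DSPy or GEPA.

```python
def pareto_frontier(scores: dict[str, list[float]]) -> set[str]:
    """Return candidates that achieve the top score on at least one example.

    `scores` maps a candidate name to its per-example scores; every list is
    assumed to have the same length and the same example order.
    """
    if not scores:
        return set()
    n_examples = len(next(iter(scores.values())))
    frontier: set[str] = set()
    for i in range(n_examples):
        best = max(per_example[i] for per_example in scores.values())
        frontier.update(
            name for name, per_example in scores.items() if per_example[i] == best
        )
    return frontier


if __name__ == "__main__":
    scores = {
        "v0": [0.9, 0.2, 0.5],  # best on example 0, tied best on example 2
        "v1": [0.4, 0.8, 0.5],  # best on example 1, tied best on example 2
        "v2": [0.3, 0.1, 0.4],  # never best, so it drops out
    }
    print(sorted(pareto_frontier(scores)))  # ['v0', 'v1']
```

Keeping `v1` even though `v0` has the higher average preserves a variant that handles a case the others miss, which later mutations can build on.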
Guardrails
Every evolved variant must pass:
- Full test suite: `pytest tests/ -q` must pass 100%
- Size limits: Skills ≤15KB, tool descriptions ≤500 chars
- Caching compatibility: No mid-conversation changes
- Semantic preservation: Must not drift from original purpose
- PR review: All changes go through human review, never direct commit
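The two mechanical gates in this list (size limits and the test suite) could be checked along these lines. This is a sketch under stated assumptions, not the project's gate code: the function names are hypothetical, "15KB" is interpreted as 15 × 1024 bytes of UTF-8, and the test gate simply treats a zero exit code from `pytest tests/ -q` as a 100% pass.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable

MAX_SKILL_BYTES = 15 * 1024   # assumption: KB means 1024 bytes
MAX_TOOL_DESC_CHARS = 500


def skill_within_limit(skill_text: str) -> bool:
    """Size gate for a skill file, measured in UTF-8 bytes."""
    return len(skill_text.encode("utf-8")) <= MAX_SKILL_BYTES


def tool_description_within_limit(description: str) -> bool:
    """Size gate for a tool description, measured in characters."""
    return len(description) <= MAX_TOOL_DESC_CHARS


def run_suite(repo: Path) -> bool:
    """Run `pytest tests/ -q` inside `repo`; exit code 0 means all tests passed."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", "tests/", "-q"], cwd=repo
    )
    return result.returncode == 0


def check_variant(
    skill_text: str, repo: Path, run_tests: Callable[[Path], bool] = run_suite
) -> list[str]:
    """Return the names of the gates that failed; an empty list means accept."""
    failures = []
    if not skill_within_limit(skill_text):
        failures.append("size")
    if not run_tests(repo):
        failures.append("tests")
    return failures
```

Passing the test runner as a parameter keeps the size checks usable on their own and lets the gate logic be exercised without a full repository checkout. The caching, semantic-preservation, and human-review gates are not modeled here.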
Full Plan
See PLAN.md for the complete architecture, evaluation data strategy, constraints, benchmarks integration, and phased timeline.
License
MIT — © 2026 Nous Research