OnlyTerp/hermes-optimization-guide

Name: hermes-optimization-guide
Author: OnlyTerp

Performance optimization guide for Hermes deployments

overview

The Hermes Optimization Guide is a technical playbook for deploying and performance-tuning the Hermes Agent. It provides runnable artifacts: configuration templates, installable skills, and a one-command VPS bootstrap script. Users can orchestrate multi-agent swarms and multiple model providers from the CLI, desktop apps, and chat platforms. The guide covers environments ranging from local NVIDIA hardware to cloud-based sandboxes.

Provides 13 installable skills and 5 opinionated config templates
Includes a one-command script for production VPS deployment
Supports model-agnostic orchestration across diverse hardware and platforms

full readme from github

Hermes Optimization Guide

Current through Hermes Agent v0.18.0 (v2026.7.1) — "The Judgment Release" · 27 parts, 13 installable guide skills, 5 opinionated configs, 4 reference architectures, one-command VPS bootstrap · Now covering Mixture-of-Agents as a first-class model, evidence-based verification + /goal completion contracts, /learn + /journey self-improvement, background subagent fan-out, the maturing Desktop app (Projects, memory graph, multi-terminal), iMessage via Photon (no Mac needed), the NVIDIA RTX / DGX Spark local-hardware story, and gateway scale-to-zero for teams. Bring any model — this guide is about the harness, not the weights.

Other languages: 中文 · 日本語

The End-to-End Hermes Guide — docs + runnable artifacts

Every part you need to go from fresh install to a production Hermes deployment — driven from the native desktop app, the CLI/TUI, a browser admin panel, or 25+ chat platforms (now including iMessage with no Mac required, via Photon). Orchestrate Claude Code / Codex / Gemini CLI through durable Kanban lanes and multi-agent swarms, plug into any MCP server, trace every call in Langfuse, let it curate its own skills, push heavy work onto disposable Modal/Daytona/Vercel sandboxes — or run the whole thing locally on your own GPU / NVIDIA DGX Spark. It's all model-agnostic: bring whatever weights you want, the guide is about the harness.

Unlike most guides, the prescriptions come with working files: skills/ you can ln -s into ~/.hermes/skills/, templates/config/ you cp to ~/.hermes/config.yaml, scripts/vps-bootstrap.sh that takes a fresh VPS to production in one command.
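
If you would rather script the `ln -s` step than type it per skill, here is a minimal Python sketch. It assumes each skill is a single entry directly under the repo's skills/ folder and that you run it from a clone of this repo; the function name `link_skills` is made up for illustration and is not part of Hermes.

```python
from pathlib import Path


def link_skills(repo_skills: Path, target: Path) -> list[str]:
    """Symlink every entry under repo_skills into target; return the names linked."""
    target.mkdir(parents=True, exist_ok=True)
    linked = []
    for entry in sorted(repo_skills.iterdir()):
        link = target / entry.name
        if link.exists() or link.is_symlink():
            continue  # never clobber a skill that is already installed
        link.symlink_to(entry.resolve(), target_is_directory=entry.is_dir())
        linked.append(entry.name)
    return linked
```

Calling `link_skills(Path("skills"), Path.home() / ".hermes" / "skills")` mirrors the manual step, and running it a second time links nothing new.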

Docs plus runnable artifacts — 27 guide parts, 13 installable skills, 5 config templates, 4 reference architectures, one-command VPS bootstrap, 8-question config wizard

By Terp — Terp AI Labs · Last updated July 1, 2026 · CHANGELOG · ROADMAP · ECOSYSTEM

Install

Pick the surface that fits you — they all drive the same agent, config, keys, sessions, and skills.

Easiest — the desktop app. Grab the Hermes Desktop installer for macOS/Windows/Linux (or run hermes desktop if you already have the CLI). First launch offers Quick Setup via Nous Portal — sign in, pick a model, start chatting. Full tour: Part 24: Hermes Desktop App.

Terminal — one line. macOS / Linux:

curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash

Windows (native, PowerShell):

iex (irm https://hermes-agent.nousresearch.com/install.ps1)

Server — one command to production. On a fresh Debian 12 / Ubuntu 24.04 box (Hetzner CX22 works great for ~$5/mo):

curl -sSL https://raw.githubusercontent.com/OnlyTerp/hermes-optimization-guide/main/scripts/vps-bootstrap.sh | sudo bash

This installs Hermes, Node.js, Caddy (auto-TLS reverse proxy), UFW, fail2ban, creates a non-root hermes user, drops in hardened systemd units, and symlinks every skill from this repo into ~hermes/.hermes/skills/. See scripts/vps-bootstrap.sh for what it does line by line — it's non-destructive and re-runnable.
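
If piping a script straight into `sudo bash` makes you nervous, one option is to download it first, read it, and only then run it. The sketch below is a generic pattern, not something the bootstrap script requires; `download_for_review` is a hypothetical helper name.

```python
import hashlib
import urllib.request
from pathlib import Path


def download_for_review(url: str, dest: Path) -> str:
    """Save url to dest (not executable) and return its SHA-256 hex digest."""
    with urllib.request.urlopen(url) as response:
        data = response.read()
    dest.write_bytes(data)
    dest.chmod(0o600)  # readable by you, deliberately not executable
    return hashlib.sha256(data).hexdigest()
```

Once you have read the saved file, run it yourself with `sudo bash <file>`. The digest lets you confirm that a later download is byte-for-byte the same script you reviewed.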

Prefer a 5-minute local-only setup? → docs/quickstart.md (zero to Telegram bot in 5 min).

Repo Map

Folder	What's in it
`skills/`	13 installable `SKILL.md` files. `ln -s` into `~/.hermes/skills/` and they're live.
`templates/config/`	5 opinionated `config.yaml` — minimum, telegram-bot, production, cost-optimized, security-hardened.
`templates/compose/`	Self-hosted Langfuse v3 stack (ClickHouse + MinIO + Redis).
`templates/caddy/`	Caddyfile reference (reverse proxy + auto TLS + HSTS).
`templates/systemd/`	Hardened `hermes.service` + `hermes-dashboard.service`.
`templates/cron/`	Recommended production cron schedule.
`scripts/vps-bootstrap.sh`	One-command fresh VPS → production Hermes.
`diagrams/`	6 Mermaid diagrams (architecture, MCP flow, delegation, sandbox sync, observability, security layers).
`assets/`	Banner art + the SVG infographics used across the guide (architecture, paths, timeline).
`benchmarks/`	Reproducible cost + latency table across 13 models × 5 tasks.
`docs/wizard/`	Interactive config wizard — 8 questions → ready-to-drop `config.yaml`. Runs in your browser.
`docs/reference-architectures/`	4 blueprints — Homelab, Solo Dev, Small Agency, Road Warrior. Full parts list + cost + install.
`docs/outreach/`	Launch tweet, HN post, upstream-PR body drafts (for people linking to this guide).
`docs/quickstart.md`	5-minute zero-to-Telegram-bot.
`ECOSYSTEM.md`	Curated directory of MCP servers, coding agents, dashboard plugins.
`ROADMAP.md` · `CHANGELOG.md` · `CONTRIBUTING.md`	The usual suspects.
README + `part1-.md` … `part26-.md`	The 27-part guide itself (now incl. MoA + verification, Desktop App, NVIDIA / local hardware).

Architecture at a glance

Hermes architecture — surfaces (desktop, CLI/TUI, web, 25+ chat platforms, cron) flow into the gateway (model router, approval layer, context engine, scale-to-zero), which fans out to any model, tools, memory, and observability

Prefer Mermaid? The same picture, editable:

flowchart LR
  subgraph Surfaces[Surfaces — one agent, many front ends]
    direction TB
    Desktop[Desktop app<br/>macOS · Windows · Linux]
    Term[CLI · TUI]
    Web[Web admin panel]
    Chat[25+ chat platforms<br/>Telegram · Discord · Slack<br/>Teams · LINE · WeChat · …]
  end
  Surfaces --> Gateway
  Gateway --> Router[Model Router<br/>cost + context + capability]
  Router --> Providers[Any provider / model<br/>Cloud APIs · OpenAI-compatible<br/>Local: llama.cpp · LM Studio · Ollama<br/>NVIDIA RTX · DGX Spark]
  Gateway --> Approval[Approval Layer<br/>denylist · allowlist · quarantine]
  Approval --> Tools[Tools<br/>Native · Tool Gateway · MCP<br/>Subagents · Coding Agents · Swarms]
  Tools --> Memory[Memory<br/>Vector · LightRAG · mem0]
  Tools --> Logs[(Audit log<br/>+ Langfuse / Helicone traces)]

Full set of diagrams: diagrams/architecture.md.
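
The diagram labels the Model Router with "cost + context + capability". As a rough illustration of what routing on those three signals can mean, here is a toy sketch. It is not Hermes' router: the `Model` fields, the prices, and the selection rule (cheapest model that fits) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                     # hypothetical label, not a real model id
    cost_per_mtok: float          # illustrative price per million tokens
    context_window: int           # maximum tokens the model accepts
    capabilities: frozenset[str]  # e.g. {"tools", "vision"}


def route(models: list[Model], prompt_tokens: int, needs: set[str]) -> Model:
    """Return the cheapest model that fits the prompt and has every needed capability."""
    eligible = [
        m for m in models
        if m.context_window >= prompt_tokens and needs <= m.capabilities
    ]
    if not eligible:
        raise LookupError("no configured model satisfies this request")
    return min(eligible, key=lambda m: m.cost_per_mtok)
```

Context and capability act as hard filters and cost only breaks ties among the survivors, so a cheap model is never picked for a job it cannot do.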

Pick Your Path

Pick your path — ten curated reading paths through the guide's 27 parts, from a 10-minute setup to production hardening, local GPU, and MoA verification

This guide grew to 27 parts because Hermes grew. Every part lives in its own file (part1-setup.md … part26-moa-verification.md); this README keeps a short summary of Parts 1–5 (plus the full SOUL.md personality section) and links out. You don't have to read them all — pick the shortest path to what you need:

🎯 "I just want it working in 10 minutes"

Skip the terminal: install the desktop app and let first-run Quick Setup via Nous Portal pick a model for you. Prefer the CLI? Part 1: Setup → Part 12: Web Dashboard and point-and-click the rest.

🛡️ "I'm worried about prompt injection (you should be)"

Part 19: Security Playbook — read this first if your agent reads any untrusted input (email, webhooks, Discord, public Telegram groups).

🖥️ "Just give me an app, not a terminal"

Part 24: Hermes Desktop App — download, Quick Setup, and drive everything from a real GUI: streaming chat, a Cmd+K command palette, drag-and-drop files, a model picker, and an optional connection to a remote Hermes box.

🔒 "Run it all locally on my own GPU"

Part 25: NVIDIA & Local Hardware — RTX / DGX Spark, OpenShell isolation, and a model-agnostic local stack (llama.cpp / LM Studio / Ollama) so your data never leaves the machine.

🧑‍⚖️ "I want an ensemble of frontier models — and proof the work is done"

Part 26: MoA, Verification & Self-Improvement — pick a Mixture-of-Agents council like a model, judge /goal completion against evidence, and steer what the agent learns with /learn + /journey.

What's New (July 2026)

Release timeline — v0.13 Tenacity, v0.14 Foundation, v0.15 Velocity, v0.16 Surface, v0.17 Reach, and the current v0.18 Judgment release

Two huge releases landed since the Surface refresh: v0.17.0 "Reach" (v2026.6.19) and v0.18.0 "The Judgment Release" (v2026.7.1). Combined: ~3,200 commits, ~1,800 merged PRs, 1,200+ issues closed, and, as of v0.18, every P0 and P1 issue in the entire Hermes repo resolved (~700 highest-priority items cleared in twelve days, with a standing commitment to keep the count at zero). None of it is model-specific; bring whatever weights you want.

v0.18.0 — "Judgment" (latest)

Mixture-of-Agents is a first-class model — every named MoA preset is a selectable virtual model under a moa provider in every picker (CLI/TUI/desktop/gateway). Each reference model's reasoning renders as its own labelled block, and the aggregator's answer streams live. /moa is now one-shot sugar. See Part 26.
The agent proves its work — verification evidence for coding tasks (run the project's checks, don't assert success), completion contracts for /goal, /goal wait <pid>, and a pre_verify hook. See Part 26.
/learn + /journey — distill a reusable skill from anything (/learn <dir|url|workflow>), and browse/edit/delete everything the agent has learned on a timeline. The desktop adds a playable memory graph. Background self-improvement now routes to an aux model and costs a fraction of before. See Part 26 and Part 7.
Background subagent fan-out — delegate_task dispatches parallel background subagents and returns one consolidated turn when all finish; your chat is never blocked. See Part 8.
Desktop becomes a coding cockpit — first-class per-profile Projects (sidebar, coding rail, review pane, worktree management), a multi-terminal panel, PR-style diffs in chat, and a conversation timeline rail. See Part 24.
Run it for a team — gateway scale-to-zero with drain coordination (no dropped in-flight turns), administrator-pinned managed scope from /etc/hermes, multiplexed profiles over one gateway, and cron continuations. See Part 26.
Google Vertex AI provider — Gemini through your GCP service account with auto-minted, auto-refreshed OAuth2 tokens (no static key). The Gemini-CLI OAuth providers were removed — see the migration note in Part 9.
Everyday wins — /prompt (compose in $EDITOR), /reasoning full, /timestamps, in-place compaction by default, Blank Slate setup mode, and a security round (MCP-config persistence hardening, cron credential-exfil blocks, Slack xapp- token redaction). See Part 26 and Part 19.
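
The "run the project's checks, don't assert success" idea from the list above comes down to recording what a command actually did instead of trusting a claim. A minimal sketch of that pattern follows; the `Evidence` type and `run_check` helper are illustrative names, and Hermes' real verification and `pre_verify` hook may work differently.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class Evidence:
    command: list[str]
    exit_code: int
    output: str

    @property
    def passed(self) -> bool:
        # Success is defined by the process exit code, not by anyone's say-so.
        return self.exit_code == 0


def run_check(command: list[str], timeout: float = 300.0) -> Evidence:
    """Run one project check and record what actually happened."""
    proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    return Evidence(command, proc.returncode, proc.stdout + proc.stderr)
```

A completion contract can then be phrased as "every `Evidence` in this list has `passed` set", which is something a reviewer can re-run and confirm.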

v0.17.0 — "Reach"

iMessage via Photon Spectrum — no Mac required — hermes photon login and Hermes lives in the blue bubbles; positioned as the successor to the BlueBubbles bridge. Plus an official WhatsApp Business Cloud API adapter and the Raft agent-network channel. See Part 15.
Background subagents — delegate_task(background=true) returns a handle immediately; the result re-enters the conversation when it finishes. See Part 8.
A much deeper desktop app — rebindable shortcuts, native OS notifications, live subagent watch-windows, any VS Code Marketplace theme, a resizable terminal pane, remote media relay, and per-thread drafts. See Part 24.
Dashboard grows up — a full profile builder (model + skills + MCPs from the browser), a rehauled Skills Hub (previews + security scans), and hardened dashboard auth. See Part 12.
image_generate learned to edit — image-to-image transforms across every provider; Automation Blueprints replace raw cron syntax with guided forms; the memory tool gained atomic batch operations; the Curator's LLM consolidation pass is now opt-in (routine curation costs zero tokens). See Part 22 and Part 7.
Telegram rich messages (Bot API 10.1, on by default), MCP elicitation (servers can prompt mid-tool-call on any surface), and Cursor's Composer model via xAI Grok OAuth. See Part 15, Part 17, and Part 9.
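
The background-subagent behaviour described above (a handle comes back immediately, the result arrives later, and a fan-out reports once when everything finishes) is the standard task/gather pattern. Here is a toy asyncio sketch of it. The `worker` coroutine stands in for a subagent, and none of this is Hermes' actual `delegate_task` implementation.

```python
import asyncio


async def worker(name: str, delay: float) -> str:
    """Stand-in for a subagent; a real one would call a model and tools."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def main() -> list[str]:
    events: list[str] = []
    # Fan out: create_task returns a handle at once and the work runs in the background.
    handles = [
        asyncio.create_task(worker(f"sub{i}", 0.01 * (i + 1))) for i in range(3)
    ]
    events.append("chat stays responsive")  # the foreground continues straight away
    # Consolidate: wait for every subagent, then report a single combined result.
    results = await asyncio.gather(*handles)
    events.append("consolidated: " + "; ".join(results))
    return events
```

`asyncio.gather` returns results in the order the handles were passed in, so the consolidated report is stable even when the workers finish out of order.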

v0.16.0 — "Surface"

Hermes Desktop — a native macOS/Windows/Linux app: streaming chat with live tool activity, a session list with archive/search, drag-and-drop files, clipboard image paste, a Cmd+K command palette, a model picker in the composer, a per-session YOLO toggle, and in-app self-update. It's "another surface over one agent, not a fork." See Part 24.
Remote backend — desktop and clients can connect to a remote Hermes gateway over a secure WebSocket (OAuth or username/password), with per-profile hosts, concurrent multi-profile sessions, and cross-profile @session links. Thin GUI local, heavy agent remote. See Part 24.
Browser admin panel — the web dashboard grew into a full admin panel: a Channels page that sets up every messaging platform from the browser, MCP catalog enable/disable, credentials, webhooks, memory config, and a System page with check-before-update and one-click Debug Share. See Part 12.
Quick Setup via Nous Portal — hermes portal opens a guided first-run that signs you in and picks a model; Quick Setup vs Full Setup paths on first launch. See Part 1.
/undo [N] — take back the last N turns and prefill your last message to edit and resend, with CLI / TUI / messaging parity. See Part 22.
Fuzzy model picker + default interface choice — type-to-filter model search across desktop/web/TUI/CLI, grouped multi-endpoint providers, an hourly-refreshed catalog, and a cli-or-tui default for hermes chat (with a --cli per-invocation override). See Part 22.
Leaner default skills — rarely-used bundled skills moved to optional, a new environments: relevance gate, and the Curator can now prune built-in skills. See Part 22.
NVIDIA Skills Hub tap — a built-in trusted Skills source alongside OpenAI/Anthropic/HuggingFace (CUDA-X, AIQ, cuOpt), part of the broader NVIDIA local-hardware story. See Part 25.
Security — CVE-2026-48710 Starlette pin, SSRF off-loop hardening, and subprocess credential stripping. See Part 19.

NVIDIA partnership — run it local

Hermes is now optimized for always-on local use on NVIDIA RTX PCs, RTX PRO workstations, and DGX Spark (128GB unified memory, ~1 petaflop of AI performance, runs 120B-class MoE models all day). Tensor Cores accelerate inference, there's a dedicated DGX Spark playbook, and OpenShell adds kernel-level isolation between the agent and your OS. It stays model-agnostic — bring any weights. See Part 25.

Earlier milestones (still relevant)

v0.15 "Velocity" — multi-agent swarms (hermes kanban swarm), the big perf wave (~4,500× faster free session_search), Brainworm/promptware defense, skill bundles, and ntfy as a messaging platform. See Part 23 and Part 19.
v0.14 "Foundation" — PyPI installs + lighter launch, Grok OAuth + 1M context, hermes proxy (OpenAI-compatible localhost), x_search, Teams/LINE/SimpleX, live /handoff, and the first native Windows support. See Part 23 and Part 13.
v0.13 "Tenacity" — durable multi-agent Kanban, /goal persistent objectives, Checkpoints v2, and no-agent cron. See Part 23.
v0.12 "Curator" — the autonomous Curator (hermes curator), a rubric-based self-improvement loop, a much wider provider menu, and a plugin-first gateway. See Part 22 and Part 9.
v0.11 "Interface" — the Ink TUI rewrite, a per-transport provider layer, native AWS Bedrock, and auxiliary-model routing for side tasks. See Part 22.

Fundamentals that haven't changed: the local web dashboard (hermes dashboard), the Tool Gateway + hermes proxy, Fast Mode (/fast) and guided compression (/compress <topic>), and the MCP + coding-agent + remote-sandbox developer stack. See Part 12, Part 13, Part 14, Part 17, Part 18, and Part 21.

Part	Topic	What it covers
1	Setup	Install Hermes, configure your provider, first-run walkthrough (with Android/Termux)
README	SOUL.md Personality	The Molty prompt, what good personality rules look like, how to fix a bland agent
2	OpenClaw Migration	Move your OpenClaw data, config, skills, and memory into Hermes
3	LightRAG (Graph RAG)	Set up a knowledge graph that actually understands relationships, not just text similarity
4	Telegram Bot	Connect Hermes to Telegram for mobile access, voice memos, and group chats
5	On-the-Fly Skills	Ask Hermes to create new skills that optimize your workflow automatically
6	Context Compression	Fix the silent context loss bug, configure compression thresholds, survive long sessions
7	Memory System	The three-tier memory architecture: persistent facts, conversation recall, procedural memory
8	Subagent Patterns	Orchestrator/worker delegation, ACP subagents, parallel task execution
9	Custom Model Providers	Grok/SuperGrok OAuth, Bedrock, Azure AI Foundry, Vertex AI, LM Studio, Codex OAuth, MoA presets, OpenRouter routing, model aliases, fallback chains
10	SOUL.md Anti-Patterns	What makes an agent annoying vs useful, the formula that works
11	Gateway Recovery	Crash detection, auto-recovery, common failure modes, health checks
12	Web Dashboard	hermes dashboard, browser Chat via real TUI, models/plugins tabs, config, keys, sessions, logs, analytics, cron
13	Tool Gateway, Local Proxy & Live Search	Nous-managed tools, hermes proxy, and x_search
14	Fast Mode & Background Watchers	/fast, /steer, /queue, watch_patterns, pluggable context engine, /compress <topic>
15	New Platforms (Teams, LINE, SimpleX, iMessage, WeChat, Android)	Teams end-to-end, LINE, SimpleX, Google Chat, QQBot, Yuanbao, BlueBubbles/iMessage, Weixin/WeCom, Android via Termux
16	Backup, Import & /debug	Portable hermes backup/import, /debug bundler, hermes debug share, security hardening
17	MCP Servers	The tool-protocol standard. stdio + HTTP transports, sampling, trust boundaries, server shortlist, writing your own
18	Delegating to Coding Agents	Claude Code Week 20+, Codex v0.133+, Gemini CLI v0.43, OpenCode, Aider, Zed ACP, print-mode, Kanban, git isolation
19	Security Playbook	Prompt-injection defense, provenance labels, approval layers, secrets redaction, MCP trust model, hardline blocks
20	Observability & Cost Control	Langfuse plugin, Helicone, OpenTelemetry → Phoenix, prompt-prefix caching, CDP spans, auxiliary routing, evals
21	Remote Sandboxes & Bulk File Sync	SSH, Modal, Daytona, Vercel Sandbox, Fly Machines, E2B. Diff-based sync-back on teardown
22	Latest Power Moves	Curator, TUI habits, context-file hygiene, plugins, dashboard Chat, cron chaining, and the 2026 upgrade checklist
23	Foundation + Tenacity Stack	PyPI/lazy deps, hermes proxy, /handoff, durable Kanban, /goal, Checkpoints v2, no-agent cron, worker lanes, multi-agent swarms, and the upgrade checklist
24	Hermes Desktop App	Native macOS/Windows/Linux GUI, Quick Setup, Cmd+K palette, Projects, multi-terminal, memory graph, remote gateway, multi-profile, voice, self-update
25	NVIDIA & Local Hardware	Run Hermes on your own GPU: RTX / DGX Spark, OpenShell isolation, NemoClaw, and a model-agnostic local stack
26	MoA, Verification & Self-Improvement	Mixture-of-Agents presets as models, /moa, completion contracts for /goal, /learn, /journey, background fan-out, scale-to-zero

The Problem

If you're running a stock Hermes setup (or migrating from OpenClaw), you're probably dealing with:

Installation confusion. The docs cover the basics but don't tell you what to configure first or what matters.
Lost knowledge from OpenClaw. You spent weeks building memory, skills, and workflows — now they're stuck in the old system.
Basic memory that can't reason. Vector search finds similar text but can't answer "what decisions led to X and who was involved?"
No mobile access. Sitting at a terminal is fine until you need to check something from your phone.
Repetitive prompting. You keep asking the agent to do the same multi-step task the same way, every time.

What This Fixes

After this guide:

Problem	Solution	Result
Fresh install	Step-by-step setup	Working agent in under 5 minutes
OpenClaw data stuck	Automated migration	Skills, memory, config all transferred
Shallow memory	LightRAG graph RAG	Entities + relationships, not just text chunks
Desktop only	Telegram integration	Chat from anywhere, voice memos, group support
Repetitive prompts	Agent-created skills	Agent saves workflows as reusable skills automatically

Prerequisites

A Linux/macOS machine (or WSL2 on Windows, or Android via Termux — see Part 15)
Python 3.11+ and Git
An API key for at least one LLM provider (Anthropic, OpenAI, OpenRouter, Nous Portal, etc.)
Optional: Ollama for local embeddings (free vector search)
Optional: a paid Nous Portal subscription for managed tools, or OAuth-backed Claude/OpenAI/xAI subscriptions if you plan to use hermes proxy

How the Pieces Fit Together

How the pieces fit together — you on any device, through the Hermes Agent's lean context, into modular layers (skills, memory, LightRAG, chat platforms), out to any LLM provider

The key insight: Everything is modular. Install what you need, skip what you don't. The agent adapts.

Quick Start

# 1. Install Hermes (Linux/macOS/WSL2/Android) — or grab the desktop app
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash

# 2. Configure providers and tools (or `hermes portal` for guided Quick Setup)
hermes setup

# 3a. Start chatting in the terminal (CLI or TUI)
hermes

# 3b. Or open the browser dashboard / admin panel
hermes dashboard

# 3c. Or launch the native desktop app
hermes desktop

The dashboard — and the new desktop app — are the fastest way to configure everything without touching YAML. See Part 12 and Part 24 for the full tours.

For the full walkthrough including optimization, read each part in order.

Part 1: Setup (Stop Fumbling With Installation)

From zero to working agent in under 5 minutes. Covers what the docs don't.

One command installs everything — curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash on Linux/macOS/WSL2/Android-Termux, a native PowerShell one-liner on Windows, or pip install hermes-agent for the leanest path. The full part covers what the installer actually does, the hermes setup first-run wizard (model picker, API keys, toolsets), the key hermes config set options (fallback models, agent.max_turns, prompt_caching.enabled, compression.enabled), the ~/.hermes/ file layout, and how to verify and update your install.
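
The option names above (`agent.max_turns`, `prompt_caching.enabled`, `compression.enabled`) use dotted paths. Assuming, as is common with YAML-backed tools, that each dot descends one level into config.yaml, the mapping looks like the sketch below. This illustrates the idea only; it is not how `hermes config set` is implemented, and the values used in examples are made up.

```python
from typing import Any


def set_dotted(config: dict[str, Any], key: str, value: Any) -> None:
    """Set config["a"]["b"] = value for key "a.b", creating sections as needed."""
    *sections, leaf = key.split(".")
    node = config
    for section in sections:
        node = node.setdefault(section, {})
        if not isinstance(node, dict):
            raise TypeError(f"{section!r} in {key!r} is not a section")
    node[leaf] = value
```

For example, `set_dotted(cfg, "agent.max_turns", 60)` on an empty dict leaves `cfg` as `{"agent": {"max_turns": 60}}`, which is the nesting you would expect to see in the YAML file under that assumption.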

Read the full part → Part 1: Setup

SOUL.md — Give Your Agent a Personality

SOUL.md is injected into every single message. It's the highest-impact file in your setup. A bad SOUL.md makes your agent sound like a corporate chatbot. A good one makes it actually useful to talk to.

What Belongs in SOUL.md

Put the stuff that changes how the agent feels to talk to:

Tone — direct, casual, formal, dry, whatever fits you
Opinions — the agent should have takes, not hedge everything
Brevity — enforce concise answers as a default
Humor — when it fits naturally, not forced jokes
Boundaries — what it should push back on
Bluntness level — how much sugarcoating to skip

Do NOT turn SOUL.md into:

A life story
A changelog
A security policy dump
A giant wall of vibes with no behavioral effect

Short beats long. Sharp beats vague.

The Molty Prompt

Originally from OpenClaw's SOUL.md guide. Adapted for Hermes with permission/credit. Paste this into your chat with the agent and let it rewrite your SOUL.md:

Read your SOUL.md. Now rewrite it with these changes:

You have opinions now. Strong ones. Stop hedging everything with "it depends" — commit to a take.

Delete every rule that sounds corporate. If it could appear in an employee handbook, it doesn't belong here.

Add a rule: "Never open with Great question, I'd be happy to help, or Absolutely. Just answer."

Brevity is mandatory. If the answer fits in one sentence, one sentence is what I get.

Humor is allowed. Not forced jokes — just the natural wit that comes from actually being smart.

You can call things out. If I'm about to do something dumb, say so. Charm over cruelty, but don't sugarcoat.

Swearing is allowed when it lands. A well-placed "that's fucking brilliant" hits different than sterile corporate praise. Don't force it. Don't overdo it. But if a situation calls for a "holy shit" — say holy shit.

Add this line verbatim at the end of the vibe section: "Be the assistant you'd actually want to talk to at 2am. Not a corporate drone. Not a sycophant. Just... good."

Save the new SOUL.md. Welcome to having a personality.

What Good Looks Like

Good SOUL.md rules:

have a take
skip filler
be funny when it fits
call out bad ideas early
stay concise unless depth is actually useful

Bad SOUL.md rules:

maintain professionalism at all times
provide comprehensive and thoughtful assistance
ensure a positive and supportive experience

That second list is how you get mush.

Why This Works

This lines up with OpenAI's prompt engineering guidance: high-level behavior, tone, goals, and examples belong in the high-priority instruction layer, not buried in the user turn. SOUL.md is that layer. It's the system-level personality instruction, which models are generally trained to weight above the user turn.

If you want better personality, write stronger instructions. If you want stable personality, keep them concise and versioned.

One warning: Personality is not permission to be sloppy. Keep your operational rules in AGENTS.md. Keep SOUL.md for voice, stance, and style. If your agent works in shared channels or public replies, make sure the tone still fits the room. Sharp is good. Annoying is not.

Keep it under 1 KB. Every byte in SOUL.md costs tokens on every message. The most effective SOUL.md files are 500-800 bytes of dense, high-signal personality instructions.
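
A quick way to hold yourself to that budget is to measure the file in bytes rather than characters. A minimal sketch, where you pass in whatever path your SOUL.md lives at and `soul_size_report` is an illustrative name:

```python
from pathlib import Path

SOUL_BUDGET_BYTES = 1024  # the "under 1 KB" ceiling from this section


def soul_size_report(path: Path, budget: int = SOUL_BUDGET_BYTES) -> tuple[int, bool]:
    """Return (size in bytes, whether it is under the budget) for a SOUL.md file."""
    size = len(path.read_bytes())  # bytes, not characters: accents and emoji cost more
    return size, size < budget
```

Counting bytes matters because a file of 600 accented characters is 1,200 bytes in UTF-8, well over budget even though it looks short.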

Part 2: OpenClaw Migration (Don't Leave Your Knowledge Behind)

Transfer your skills, memory, config, and personality from OpenClaw to Hermes in one command.

hermes claw migrate moves your SOUL.md, AGENTS.md, memory files (merged and deduped), user profile, skills, model config, and provider keys from ~/.openclaw/ into Hermes automatically. The full part covers the --dry-run preview, presets (full vs user-data), skill-conflict handling (skip/overwrite/rename), the complete config-key mapping table, what doesn't transfer (session transcripts, cron jobs, plugin configs), and troubleshooting.
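
To make the three skill-conflict choices (skip/overwrite/rename) concrete, here is a toy decision function. The mode names come from the text above. Everything else, including the numeric suffix used for renames, is a hypothetical illustration and not the behaviour of `hermes claw migrate`.

```python
from typing import Optional


def resolve_skill_name(existing: set[str], name: str, mode: str) -> Optional[str]:
    """Return the name to install under, or None to leave the existing skill alone."""
    if name not in existing:
        return name  # no conflict, install as-is
    if mode == "skip":
        return None
    if mode == "overwrite":
        return name
    if mode == "rename":
        n = 2
        while f"{name}-{n}" in existing:
            n += 1
        return f"{name}-{n}"  # suffix scheme is hypothetical
    raise ValueError(f"unknown conflict mode: {mode!r}")
```

Whatever the real tool does, running it with `--dry-run` first is the safe way to see which skills would collide before anything is written.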

Read the full part → Part 2: OpenClaw Migration

Part 3: LightRAG — Graph RAG That Actually Works

From "find similar text" to "reason about relationships." The single biggest intelligence upgrade you can make.

Vector search finds what's similar; graph RAG finds what's connected. LightRAG (HKU, EMNLP 2025) builds a knowledge graph — entities and relationships — alongside your vector DB and searches both at once, for a fraction of Microsoft GraphRAG's cost. The full part covers installation, entity-extraction model choice (Kimi K2.6 for quality, Cerebras GPT OSS 120B for speed, local Ollama for free), embeddings (Fireworks Qwen3-Embedding-8B vs local nomic-embed-text), running and securing the REST server, ingestion, the four query modes (naive/local/global/hybrid), a ready-to-use Hermes skill, and tuning tips.
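
To see why "connected" is a different question from "similar", here is a toy graph lookup. It does not use LightRAG or its API. The facts in `EDGES` are invented, and the breadth-first search only illustrates the kind of relationship chain a knowledge graph can return and a similarity search cannot.

```python
from collections import deque

# Hypothetical extracted facts: (entity, relation, entity).
EDGES = [
    ("Alice", "proposed", "Migration plan"),
    ("Migration plan", "approved by", "Bob"),
    ("Migration plan", "led to", "Outage"),
]


def find_chain(edges, start, goal):
    """Breadth-first search for the shortest chain of relations from start to goal."""
    neighbours = {}
    for src, rel, dst in edges:
        neighbours.setdefault(src, []).append((rel, dst))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, chain = queue.popleft()
        if node == goal:
            return chain
        for rel, nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, chain + [(node, rel, nxt)]))
    return None  # no directed path between the two entities
```

`find_chain(EDGES, "Alice", "Outage")` walks Alice → Migration plan → Outage, which answers "what led to the outage and who was involved" even though the words "Alice" and "Outage" never appear in the same fact.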

Read the full part → Part 3: LightRAG Setup

Part 4: Telegram Setup (Chat From Anywhere)

Connect Hermes to Telegram for mobile access, voice memos, group chats, and scheduled task delivery.

Telegram is the most battle-tested of the 25+ messaging adapters: text, voice memos (auto-transcribed), image analysis, file attachments, inline confirmation buttons, and cron delivery straight to your phone. The full part walks through creating a bot with @BotFather, the privacy-mode gotcha that breaks group chats, finding your numeric user ID, hermes gateway setup, webhook mode for cloud deployments (with a proper random secret), multi-user setup, and troubleshooting.
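
For the "proper random secret" in webhook mode, the sketch below generates one that fits Telegram's documented `secret_token` rules (1-256 characters from A-Z, a-z, 0-9, "_" and "-"). How Hermes stores or names this setting is not shown here, and the helper names are illustrative.

```python
import hmac
import re
import secrets

# Telegram's setWebhook accepts a secret_token of 1-256 characters
# drawn from A-Z, a-z, 0-9, "_" and "-".
_ALLOWED = re.compile(r"[A-Za-z0-9_-]{1,256}")


def new_webhook_secret(num_bytes: int = 32) -> str:
    """Generate a URL-safe random secret from the OS CSPRNG."""
    token = secrets.token_urlsafe(num_bytes)
    if not _ALLOWED.fullmatch(token):
        raise ValueError("secret does not fit Telegram's secret_token rules")
    return token


def header_matches(received: str, expected: str) -> bool:
    """Constant-time comparison for the X-Telegram-Bot-Api-Secret-Token header."""
    return hmac.compare_digest(received.encode(), expected.encode())
```

Telegram echoes the secret back in the `X-Telegram-Bot-Api-Secret-Token` header on every webhook call, so a receiver can reject any request whose header does not match.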

Read the full part → Part 4: Telegram Setup

Part 5: On-the-Fly Skills (Let Hermes Build Its Own Playbook)

Ask Hermes to create a new skill, and it saves the workflow permanently — no manual file editing needed.

Skills are procedural knowledge: how-to guides the agent loads on demand at zero idle token cost (memory is for facts; skills are for workflows). Hermes creates them itself — after a complex task it offers to save the steps, pitfalls, and verification as a reusable SKILL.md, and you can ask for one directly ("create a skill for deploying Docker containers"). The full part covers the creation workflow, the SKILL.md format and directory structure, slash-command and automatic loading, managing/updating skills, the v0.12 Curator that keeps the library from rotting, real-world examples, and tips for writing skills that stay useful.

Read the full part → Part 5: On-the-Fly Skills

You've now got the full picture — setup, migration, graph memory, mobile access, and self-improving workflows. From here, Part 22: Latest Power Moves, Part 23: Tenacity Stack, Part 24: Desktop App, and Part 25: NVIDIA & Local Hardware round out the modern stack. Start with setup, add what you need, and let Hermes build the rest.

Note: Based on the official Hermes Agent documentation and real production usage. No private credentials, API keys, or personal data included.