alias8818/hermes-tool-slimmer
Reduce Hermes Agent tool-schema overhead with keyword selection and Tool Search support
Hermes Tool Slimmer is a context optimization plugin that reduces prompt overhead by dynamically selecting the most relevant tools for each agent turn. It builds an index of available tool schemas and uses BM25 ranking with keyword boosts to filter large toolsets down to a concise, relevant subset. This approach significantly lowers token consumption in environments with dozens of native or MCP tools while maintaining access to essential safety and file-system utilities. The project includes a dashboard for monitoring estimated token savings and provides a fail-safe mechanism that reverts to the full schema list if selection errors occur.
- Reduces tool-schema token overhead using BM25 ranking and keyword selection
- Supports native Hermes tools and Model Context Protocol (MCP) integrations
- Includes a dashboard plugin for real-time token savings and diagnostics
Hermes Tool Slimmer

Hermes Tool Slimmer reduces repeated tool-schema overhead by selecting the smallest useful tool set for a turn. It builds an indexable corpus from Hermes tool schemas, ranks candidate tools with local BM25 plus explicit boosts, and fails open to the original schema list when anything goes wrong.
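The ranking step described above can be illustrated with a small, self-contained BM25 scorer. This is a minimal sketch and not the plugin's actual implementation: the function names, the tokenizer, the additive form of the boost, and the k1/b defaults are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on non-alphanumerics, so "search_files" -> ["search", "files"].
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query, docs, boosts=None, k1=1.5, b=0.75):
    """Rank docs (tool name -> descriptive text) against a query.

    Hypothetical helper: scores each doc with BM25 and adds an optional
    per-tool boost. Returns (name, score) pairs, best first.
    """
    boosts = boosts or {}
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = (sum(len(t) for t in tokenized.values()) / max(n_docs, 1)) or 1.0

    # Document frequency: in how many docs each term appears.
    df = Counter()
    for toks in tokenized.values():
        df.update(set(toks))

    q_terms = set(tokenize(query))
    scores = {}
    for name, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in q_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores[name] = score + boosts.get(name, 0.0)

    # Sort by score descending, then by name so ties are deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


docs = {
    "search_files": "Search files in the repository by pattern",
    "browser_navigate": "Open a URL in the browser",
    "terminal": "Run a shell command",
}
ranked = bm25_rank("search this repo for MCP registration code", docs)
```

Note that the query word "repo" does not match "repository" here; only the literal token "search" contributes to the score.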
Why
Large Hermes installations can expose dozens of native and MCP tools. A 57-tool schema catalog can serialize to roughly 73 KB, or about 18K prompt tokens under the documented bytes / 4 estimate. Selecting 8-12 relevant tools for a repository-search turn can reduce that to about 15 KB (roughly 3.7K tokens by the same estimate) while keeping configured safety tools hot.
Tool slimming is only a schema-selection optimization. It must not bypass Hermes approval prompts, tool execution controls, provider auth, disabled toolsets, or any runtime safety policy.
What The Numbers Mean
The dashboard reports estimated schema tokens saved, not guaranteed billable-token savings. The estimate is computed from serialized tool-schema JSON bytes divided by 4 before and after selection. Provider tokenizers, prompt formatting, cache behavior, system prompts, conversation history, and model-specific tool serialization can make actual input-token and billing deltas differ.
The metric is still useful because it measures the repeated tool-catalog payload that Tool Slimmer removes from each request. Treat it as a consistent operational estimate for schema overhead, not as an invoice-grade accounting number.
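The bytes / 4 estimate can be reproduced in a few lines. The sketch below is illustrative only: the function names are hypothetical, and the exact JSON serialization used by the plugin (separators, key order, encoding) is an assumption, so absolute numbers may differ slightly from the dashboard.

```python
import json


def tokens_from_bytes(n_bytes):
    # The documented heuristic: serialized bytes divided by 4.
    return n_bytes // 4


def estimate_schema_tokens(schemas):
    # Assumption: compact UTF-8 JSON; the real serializer may differ.
    payload = json.dumps(schemas, separators=(",", ":"), ensure_ascii=False)
    return tokens_from_bytes(len(payload.encode("utf-8")))


def reduction_percent(before_tokens, after_tokens):
    # Share of the estimated schema tokens removed by selection.
    if before_tokens <= 0:
        return 0.0
    return 100.0 * (before_tokens - after_tokens) / before_tokens


# Worked example with the README's figures, taking 1 KB as 1000 bytes.
before = tokens_from_bytes(73_000)  # 18250
after = tokens_from_bytes(15_000)   # 3750
saved = reduction_percent(before, after)
```

With those inputs the estimated reduction is a little under 80 percent, which is an estimate of schema overhead and not a billing figure.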
Dashboard headline totals count real Hermes session events by default. Probe events without a session_id are excluded from headline savings and remain available through the dashboard API's all_summary field for audits.
Install
Hermes Tool Slimmer v0.4.0+ is the supported line for Hermes Agent v0.14.0. Older Tool Slimmer releases can load as dashboard/diagnostic plugins on v0.14.0, but they do not provide active schema slimming because Hermes moved the request construction path.
On Hermes builds with dashboard plugin repair support, you can install from the dashboard Plugins page by pasting:
alias8818/hermes-tool-slimmer
That path clones the repo to ~/.hermes/plugins/tool-slimmer, runs the same deterministic repair installer with --no-restart, and preserves the git checkout so the dashboard Update button can use git pull later. Restart the gateway after dashboard install or update so active schema slimming uses the patched selector hook.
From a terminal on the machine that runs Hermes:
cd /tmp
git clone https://github.com/alias8818/hermes-tool-slimmer.git
cd hermes-tool-slimmer
Then run the installer:
scripts/install-hermes-tool-slimmer.sh
That handles the package install, dashboard plugin copy, Hermes plugin enablement, selector-hook patch, service restart, and final health report. The core patcher supports both the older monolithic run_agent.py Hermes layout and the newer v0.14.0 modular agent/conversation_loop.py plus agent/chat_completion_helpers.py layout.
Verify it worked:
hermes tool-slimmer doctor
If an agent or hosted approval layer blocks direct script execution, run the same installer from a normal terminal, or ask the agent to request approval for this exact command after the repo is downloaded:
bash /tmp/hermes-tool-slimmer/scripts/install-hermes-tool-slimmer.sh
If the repo was unpacked somewhere else, replace /tmp/hermes-tool-slimmer with that directory. A block at this step means the environment denied running the script; it does not mean Hermes config or Tool Slimmer source is broken.
If the machine has multiple hermes launchers, use the Hermes venv launcher:
HERMES_BIN="$HOME/.hermes/hermes-agent/venv/bin/hermes" bash /tmp/hermes-tool-slimmer/scripts/install-hermes-tool-slimmer.sh
This avoids installing the package into one Python environment while running Hermes from another.
If Hermes Agent is doing the install for you, give it this instruction:
Install Hermes Tool Slimmer from https://github.com/alias8818/hermes-tool-slimmer.
After downloading the repo, run:
HERMES_BIN="$HOME/.hermes/hermes-agent/venv/bin/hermes" bash /tmp/hermes-tool-slimmer/scripts/install-hermes-tool-slimmer.sh
If the environment asks for approval to run that script, request approval for that exact command.
Then verify with:
$HOME/.hermes/hermes-agent/venv/bin/hermes tool-slimmer doctor
For a guided setup, see docs/quickstart.md. For the Hermes dashboard page, see docs/dashboard-plugin.md.
The dashboard includes a Tool Index panel with a one-click Rebuild From Hermes Tools action, indexed-tool preview, path, checksum, and last-updated time. The persisted index is for inspection and troubleshooting; live slimming ranks the current request's Hermes schemas in memory.
For a plain-English health report:
scripts/troubleshoot-hermes-tool-slimmer.sh
For local development:
pip install -e ".[dev]"
pytest
Quality Gates
The repository ships focused unit and integration tests for selector behavior, config validation, metrics accounting, dashboard API routes, and provider fallback behavior. Run the same checks used by CI locally:
ruff check .
python -m compileall -q src tests dashboard-plugin/tool-slimmer
pytest -q
Configure
plugins:
  enabled:
    - tool-slimmer

tool_slimmer:
  enabled: true
  mode: keyword # eager | keyword | hybrid | anthropic_tool_search
  top_k: 8 # selected after always_include
  always_include: [terminal, read_file, write_file, patch, search_files]
  never_defer: [terminal, read_file]
  include_mcp_tools: true
  include_native_tools: true
  log_decisions: true
  min_total_tools: 0
  min_estimated_reduction_percent: 5.0
  aliases:
    browse: [browser, navigate, url, website]
  fail_open: true # selector errors preserve the original full schema list
  dry_run: false # true logs/injects diagnostics but does not alter schemas
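As an illustration of how the aliases entry might feed keyword selection, the sketch below expands a query with the configured alias terms before ranking. The function name and the expansion direction (a query token that matches an alias key pulls in that key's terms) are assumptions; the plugin's real expansion logic may differ.

```python
import re


def expand_query(query, aliases):
    """Return query tokens plus alias terms for any token that is an alias key.

    Hypothetical helper: stored tool schemas are never touched, only the
    list of query terms used for ranking grows.
    """
    tokens = re.findall(r"[a-z0-9_]+", query.lower())
    expanded = list(tokens)
    for token in tokens:
        for extra in aliases.get(token, []):
            if extra not in expanded:  # keep order stable and avoid duplicates
                expanded.append(extra)
    return expanded


aliases = {"browse": ["browser", "navigate", "url", "website"]}
terms = expand_query("Browse the docs", aliases)
```

Here terms becomes the three original tokens followed by the four alias terms, so a tool whose description mentions "navigate" or "url" can match a prompt that only says "browse".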
Commands
hermes tool-slimmer status
hermes tool-slimmer doctor
hermes tool-slimmer index rebuild --schemas examples/tools.yaml
hermes tool-slimmer index show --top 20
hermes tool-slimmer select "search this repo for MCP registration code" --schemas tools.yaml
hermes tool-slimmer benchmark --prompts examples/prompts.yaml --schemas examples/tools.yaml
hermes tool-slimmer eval --prompts examples/prompts.yaml --schemas examples/tools.yaml
hermes tool-slimmer eval --prompts examples/prompts.yaml --schemas examples/tools.yaml --markdown
hermes tool-slimmer analyze-config
hermes tool-slimmer privacy
hermes tool-slimmer recommend-config
Slash commands:
/tool-slimmer status
/tool-slimmer select search this repo for MCP registration code
/tool-slimmer dry-run on
/tool-slimmer dry-run off
Provider behavior
| Provider path | Behavior |
|---|---|
| Anthropic native | Tool Search/defer loading if mode: anthropic_tool_search and Hermes core supports the required request serialization/headers. |
| Bedrock/Vertex/Azure Anthropic | Attempt only when the Hermes provider stack supports the Anthropic Tool Search path for that provider/model. |
| OpenRouter/OpenAI/local | Fall back to deterministic keyword selection, hybrid when implemented, or eager mode according to config; do not send Anthropic-only Tool Search definitions. |
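The table can be read as a small decision function. The sketch below is an assumption-laden illustration: the provider identifiers, the function name, and the fallback_mode parameter are hypothetical and do not come from the plugin's source.

```python
# Hypothetical provider identifiers for Anthropic-capable paths.
ANTHROPIC_FAMILY = {"anthropic", "bedrock", "vertex", "azure"}


def effective_mode(provider, configured_mode, core_supports_tool_search,
                   fallback_mode="keyword"):
    """Pick the mode to actually use for one provider request."""
    if configured_mode != "anthropic_tool_search":
        # eager, keyword, and hybrid are provider-independent.
        return configured_mode
    if provider in ANTHROPIC_FAMILY and core_supports_tool_search:
        return "anthropic_tool_search"
    # Never send Anthropic-only Tool Search definitions elsewhere.
    return fallback_mode
```

For example, effective_mode("openrouter", "anthropic_tool_search", True) yields "keyword", while the same call with "anthropic" keeps "anthropic_tool_search".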
Integration status
The standalone plugin registers diagnostics tools, the full-tool fallback tool, slash commands, CLI commands, a short pre_llm_call fallback instruction, and a select_tool_schemas callback when Hermes core supports it.
Supported/target core surfaces:
- ctx.register_tool_schema_selector(callback)
- ctx.register_schema_selector(callback)
- ctx.register_hook("select_tool_schemas", callback)
If none exists, active schema slimming requires the installer/core patch to add select_tool_schemas before provider request construction. Without that core hook, the plugin remains useful for dashboard visibility, dry-run diagnostics, benchmarking, and configuration recommendations, but it cannot reduce provider request schemas. See docs/hermes-core-selector-hook.patch for a minimal upstreamable Hermes core patch artifact based on current v0.14.0 source inspection.
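A plugin could probe for these surfaces in order and report which one it used. This is a minimal sketch, assuming only the three method names listed above; the register_selector helper and its return values are hypothetical, and a real ctx.register_hook may accept the call without Hermes core ever firing select_tool_schemas.

```python
def register_selector(ctx, callback):
    """Register callback on the first selector surface that ctx exposes.

    Returns the name of the surface used, or None when no surface exists
    (in which case active slimming would need the core patch).
    """
    for method_name in ("register_tool_schema_selector", "register_schema_selector"):
        method = getattr(ctx, method_name, None)
        if callable(method):
            method(callback)
            return method_name

    register_hook = getattr(ctx, "register_hook", None)
    if callable(register_hook):
        # Generic hook API: the hook name is passed as the first argument.
        register_hook("select_tool_schemas", callback)
        return "register_hook"

    return None
```

A None result is the signal to fall back to dashboard visibility and dry-run diagnostics only.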
Safety model
- always_include tools are selected first when present and not already disabled by Hermes.
- tool_slimmer_request_full_tools is always kept available when Hermes has registered it. If a skill or task needs a hidden tool, the model can call it to make the next provider request use the full schema list instead of inventing a substitute workflow.
- top_k applies after always_include; always-included tools do not count against the top_k budget.
- top_k: 0 is treated as an explicit request to select no ranked tools, so it does not fail open to the full catalog.
- disabled_tools, disabled_toolsets, include_mcp_tools, and include_native_tools are respected before ranking.
- min_total_tools skips catalogs with fewer than that many tools before ranking; equality is allowed to slim. The default is 0 so subagents and restricted toolsets still get ranked.
- min_estimated_reduction_percent fails open after ranking if the estimated schema reduction is too small to justify altering the request. In anthropic_tool_search mode, this guardrail is measured against the hot tool set because deferred tools are discoverable rather than eagerly loaded.
- fail_open: true sends the original schema list on selector errors.
Keyword mode is intentionally mostly literal. It includes a small deterministic synonym map for common operation words such as browsing/navigation, but tool-specific synonyms should still be added to tool descriptions or handled by a semantic selector mode when available.
- aliases extends keyword query expansion deterministically; aliases affect ranking and score details but do not rewrite stored tool schemas.
- hybrid mode keeps BM25 ranking and adds a deterministic fuzzy-token boost for close spelling/wording misses.
- The standalone tool_slimmer_select tool uses provided schemas first, live Hermes tool definitions second, and the persisted index as a final fallback.
- dry_run: true logs decisions and returns None to preserve original behavior.
- Anthropic Tool Search helpers never defer every tool.
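The guardrails in this section can be sketched as a single selection function. This is an illustrative sketch and not the plugin's code: the select_schemas name, the flat {"name", "description"} schema shape, the rank callable, and the serialization used for the estimate are all assumptions. Filtering of disabled tools is assumed to happen before this function is called.

```python
import json

FULL_TOOLS = "tool_slimmer_request_full_tools"


def estimate_tokens(schemas):
    # bytes / 4 heuristic over serialized JSON (serialization is an assumption).
    return len(json.dumps(schemas).encode("utf-8")) // 4


def select_schemas(schemas, rank, config):
    """Return a slimmed list, the original list, or None in dry-run mode.

    rank is a hypothetical callable: schemas -> tool names, best first.
    """
    try:
        if len(schemas) < config.get("min_total_tools", 0):
            return schemas  # catalog too small to bother slimming

        by_name = {s["name"]: s for s in schemas}
        hot = [n for n in config.get("always_include", []) if n in by_name]
        if FULL_TOOLS in by_name and FULL_TOOLS not in hot:
            hot.append(FULL_TOOLS)  # keep the full-tool fallback reachable

        # top_k budget applies only to ranked tools, after always_include.
        ranked = [n for n in rank(schemas) if n in by_name and n not in hot]
        chosen = hot + ranked[: config.get("top_k", 8)]
        selected = [by_name[n] for n in chosen]

        before, after = estimate_tokens(schemas), estimate_tokens(selected)
        reduction = 100.0 * (before - after) / before if before else 0.0
        if reduction < config.get("min_estimated_reduction_percent", 5.0):
            return schemas  # saving too small to justify altering the request

        if config.get("dry_run", False):
            return None  # decision would be logged; original behavior preserved
        return selected
    except Exception:
        if config.get("fail_open", True):
            return schemas  # selector errors keep the full schema list
        raise
```

With top_k set to 0 the ranked slice is empty, so only the always-included tools (and the fallback tool, when registered) are returned, which matches the explicit "select no ranked tools" behavior described above.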
Public release contents
- docs/quickstart.md: install, dry-run, and activation walkthrough.
- docs/hermes-core-integration.md: required Hermes core selector hook contract.
- docs/hermes-core-selector-hook.patch: minimal upstreamable Hermes core patch artifact.
- docs/anthropic-tool-search.md: provider capability notes for Anthropic Tool Search.
- docs/privacy.md: decision log field inventory and privacy notes.
- docs/reports/latest-eval.md: reproducible example evaluation report.
- docs/troubleshooting.md: common operational issues.
- examples/: sample config, prompts, schemas, and expected output.
Release validation
This repository is release-ready only when these checks pass:
ruff check .
mypy src tests
python -m compileall -q src tests
pytest -q
python -m build
When changing the Hermes core patch, also run the validation steps in docs/release-checklist.md.