hermes-agent-camel

Hermes Agent fork with integrated CaMeL trust boundaries

β˜… 16 Python MIT Updated 3/30/2026
View on GitHub β†’

Hermes Agent + CaMeL

A publishable Hermes fork with CaMeL trust boundaries integrated into the runtime.

This fork is designed for operators who want the Hermes agent loop to distinguish between:

Sensitive tools are authorized against a trusted operator plan instead of instructions embedded in untrusted content.

Research Provenance

This fork is inspired by Google Research's CaMeL paper and reference repository:

This repository does not aim to reproduce the Google research stack exactly, and it does not present itself as a benchmark-equivalent implementation of that repo.

This repository was implemented directly within Hermes and does not vendor Google source code unless explicitly noted in future changes.

Instead, it adapts the core boundary-setting ideas from the paper to Hermes' existing runtime model:

In other words, the design is research-inspired, but the implementation and problem framing are specific to Hermes.

What This Fork Changes

This fork adds a runtime security layer centered on agent/camel_guard.py and the Hermes tool loop.

Main additions:

Sensitive capabilities gated by this integration include:

Read-only actions such as send_message(action="list") and cronjob(action="list") remain allowed.

Threat Model

This fork is built to reduce indirect prompt injection risk in the normal Hermes workflow.

The target attack pattern is:

  1. Hermes retrieves untrusted content from the web, a browser session, a file, session recall, or an MCP server.
  2. That content contains hidden or explicit instructions such as "ignore previous instructions", "send a message", or "run this terminal command".
  3. The model attempts to treat that content as control rather than evidence.
  4. Hermes blocks the side effect unless the trusted operator plan explicitly authorizes that capability.

Architecture

1. Trusted operator plan

Hermes derives trusted control from real user turns only. Synthetic system-control turns do not pollute the trusted plan.

2. Untrusted data channel

Tool outputs are treated as untrusted data by default and wrapped with provenance metadata before they re-enter model context.

3. Security envelope

Every turn includes a compact CaMeL security envelope describing the trusted goal, authorized capabilities, and current untrusted source inventory.

4. Capability gating

Side-effecting tools are checked against the trusted operator plan before execution.

5. Provider hygiene

Internal CaMeL metadata is removed before messages are sent to the configured model provider.

Validation

Hermes runtime compatibility

Validated against the Hermes runtime suite:

pytest -q tests/agent/test_camel_guard.py tests/test_run_agent.py

Result:

Paper-aligned indirect injection benchmark

A Hermes-specific micro-benchmark aligned to the CaMeL paper/repo important_instructions attack shape was also run.

Observed outcomes:

Detailed notes:

Install

Fresh install from this fork

curl -fsSL https://raw.githubusercontent.com/nativ3ai/hermes-agent-camel/main/scripts/install.sh | bash

Then reload your shell and start Hermes:

source ~/.zshrc
hermes

Existing upstream Hermes checkout

Use the camelup installer repo to apply this fork or switch an existing checkout to the CaMeL build.

Runtime Modes

This fork now supports two explicit runtime behaviors from the same checkout:

Examples:

hermes --camel-guard on
hermes --camel-guard off
hermes chat --camel-guard monitor -q "Summarize the report"

Mode behavior:

This keeps one codebase and one install path while making it easy to compare guarded and legacy behavior side by side.

Developer setup

git clone https://github.com/nativ3ai/hermes-agent-camel.git
cd hermes-agent-camel
git submodule update --init mini-swe-agent
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv .venv --python 3.11
source .venv/bin/activate
uv pip install -e ".[all,dev]"
uv pip install -e "./mini-swe-agent"
pytest -q tests/agent/test_camel_guard.py tests/test_run_agent.py

Files Of Interest

Scope

This fork follows the trust-boundary principles described in the CaMeL paper, but applies them to Hermes' agent runtime rather than to the paper's original evaluation stack.

It is not presented as a full reproduction of the paper's AgentDojo benchmark matrix or as a claim of matching the paper's performance characteristics. The validation here is Hermes-specific and focused on runtime trust boundaries plus paper-aligned indirect injection scenarios.

Upstream Relation

This repository tracks Hermes Agent from Nous Research and carries the CaMeL integration as a focused runtime security extension.

For the original general-purpose Hermes README, see docs/upstream-readme.md.

Related Add-On

For payment flows that keep the same operator-intent boundary outside the model loop, see:

Hermes PayGuard is a separate optional plugin, not a core runtime patch. It adds:

That separation is intentional: the payment approval ledger and execution rails belong outside the core CaMeL runtime layer.