MukundaKatta/hermes-agentmemory

Name: hermes-agentmemory
Author: MukundaKatta

Pull-model episodic memory plugin for Hermes Agent with real deletion (no persistent summaries that haunt future recall) and a trace.jsonl audit log of every memory operation. Community fix for issue #6715. MIT.



overview

hermes-agentmemory is a standalone episodic memory plugin for Hermes Agent that prioritizes data integrity and transparency. It uses a pull-model architecture in which every memory operation is synchronous, so deletions take effect immediately and leave no persistent summaries or other derived artifacts behind. Each prefetch is recorded in a local JSONL audit log, so users can verify exactly which past events were injected into the model's prompt. Summaries are generated on demand by Anthropic's Claude models, which avoids the background processing overhead common in other memory backends.

Ensures immediate and complete deletion of episodic memory records
Maintains a trace.jsonl audit log of all memory operations
Provides synchronous, on-demand summarization using Anthropic Claude models

full readme from github

hermes-agentmemory

A drop-in Hermes Agent memory plugin built on agentmemory.

Pull-model episodic memory with real deletes and an audit trace. The point: Hermes Agent is good at remembering. This plugin gives it a memory layer that takes deletion seriously.

Why another memory plugin?

Hermes ships with several first-class memory backends (Mem0, Honcho, Hindsight, etc.). They consolidate in the background, which is the dominant pattern in agentic memory right now. That makes recall cheap and fast at the cost of two things:

Deletes are not always real. Once an episode is baked into a derived summary, removing the original event leaves the summary intact.
Memory injection is opaque. Background prefetch happens off the hot path; the user does not see exactly which past events were used until something goes wrong.

agentmemory flips both. It does no background work: every write is synchronous, deletes are immediate and complete, and every prefetch writes a trace record (event_ids + summary + prompt) to $HERMES_HOME/agentmemory/trace.jsonl so the user can audit what entered the prompt.
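To make the trace format concrete, here is a minimal sketch of appending one prefetch record as a JSON line. The function name `append_trace` and the exact key names are assumptions for illustration; only the three fields (event_ids, summary, prompt) and the JSONL format come from the description above, and the plugin's real writer may differ.

```python
import json
import tempfile
from pathlib import Path


def append_trace(path, event_ids, summary, prompt):
    """Append one prefetch record as a single JSON line (hypothetical helper)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Key names are assumed; the README only lists which fields are recorded.
    record = {"event_ids": list(event_ids), "summary": summary, "prompt": prompt}
    # Append mode keeps earlier records intact, so the file is a running audit log.
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


# Demo against a throwaway directory rather than the real $HERMES_HOME.
demo_log = Path(tempfile.mkdtemp()) / "agentmemory" / "trace.jsonl"
append_trace(demo_log, ["ev-1", "ev-7"], "User prefers dark mode.", "what theme do I like?")
append_trace(demo_log, ["ev-3"], "User is based in Austin.", "where am I based?")
lines = demo_log.read_text(encoding="utf-8").splitlines()
print(len(lines), json.loads(lines[0])["event_ids"])
```

Writing one self-contained JSON object per line is what makes the log easy to follow with `tail -f` and to parse one record at a time.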

Install

This is a standalone memory plugin. Hermes Agent stopped accepting new built-in memory providers — see CONTRIBUTING.md. The official path is to drop the plugin into the user-plugins directory that Hermes discovers automatically.

# 1. Clone into the user-plugins dir Hermes scans on startup
git clone https://github.com/MukundaKatta/hermes-agentmemory \
  "${HERMES_HOME:-$HOME/.hermes}/plugins/agentmemory"

# 2. Install the one Python dep used by the summarizer
pip install anthropic

# 3. Activate
hermes config set memory.provider agentmemory

# 4. Set the Anthropic key the summarizer will use
export ANTHROPIC_API_KEY=...

Hermes's discover_memory_providers() scans $HERMES_HOME/plugins/<name>/__init__.py for any class subclassing MemoryProvider, so no extra registration step is needed.

Configuration

Environment variables:

Variable	Default	Purpose
`ANTHROPIC_API_KEY`	(required)	key for the on-demand summarizer
`AGENTMEMORY_MODEL`	`claude-sonnet-4-5`	Claude model id
`AGENTMEMORY_TOP_K`	`5`	events to retrieve per prefetch
`AGENTMEMORY_MAX_TOKENS`	`300`	summary token budget
`AGENTMEMORY_TRACE_LOG`	`$HERMES_HOME/agentmemory/trace.jsonl`	where to append audit records
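As an illustration of how these variables and defaults could be resolved, here is a small sketch. `AgentMemoryConfig` and `load_config` are hypothetical names, not the plugin's actual code; the variable names and defaults are taken from the table above, and the `~/.hermes` fallback for `HERMES_HOME` mirrors the install snippet.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class AgentMemoryConfig:
    """Hypothetical container for the settings listed in the table."""
    api_key: str
    model: str
    top_k: int
    max_tokens: int
    trace_log: Path


def load_config(env=None):
    """Resolve settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY")
    if not api_key:
        raise RuntimeError("ANTHROPIC_API_KEY is required by the summarizer")
    # Fall back to ~/.hermes when HERMES_HOME is unset, as in the install step.
    hermes_home = Path(env.get("HERMES_HOME") or Path.home() / ".hermes")
    default_trace = hermes_home / "agentmemory" / "trace.jsonl"
    return AgentMemoryConfig(
        api_key=api_key,
        model=env.get("AGENTMEMORY_MODEL", "claude-sonnet-4-5"),
        top_k=int(env.get("AGENTMEMORY_TOP_K", "5")),
        max_tokens=int(env.get("AGENTMEMORY_MAX_TOKENS", "300")),
        trace_log=Path(env.get("AGENTMEMORY_TRACE_LOG", default_trace)),
    )


# Demo with an explicit mapping so the real environment is not touched.
cfg = load_config({
    "ANTHROPIC_API_KEY": "sk-test",
    "HERMES_HOME": "/tmp/hermes",
    "AGENTMEMORY_TOP_K": "8",
})
print(cfg.model, cfg.top_k, cfg.max_tokens, cfg.trace_log)
```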

Tools the agent can call

agentmemory_recall(query, top_k?) — surface the top matching past events plus the event ids used.
agentmemory_forget(session_id?, event_id?) — real delete. No tombstone, no derived artifact left behind.
agentmemory_drift() — rolling-window retrieval-quality state, useful when recall starts feeling stale.
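The recall and forget semantics can be sketched with a toy in-memory store. Everything here is an assumption made for illustration: the `Event` and `EpisodicStore` names, the word-overlap scoring, and the return shapes are not taken from the plugin. The point it shows is the pull-model property: because only raw events are stored and nothing derived is cached, removing an event removes it from every later recall.

```python
from dataclasses import dataclass


@dataclass
class Event:
    event_id: str
    session_id: str
    text: str


class EpisodicStore:
    """Toy pull-model store: only raw events are kept, nothing derived."""

    def __init__(self):
        self._events = {}

    def add(self, event):
        self._events[event.event_id] = event

    def recall(self, query, top_k=5):
        # Naive word-overlap ranking, a stand-in for whatever retrieval is really used.
        words = set(query.lower().split())
        scored = []
        for ev in self._events.values():
            overlap = len(words & set(ev.text.lower().split()))
            if overlap:
                scored.append((overlap, ev))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        hits = [ev for _, ev in scored[:top_k]]
        return {"event_ids": [ev.event_id for ev in hits],
                "events": [ev.text for ev in hits]}

    def forget(self, session_id=None, event_id=None):
        # Hard delete: matching events are removed outright, with no tombstone.
        doomed = [
            eid for eid, ev in self._events.items()
            if eid == event_id
            or (session_id is not None and ev.session_id == session_id)
        ]
        for eid in doomed:
            del self._events[eid]
        return len(doomed)


store = EpisodicStore()
store.add(Event("ev-1", "s1", "user likes dark mode"))
store.add(Event("ev-2", "s1", "user lives in austin"))
store.add(Event("ev-3", "s2", "user likes green tea"))

before = store.recall("user likes", top_k=2)
removed = store.forget(event_id="ev-1")
after = store.recall("user likes", top_k=2)
print(before["event_ids"], removed, after["event_ids"])
```

A consolidating backend would also have to find and rewrite every summary that mentioned the deleted event; here there is nothing else to clean up.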

Auditing what the model saw

tail -f ~/.hermes/agentmemory/trace.jsonl

Every prefetch produces one JSON line: intent, event_ids, summary, and the live drift snapshot.
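For an audit that goes beyond `tail`, the log can be read with a few lines of Python. This is a sketch under assumptions: the key names (`intent`, `event_ids`, `summary`, `drift`) are guessed from the field list above, and the helpers `iter_trace` and `event_usage` are hypothetical. It counts how often each event id was injected into a prompt.

```python
import json
from collections import Counter
from io import StringIO


def iter_trace(lines):
    """Yield one dict per JSONL line, skipping blank or unparseable lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a partially written final line
        if isinstance(rec, dict):
            yield rec


def event_usage(records):
    """Count how many prefetches referenced each event id."""
    counts = Counter()
    for rec in records:
        counts.update(rec.get("event_ids", []))
    return counts


# Sample data with assumed key names; the last line is deliberately truncated.
sample = StringIO(
    '{"intent": "theme", "event_ids": ["ev-1", "ev-7"], "summary": "Prefers dark mode.", "drift": {}}\n'
    '{"intent": "theme", "event_ids": ["ev-1"], "summary": "Prefers dark mode.", "drift": {}}\n'
    '{"intent": "trunc\n'
)
usage = event_usage(iter_trace(sample))
print(usage.most_common())
```

To run it on the real log, pass an open file handle for the trace path instead of the `StringIO` sample.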

Trade-off, honestly

The first turn of every new session pays a latency tax of 200 ms to 2 s for the on-demand summary because there is no background pre-warming. In exchange you get:

deletes that are real and immediate
no quality decay from a smaller summarizer model (the summarizer is the same Claude family the agent uses)
a trace file the user can audit without touching the agent
fewer than 600 lines of Python you can read end to end

For a self-hosted agent that markets itself as "the agent that grows with you", auditable memory is the part that lets growth stay reversible.

License

MIT.