hermes atlas · apr 2026 · 99 repos · hermes v0.10.0

AxDSan/mnemosyne

The Zero-Dependency, Sub-Millisecond AI Memory System for Hermes Agents

★ 29 · Python · MIT · updated 2026-04-23

Mnemosyne is a local-first memory system that provides private, low-latency context for Hermes Agents. It is backed by SQLite with native vector search and FTS5 full-text search, eliminating the need for external databases or API keys. A three-tier BEAM architecture manages hot working memory, long-term episodic memory, and a temporary scratchpad, and automatic consolidation summarizes old working memories into long-term storage during sleep cycles.

  • Sub-millisecond read and write latency using local SQLite storage
  • Hybrid search combining vector similarity, keyword ranking, and importance scores
  • BEAM architecture for tiered working, episodic, and scratchpad memory
Full README from GitHub

Mnemosyne

Native, zero-cloud memory for AI agents. SQLite-backed. Sub-millisecond. Fully private.


Mnemosyne is a local-first memory system for the Hermes Agent framework. It stores conversations, preferences, and knowledge in SQLite with native vector search (sqlite-vec) and full-text search (FTS5) — no external databases, no API keys, no network calls.


Quick Start

Option A: Install from PyPI (recommended)

pip install mnemosyne-memory

Note: The package name on PyPI is mnemosyne-memory.

With all optional features (dense retrieval + local LLM consolidation):

pip install mnemosyne-memory[all]

⚠️ Ubuntu 24.04 / Debian 12 users: If you get error: externally-managed-environment, your system Python is PEP 668-protected. Use a virtual environment:

python3 -m venv .venv
source .venv/bin/activate
pip install mnemosyne-memory[all]

Make sure to activate the venv every time you run Hermes, or install Hermes itself inside the same venv.

Option B: Install from source (for development)

git clone https://github.com/AxDSan/mnemosyne.git
cd mnemosyne
pip install -e ".[all,dev]"

Option C: Hermes MemoryProvider only (no pip needed)

If you only need Mnemosyne as a Hermes memory backend and want to skip pip entirely:

curl -sSL https://raw.githubusercontent.com/AxDSan/mnemosyne/main/deploy_hermes_provider.sh | bash

This symlinks the provider into ~/.hermes/plugins/mnemosyne and adds the repo to sys.path at runtime. No virtual environment required — works out of the box on Ubuntu 24.04.

Register with Hermes

# 1. Install the plugin
python -m mnemosyne.install

# 2. Activate as your memory provider
hermes memory setup
# → Select "mnemosyne" and press Enter

Verify:

hermes memory status       # Should show "Provider: mnemosyne"
hermes mnemosyne stats     # Shows working + episodic memory counts

Note: The hermes memory setup picker shows "Built-in only" every time it opens, even after you have chosen Mnemosyne. This is normal Hermes UI behavior — the picker doesn't display your saved choice, but the choice is still active. Just select Mnemosyne and press Enter.


What Makes It Different

|              | Mnemosyne                     | Cloud alternatives              |
|--------------|-------------------------------|---------------------------------|
| Latency      | < 1ms                         | 10-100ms                        |
| Dependencies | Python stdlib + optional ONNX | External APIs, auth, rate limits |
| Privacy      | 100% local                    | Data leaves your machine        |
| Cost         | Free                          | Freemium / per-call             |
| Setup        | pip install -e .              | API keys, accounts, config      |

Key capabilities:

  • BEAM architecture — Three tiers: hot working memory, long-term episodic memory, temporary scratchpad
  • Hybrid search — 50% vector similarity + 30% FTS5 rank + 20% importance, all inside SQLite
  • Automatic consolidation — Old working memories are summarized and moved to episodic memory via mnemosyne_sleep()
  • Temporal triples — Time-aware knowledge graph with automatic invalidation
  • Export / import — Move your entire memory database to a new machine with one JSON file
  • Cross-session scope — remember(..., scope="global") makes facts visible everywhere
  • Configurable compression — float32 (default), int8 (4x smaller), or bit (32x smaller) vectors
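The hybrid-search weighting above (50% vector similarity, 30% FTS5 rank, 20% importance) amounts to a linear blend of three signals. The sketch below is an illustrative reconstruction, not Mnemosyne's actual code; the function name and the assumption that each signal is normalized to [0, 1] are mine.

```python
def hybrid_score(vector_sim: float, fts_rank: float, importance: float) -> float:
    """Blend the three ranking signals, each assumed normalized to [0, 1].

    Weights follow the 50/30/20 split described in the README.
    """
    return 0.5 * vector_sim + 0.3 * fts_rank + 0.2 * importance

# A memory that matches well semantically and by keyword, with max importance:
score = hybrid_score(vector_sim=0.9, fts_rank=0.8, importance=1.0)  # → 0.89
```

In a blend like this, importance acts as a tie-breaker between memories that match a query equally well.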

Benchmarks

All numbers measured on CPU with sqlite-vec + FTS5 enabled.

LongMemEval (ICLR 2025)

| System                       | Score               | Notes                                          |
|------------------------------|---------------------|------------------------------------------------|
| Mnemosyne (dense)            | 98.9% Recall@All@5  | Oracle subset, 100 instances, bge-small-en-v1.5 |
| Mempalace                    | 96.6% Recall@5      | AAAK + Palace architecture                     |
| Mastra Observational Memory  | 84.23% (gpt-4o)     | Three-date model                               |
| Full-context GPT-4o baseline | ~60.2%              | No memory system                               |

Latency vs. Cloud Alternatives

| Operation  | Honcho | Zep   | MemGPT | Mnemosyne | Speedup |
|------------|--------|-------|--------|-----------|---------|
| Write      | 45ms   | 85ms  | 120ms  | 0.81ms    | 56x     |
| Read       | 38ms   | 62ms  | 95ms   | 0.076ms   | 500x    |
| Search     | 52ms   | 78ms  | 140ms  | 1.2ms     | 43x     |
| Cold Start | 500ms  | 800ms | 1200ms | 0ms       | Instant |

BEAM Architecture Scaling

Write throughput:

| Operation                         | Count         | Total | Avg     |
|-----------------------------------|---------------|-------|---------|
| Working memory writes             | 500           | 8.7s  | 17.4 ms |
| Episodic inserts (with embedding) | 500           | 10.7s | 21.3 ms |
| Sleep consolidation               | 300 old items | 33 ms |         |

Hybrid recall scaling (query latency stays flat as corpus grows):

| Corpus Size | Query        | Avg Latency | p95    |
|-------------|--------------|-------------|--------|
| 100         | "concept 42" | 5.1 ms      | 6.9 ms |
| 500         | "concept 42" | 5.0 ms      | 5.7 ms |
| 1,000       | "concept 42" | 5.3 ms      | 6.5 ms |
| 2,000       | "concept 42" | 7.0 ms      | 8.6 ms |

Working memory recall scaling (FTS5 fast path):

| WM Size | Query        | Avg Latency | p95    |
|---------|--------------|-------------|--------|
| 1,000   | "concept 42" | 2.4 ms      | 3.1 ms |
| 5,000   | "domain 7"   | 3.2 ms      | 3.8 ms |
| 10,000  | "concept 42" | 6.4 ms      | 7.2 ms |

Installation

Prerequisites

  • Python 3.9+
  • Hermes Agent (for plugin integration)

From PyPI (recommended for users)

pip install mnemosyne-memory

# With all extras (dense retrieval + local LLM consolidation)
pip install mnemosyne-memory[all]

From source (recommended for contributors)

git clone https://github.com/AxDSan/mnemosyne.git
cd mnemosyne
pip install -e ".[all,dev]"
python -m mnemosyne.install

⚠️ Ubuntu 24.04 / Debian 12 users: If pip install fails with externally-managed-environment, see the Quick Start → Option A note about using a virtual environment.

Optional dependencies

# Dense retrieval (required for semantic search and the 98.9% LongMemEval score)
pip install fastembed>=0.3.0

# Local LLM consolidation (sleep cycle summarization)
pip install ctransformers>=0.2.27 huggingface-hub>=0.20

Note: Without fastembed, Mnemosyne falls back to keyword-only retrieval. It still works, but you won't get competitive semantic search or the benchmark scores above.

Uninstall

python -m mnemosyne.install --uninstall

Updating

If you installed from PyPI:

pip install --upgrade mnemosyne-memory

If you installed from source:

cd mnemosyne
git pull
pip install -e ".[all,dev]"

Always restart Hermes after updating so plugin changes take effect:

hermes gateway restart

If the update includes database schema changes, run the migration helper:

python scripts/migrate_from_legacy.py

See UPDATING.md for detailed troubleshooting and rollback instructions.


Usage

CLI

# Show memory statistics (current session only)
hermes mnemosyne stats

# Show memory statistics across ALL sessions
hermes mnemosyne stats --global

# Search memories
hermes mnemosyne inspect "dark mode preferences"

# Run consolidation (compress old working memory into episodic summaries)
hermes mnemosyne sleep

# Export all memories to a JSON file
hermes mnemosyne export --output mnemosyne_backup.json

# Import memories from a JSON file
hermes mnemosyne import --input mnemosyne_backup.json

# Clear scratchpad
hermes mnemosyne clear

Python API

from mnemosyne import remember, recall

# Store a fact
remember(
    content="User prefers dark mode interfaces",
    importance=0.9,
    source="preference"
)

# Store a global preference (visible in every session)
remember(
    content="User email is abdi.moya@gmail.com",
    importance=0.95,
    source="preference",
    scope="global"
)

# Store a temporary credential with expiry
remember(
    content="API key: sk-abc123",
    importance=0.8,
    source="credential",
    valid_until="2026-12-31T00:00:00"
)

# Search memories
results = recall("interface preferences", top_k=3)

# Temporal knowledge graph
from mnemosyne.core.triples import TripleStore
kg = TripleStore()
kg.add("Maya", "assigned_to", "auth-migration", valid_from="2026-01-15")
kg.query("Maya", as_of="2026-02-01")

Advanced: BEAM direct access

from mnemosyne.core.beam import BeamMemory

beam = BeamMemory(session_id="my_session")

# Working memory (auto-injected into prompts)
beam.remember("Important context", importance=0.9)

# Episodic memory (long-term, searchable)
beam.consolidate_to_episodic(
    summary="User likes Neovim",
    source_wm_ids=["wm1"],
    importance=0.8
)

# Scratchpad (temporary reasoning)
beam.scratchpad_write("todo: fix auth bug")

# Search both tiers
results = beam.recall("editor preferences", top_k=5)

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    HERMES AGENT                              │
│                                                              │
│  ┌─────────────┐     ┌──────────────┐     ┌─────────────┐  │
│  │   pre_llm   │────▶│  Mnemosyne   │────▶│   SQLite    │  │
│  │    hook     │     │    BEAM      │     │             │  │
│  └─────────────┘     └──────────────┘     │ working_mem │  │
│         ▲                                  │ episodic_mem│  │
│         │                                  │ vec_episodes│  │
│         └──────── Auto-injected context ───│ fts_episodes│  │
│                                            │ scratchpad  │  │
│                                            │ triples     │  │
│                                            └─────────────┘  │
│                                                              │
│  No HTTP. No cloud. 100% local.                              │
└─────────────────────────────────────────────────────────────┘

BEAM (Bilevel Episodic-Associative Memory):

  • working_memory — Hot context, auto-injected before LLM calls, TTL-based eviction
  • episodic_memory — Long-term storage with sqlite-vec + FTS5 hybrid search
  • scratchpad — Temporary agent reasoning workspace
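The TTL-based eviction mentioned for the working-memory tier can be sketched with plain SQL. The table and column names below are illustrative assumptions, not Mnemosyne's real schema.

```python
import sqlite3
import time

TTL_HOURS = 24  # matches the documented MNEMOSYNE_WM_TTL_HOURS default

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE working_memory "
    "(id INTEGER PRIMARY KEY, content TEXT, created_at REAL)"
)

now = time.time()
con.execute("INSERT INTO working_memory (content, created_at) VALUES (?, ?)",
            ("fresh", now))
con.execute("INSERT INTO working_memory (content, created_at) VALUES (?, ?)",
            ("stale", now - 48 * 3600))  # 48h old, past the TTL

# Evict everything older than the TTL in a single statement.
con.execute("DELETE FROM working_memory WHERE created_at < ?",
            (time.time() - TTL_HOURS * 3600,))
remaining = [row[0] for row in con.execute("SELECT content FROM working_memory")]
# remaining == ["fresh"]
```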

Why SQLite for Hermes?

SQLite is already in your stack. Hermes uses it for session persistence. Mnemosyne extends that same file — no new dependencies, no Docker containers, no connection pooling.
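Because the store is an ordinary SQLite file, memories can also be inspected with plain SQL. A sketch against an in-memory database: the fts_episodes table name follows the architecture diagram above, but the single-column schema is an assumption, and FTS5 must be compiled into your SQLite build (it is in CPython's bundled SQLite on most platforms).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE fts_episodes USING fts5(summary)")
con.executemany(
    "INSERT INTO fts_episodes (summary) VALUES (?)",
    [("User prefers dark mode",), ("Auth migration assigned to Maya",)],
)

# FTS5 MATCH with implicit AND between terms; `rank` orders by relevance.
rows = con.execute(
    "SELECT summary FROM fts_episodes WHERE fts_episodes MATCH ? ORDER BY rank",
    ("dark mode",),
).fetchall()
```

The same SELECT ... WHERE MATCH style works against the real database file via the sqlite3 CLI, which is what makes the "Query Language" row in the comparison below possible.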

| Feature            | Honcho              | Zep                 | Mnemosyne              |
|--------------------|---------------------|---------------------|------------------------|
| Deployment         | Docker + PostgreSQL | Docker + PostgreSQL | pip install            |
| Query Language     | REST API            | REST API            | SELECT ... WHERE MATCH |
| Vector Store       | pgvector            | pgvector            | sqlite-vec             |
| Text Search        | Separate API        | Separate API        | Built-in FTS5          |
| Auth Required      | Yes (Supabase)      | Yes                 | No                     |
| Offline Mode       | No                  | No                  | Yes                    |
| Cold Start Latency | 500-800ms           | 800ms+              | 0ms                    |

Backup, Export & Migration

Mnemosyne stores everything in a single SQLite file at ~/.hermes/mnemosyne/data/mnemosyne.db.

# Simple backup
cp ~/.hermes/mnemosyne/data/mnemosyne.db ~/backups/mnemosyne_$(date +%Y%m%d).db

# Export to JSON (portable across machines)
hermes mnemosyne export --output mnemosyne_backup.json

# Import on a new machine
hermes mnemosyne import --input mnemosyne_backup.json

Environment Variables

| Variable                   | Default                  | Description                               |
|----------------------------|--------------------------|-------------------------------------------|
| MNEMOSYNE_DATA_DIR         | ~/.hermes/mnemosyne/data | Database directory                        |
| MNEMOSYNE_VEC_TYPE         | float32                  | Vector compression: float32, int8, or bit |
| MNEMOSYNE_WM_MAX_ITEMS     | 10000                    | Working memory item limit                 |
| MNEMOSYNE_WM_TTL_HOURS     | 24                       | Working memory TTL in hours               |
| MNEMOSYNE_RECENCY_HALFLIFE | 168                      | Recency decay half-life in hours (1 week) |
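These variables could be read as follows. This is an illustrative sketch of the documented defaults, not Mnemosyne's actual config loader; os.path.expanduser resolves the "~" in the data-directory path.

```python
import os

# Assumed config-loading sketch using the documented defaults.
config = {
    "data_dir": os.path.expanduser(
        os.environ.get("MNEMOSYNE_DATA_DIR", "~/.hermes/mnemosyne/data")
    ),
    "vec_type": os.environ.get("MNEMOSYNE_VEC_TYPE", "float32"),
    "wm_max_items": int(os.environ.get("MNEMOSYNE_WM_MAX_ITEMS", "10000")),
    "wm_ttl_hours": float(os.environ.get("MNEMOSYNE_WM_TTL_HOURS", "24")),
    "recency_halflife_hours": float(
        os.environ.get("MNEMOSYNE_RECENCY_HALFLIFE", "168")
    ),
}
```

Reading every value through os.environ.get with a string default keeps the precedence simple: environment wins, otherwise the documented default applies.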

Testing

# Run tests locally
python -m pytest tests/test_beam.py -v

# Run benchmarks
python tests/benchmark_beam_working_memory.py

All changes are validated through GitHub Actions CI on Python 3.9–3.12 before merging.


Releases

Mnemosyne publishes GitHub Releases and PyPI packages automatically on every v* tag. See CONTRIBUTING.md for the release process.


Contributing

Contributions are welcome. Areas of active interest:

  • Encrypted cloud sync (optional, user-controlled)
  • Browser extension for web context capture
  • Additional embedding models
  • Multi-language support

See CONTRIBUTING.md for guidelines.


License

MIT License — See LICENSE

Copyright (c) 2026 Abdias J


Acknowledgments

  • Hermes Agent Framework — The ecosystem Mnemosyne was built for
  • Honcho — For defining the stateful memory space
  • Mempalace — For proving local-first memory can compete on benchmarks
  • SQLite — The world's most deployed database

"The faintest ink is more powerful than the strongest memory." — Hermes Trismegistus