There are now half a dozen serious AI memory servers in the wild, and the marketing pages do not help you pick one. Each project highlights what it is good at and skips what it is not.

This post is the honest version. I built one of these (SynaBun), so I am biased — and I will flag that bias every time it shows up. If something in here is wrong, message me on Discord and I will fix it.

The four servers

I am comparing four projects that share the same core job — give an AI coding agent persistent memory across sessions — but make wildly different architecture choices.

  • SynaBun — local-first, MCP-native, 106 tools (memory + browser + social + whiteboard + loops), Apache 2.0
  • Mem0 — cloud-first with self-hosted option, MCP-compatible, 6 tools, Apache 2.0
  • Letta (formerly MemGPT) — cloud-first, agent framework with structured memory blocks, Apache 2.0
  • Zep — cloud-first, knowledge graph layer, Apache 2.0 core + commercial Cloud

There are also OpenMemory (a Mem0-compatible local-first variant) and Memory Keeper, but they overlap heavily with the four above.

Quick decision matrix

| You want… | Pick |
| --- | --- |
| One MCP install for everything (memory + browser + social + loops) | SynaBun |
| A managed cloud service with a UI dashboard | Mem0 |
| An agent framework where memory is a first-class object | Letta |
| Knowledge-graph reasoning over chat history | Zep |
| Lightweight self-hosted memory with Mem0 compatibility | OpenMemory |

The rest of this post is the long version.

Feature matrix

| Feature | SynaBun | Mem0 | Letta | Zep |
| --- | --- | --- | --- | --- |
| MCP tools | 106 | 6 | 4 (via wrapper) | 5 (via wrapper) |
| Memory architecture | Vector + categories + importance | Vector + facts | Structured blocks (core/archival/recall) | Knowledge graph + vector |
| Default embedding | all-MiniLM-L6-v2 (local) | OpenAI text-embedding-3-small | OpenAI / configurable | OpenAI / Cohere |
| Storage | SQLite | Qdrant + Postgres | Postgres | Postgres + Neo4j |
| Local mode | Yes (default) | Yes (self-hosted) | Yes (self-hosted) | Yes (self-hosted, complex) |
| Browser automation | 38 tools | None | None | None |
| Social media tools | 6 platforms | None | None | None |
| Visual whiteboard | Yes | None | None | None |
| Autonomous loops | Yes (cron) | None | Agent loop | None |
| 3D memory viz | Yes | None | None | None |
| Claude Code hooks | 7 hooks | None | None | None |
| License | Apache 2.0 | Apache 2.0 | Apache 2.0 | Apache 2.0 + commercial |
| MCP-native | Yes (built for MCP) | Yes (added MCP layer) | Wrapper | Wrapper |

SynaBun

Pitch: the kitchen-sink MCP server. Memory is one of 106 tools, not the only one.

Architecture: Node MCP server + SQLite + sqlite-vec + local embeddings. Adds a 3D React UI ("Neural Interface") for browsing memory, an Express REST API, Claude Code hooks, and a Schedules Studio for cron-driven loops. Browser tools wrap Playwright.

Embedding: all-MiniLM-L6-v2 (local, 22MB, 384 dims). Detailed reasoning here.
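To make that concrete, here is a minimal sketch of the same recall path in Python. SynaBun itself is a Node server, so this is the architecture's shape (SQLite + a sqlite-vec virtual table + local MiniLM embeddings), not its actual code; the table and column names are invented.

```python
import sqlite3
import sqlite_vec  # pip install sqlite-vec
from sqlite_vec import serialize_float32
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

model = SentenceTransformer("all-MiniLM-L6-v2")  # local, 384-dim

db = sqlite3.connect("memories.db")
db.enable_load_extension(True)
sqlite_vec.load(db)
db.enable_load_extension(False)

# One metadata table and one vec0 virtual table sharing rowids.
# `category` and `importance` mirror SynaBun's memory model; names are made up.
db.execute("""CREATE TABLE IF NOT EXISTS memories (
    id INTEGER PRIMARY KEY, text TEXT NOT NULL,
    category TEXT, importance REAL DEFAULT 0.5)""")
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS vec_memories USING vec0(embedding float[384])")

def remember(text: str, category: str | None = None, importance: float = 0.5) -> None:
    cur = db.execute("INSERT INTO memories (text, category, importance) VALUES (?, ?, ?)",
                     (text, category, importance))
    db.execute("INSERT INTO vec_memories (rowid, embedding) VALUES (?, ?)",
               (cur.lastrowid, serialize_float32(model.encode(text).tolist())))
    db.commit()

def recall(query: str, k: int = 3) -> list[tuple[str, float]]:
    return db.execute(
        """SELECT m.text, v.distance
           FROM vec_memories v JOIN memories m ON m.id = v.rowid
           WHERE v.embedding MATCH ? ORDER BY v.distance LIMIT ?""",
        (serialize_float32(model.encode(query).tolist()), k)).fetchall()

remember("We agreed to use TanStack Query for data fetching", category="decisions")
print(recall("which data-fetching library did we pick?"))
```

No network round-trip and no second service, which is where the latency numbers later in this post come from.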

Strengths:

  • One install gets you memory, browser, social extraction, autonomous loops, Discord bots, Leonardo.ai image/video generation, and 30 Google Search Console tools (no API key required for any of them).
  • Fully local-first. No cloud account required to use any feature.
  • First-class support for Claude Code, Codex, OpenCode, Gemini.
  • Categorical memory model (parent/child categories, project tags, importance scoring) maps cleanly to how developers actually organize knowledge.

Weaknesses:

  • More surface area means more to learn. If you only want memory, this is overkill.
  • The browser + social tools require a headed Chrome instance, which adds setup friction on headless servers.
  • No managed cloud option. You run it.

Bias disclosure: I built this. Every other server in this post is something I evaluated before deciding to write my own.

When to pick: you are a single developer who wants one MCP server that does everything, and you are happy to self-host.

Mem0

Pitch: the AI memory category leader. Both a hosted SaaS and an open-source library.

Architecture: Python library + REST API + optional cloud. Stores facts (extracted from conversations) in Qdrant + structured metadata in Postgres. The "intelligent layer" tries to dedupe and update memories instead of just appending.

Embedding: OpenAI text-embedding-3-small by default. Swappable via the config.
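For a feel of the API, here is a hedged sketch using the open-source mem0 package. Defaults and return shapes have shifted between versions, so treat the exact keys as assumptions.

```python
from mem0 import Memory  # pip install mem0ai; the defaults expect an OpenAI key

m = Memory()  # OSS defaults: OpenAI embeddings + a local vector store

# Mem0 extracts facts from conversation turns instead of storing raw text;
# the "intelligent layer" may merge this with earlier related facts.
m.add(
    [{"role": "user", "content": "We migrated data fetching to TanStack Query."}],
    user_id="dev-1",
)

# Recall is a vector search over the extracted facts.
results = m.search("what do we use for data fetching?", user_id="dev-1")
for hit in results["results"]:
    print(hit["memory"])
```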

Strengths:

  • Largest user base in the AI memory category. Lots of integrations, lots of documentation.
  • "Intelligent" memory updates — if a new fact contradicts an old one, Mem0 tries to merge.
  • Hosted cloud means zero setup if you do not want to self-host.
  • Well-tested integrations with LangGraph, LlamaIndex, AutoGen.

Weaknesses:

  • Self-hosted is functional but requires running Qdrant + Postgres + the Mem0 API. Not as plug-and-play as SQLite.
  • Cloud version requires sending your data to Mem0 servers. For some teams this is a deal-breaker.
  • 6 MCP tools — pure memory layer, no adjacent tooling.
  • The "intelligent" memory dedupe sometimes erases context you wanted to keep. Tunable but takes work.

When to pick: you want a managed cloud memory service with a polished dashboard and a large community.

Letta (formerly MemGPT)

Pitch: memory as a structured object inside an agent framework. Built around the "context manager" pattern from the original MemGPT paper.

Architecture: Python agent framework with three memory tiers — core (always in context), archival (vector-searched), and recall (chat history). The agent itself decides when to promote/demote memories between tiers via tool calls.
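The tier mechanics are easy to picture in a few lines. This is not Letta's API, just an illustration of the pattern: core stays pinned in every prompt, archival is searched on demand, and the agent promotes facts via tool calls.

```python
from dataclasses import dataclass, field

@dataclass
class TieredMemory:
    core: dict[str, str] = field(default_factory=dict)  # always in context
    archival: list[str] = field(default_factory=list)   # vector-searched on demand
    recall: list[dict] = field(default_factory=list)    # raw chat history

    def prompt_header(self) -> str:
        # Only core memory is unconditionally spent from the context budget.
        return "\n".join(f"[{label}] {value}" for label, value in self.core.items())

    def promote(self, fact: str, label: str) -> None:
        # The agent calls something like this as a tool when it decides a
        # fact deserves a permanent slot in the prompt.
        self.core[label] = fact

mem = TieredMemory()
mem.archival.append("User prefers pytest over unittest")
mem.promote("User is Dana, timezone UTC+2", label="persona")
print(mem.prompt_header())  # -> [persona] User is Dana, timezone UTC+2
```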

Embedding: OpenAI default, configurable.

Strengths:

  • The structured memory model is genuinely well-designed. The split into "core memory" for persona/preferences, "archival" for long-term facts, and "recall" for chat history is clean.
  • Self-hostable Docker stack. Works offline if you swap the LLM provider.
  • The original MemGPT paper has two years of academic citations behind it. Real research lineage.
  • Built-in agent SDK if you want to ship a customer-facing AI product.

Weaknesses:

  • Heavier than the others. Postgres + Letta API + the agent runtime.
  • Designed for full agent applications, not for "drop a memory layer into my Claude Code session".
  • MCP support is via wrapper — not native.
  • The structured memory blocks require schema decisions upfront that can be hard to migrate later.

When to pick: you are building a customer-facing AI product where memory needs structured tiers, and you want an agent framework that comes with that built-in.

Zep

Pitch: knowledge graph + vector memory. Tracks entities, relationships, and facts over time, not just unstructured text.

Architecture: Postgres + Neo4j (or pgvector for the lightweight version). Conversations get parsed into a graph of (entity, relation, entity) triples. Recall combines graph traversal with vector search.
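To see what the graph layer buys you, here is a toy sketch (made-up entities, not Zep's API) of multi-hop recall over triples. The point: "dana" reaches "tanstack_query" through "acme", a connection that pure vector search over isolated facts tends to miss.

```python
from collections import defaultdict

triples = [
    ("dana", "works_at", "acme"),
    ("acme", "uses", "tanstack_query"),
    ("dana", "prefers", "pytest"),
]

# Adjacency index for graph traversal.
graph = defaultdict(list)
for subj, rel, obj in triples:
    graph[subj].append((rel, obj))

def graph_recall(entity: str, hops: int = 2) -> list[tuple[str, str, str]]:
    # Breadth-first expansion: every triple reachable from `entity` in `hops`.
    seen, frontier, found = {entity}, [entity], []
    for _ in range(hops):
        nxt = []
        for node in frontier:
            for rel, obj in graph[node]:
                found.append((node, rel, obj))
                if obj not in seen:
                    seen.add(obj)
                    nxt.append(obj)
        frontier = nxt
    return found

print(graph_recall("dana"))
# [('dana', 'works_at', 'acme'), ('dana', 'prefers', 'pytest'),
#  ('acme', 'uses', 'tanstack_query')]
```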

Embedding: OpenAI / Cohere, configurable.

Strengths:

  • The knowledge graph layer is genuinely powerful for use cases that involve tracking relationships over time (CRM-style assistants, customer support agents).
  • Temporal awareness — can answer "what did the user say about X last month vs this week".
  • Open-source core with a polished managed Cloud option.

Weaknesses:

  • The graph layer adds real complexity. For pure dev memory ("remember that we agreed to use TanStack Query"), it is overkill.
  • Self-hosting requires running Neo4j or being comfortable with pgvector edge cases.
  • MCP support is via wrapper.
  • Cloud-first by design — local mode exists but is not the happy path.

When to pick: your AI agent needs to reason about relationships between entities over time. CRM, support, sales-copilot use cases.

Latency comparison

This is where the local vs cloud split matters most. I ran 100 recall queries against each server with a 10k-item corpus on the same M1 MacBook (16GB RAM, broadband internet).
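The harness is nothing fancy. A loop like the one below is enough to reproduce this kind of measurement; `recall` stands in for each server's client call and is hypothetical.

```python
import statistics
import time

def benchmark(recall, queries: list[str]) -> tuple[float, float]:
    """Time each recall call in milliseconds, then report p50/p95."""
    samples = []
    for q in queries:
        t0 = time.perf_counter()
        recall(q)  # hypothetical per-server client call
        samples.append((time.perf_counter() - t0) * 1000)
    cuts = statistics.quantiles(samples, n=100)  # 99 percentile cut points
    return cuts[49], cuts[94]  # p50, p95
```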

| Server | p50 latency | p95 latency | Setup mode |
| --- | --- | --- | --- |
| SynaBun | 14ms | 28ms | Local SQLite |
| Mem0 (cloud) | 280ms | 520ms | Hosted SaaS |
| Mem0 (self-hosted) | 95ms | 180ms | Local Qdrant + Postgres |
| Letta (self-hosted) | 110ms | 240ms | Local Postgres |
| Zep (self-hosted, pgvector) | 130ms | 290ms | Local Postgres + pgvector |
| Zep (cloud) | 240ms | 480ms | Hosted SaaS |

Three takeaways:

  1. SQLite + local embeddings is 5-10x faster than the next-fastest option.
  2. Cloud APIs all sit in the 200-500ms range. Network is the bottleneck.
  3. Self-hosted versions of cloud-first products are still slower than SQLite-native because of the IPC + service overhead.

For interactive coding sessions, the latency difference is the difference between "memory feels invisible" and "memory feels like it has a loading state".

License + commercial concerns

All four are Apache 2.0 at their core. The differences:

  • SynaBun — Apache 2.0 throughout. No commercial fork.
  • Mem0 — Apache 2.0 + a managed cloud service. The cloud is paid.
  • Letta — Apache 2.0 + Letta Cloud (commercial).
  • Zep — Apache 2.0 community edition + Zep Cloud (commercial). The community edition has fewer features than Cloud.

Nothing on the list is a rug-pull risk. All four have track records and active maintainers. But over the long haul, "fully OSS with no commercial fork" matters to some teams.

What about OpenMemory?

OpenMemory is a Mem0-compatible local-first variant. Smaller scope (5 MCP tools, pure memory). If you like Mem0 but want to run it without the cloud and without Qdrant, OpenMemory is the right pick. It runs on Chroma + SQLite.

I did not include it as a top-level option because architecturally it is "Mem0 minus the cloud parts" — same memory model, smaller footprint.

Picking by use case

Solo developer using Claude Code daily: SynaBun. Latency wins, the extras (browser, loops, hooks) come for free, no cloud account.

Team building a customer-facing AI assistant: Letta. The agent framework saves you wiring; the structured memory blocks are battle-tested.

Product team integrating with multiple LLM frameworks (LangGraph, LlamaIndex, AutoGen): Mem0. The integration coverage is unmatched.

CRM / support agent that needs entity reasoning over time: Zep. Knowledge graph is the unique value.

Lightweight Mem0-compatible local memory: OpenMemory.

What none of these do well

The category has gaps. None of these handle:

  • Cross-developer memory. Sharing a memory store across a team without re-implementing access control yourself.
  • Memory garbage collection. Old memories pile up. None of the four have a strong strategy beyond "set TTLs" or "let the user delete".
  • Memory search across documents + code. Most are tuned for chat-style memories. Searching across a code corpus + chat memory in one query is still rough.
  • Retrieval explanation. "Why did this memory rank high?" — none of them surface this in a way that helps you tune retrieval.

If any of those matter to your use case, you will end up writing custom code regardless of which server you pick.
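Memory garbage collection, for instance, tends to end up as a hand-rolled sweep. A naive sketch, assuming a SQLite store with hypothetical `importance` and `last_recalled_at` columns:

```python
import sqlite3
import time

def sweep(db: sqlite3.Connection, ttl_days: int = 90, min_importance: float = 0.5) -> int:
    """Drop stale, low-importance memories. Schema and thresholds are hypothetical."""
    cutoff = time.time() - ttl_days * 86400
    cur = db.execute(
        "DELETE FROM memories WHERE importance < ? AND last_recalled_at < ?",
        (min_importance, cutoff),
    )
    db.commit()
    return cur.rowcount  # memories collected this sweep
```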

Closing

The "AI memory" category is going to fragment hard over the next year. Cloud-first products are going to get more polished. Local-first products are going to get faster and more capable. And a lot of teams will end up running both — local for dev memory, cloud for production memory.

For dev work specifically, the case for local is overwhelming: lower latency, lower cost, full privacy. SynaBun is my answer to that case. The other three are good answers to different questions.

Related reading: