I kept running into the same problem with AI coding agents.
The agents were getting better, but every new session still felt like starting
from zero.
I would explain the repo again. Then my preferences again. Then the decisions we
already made. Then why we avoided some shortcut. Then the source material I
wanted the agent to use.
After doing that enough times, I started thinking less about "better prompts"
and more about memory.
Not memory as a magic model feature. Not memory hidden inside some hosted black
box. I wanted something boring and inspectable:
- raw sources on disk
- a compiled Markdown wiki
- agent tools that query the wiki
That became Link.
Link is a local-first Markdown wiki and MCP server for agent memory. The goal is
simple: keep useful context outside the chat window, but still make it visible,
editable, source-backed, and owned by the user.
This is one approach. It is not the only correct architecture for agent memory.
But it is the one I wanted as a developer using local agents every day.
The Problem: Agents Forget The Work Around The Work
When people talk about agent memory, the first examples are usually preferences:
- Use short answers.
- Prefer TypeScript.
- Use pytest.
Those matter, but they are only part of the problem.
The more expensive context is the work around the work:
- why a project is structured a certain way
- which decisions were already made
- which paths were tried and rejected
- what the user cares about
- what source material supports a claim
- what should be remembered globally versus only for one project
If an agent does not know that context, it can still produce code. But it often
produces the kind of code you have to steer back into place.
So I wanted memory that was not just "store a fact." I wanted memory with shape:
scope, source, review status, graph links, and a way to explain why it exists.
Why I Did Not Start With A Database
I like databases. Link even uses SQLite FTS for fast local search when it is
available.
But I did not want the primary storage format to be a database.
For personal agent memory, I wanted the source of truth to be boring files:
- easy to inspect
- easy to diff
- easy to back up
- easy to edit by hand
- usable in Obsidian or any Markdown editor
- safe to keep local
That is why Link uses Markdown as the storage layer.
A database can be an index. A model can help compile. But the memory itself
should not require trusting an opaque system.
The Pattern: Raw Sources Become A Wiki
The idea that pushed me in this direction was Andrej Karpathy's
LLM Wiki pattern.
The pattern, as I understand it, is:
raw sources -> structured wiki -> better retrieval
The raw source stays around. The agent turns it into clean pages. The pages link
to each other. Later, the agent can retrieve the compiled knowledge instead of
reading a pile of random notes again.
Link follows that pattern, but with a stronger product frame:
The wiki is the storage format.
The product is local memory for agents.
That distinction mattered a lot while building it. If Link is only a wiki, the
user still has to imagine how an agent should use it. If Link is agent memory,
then the system needs lifecycle, review, recall, explain, and validation.
The Storage Model
The on-disk layout is intentionally plain:
```
link/
  raw/
  wiki/
    sources/
    concepts/
    entities/
    explorations/
    comparisons/
    memories/
  _backlinks.json
  index.md
  log.md
```
Each layer has a job.
raw/ is for original material: notes, transcripts, articles, PDFs,
screenshots, project notes, session captures, anything the user wants the agent
to learn from.
wiki/sources/ is for normalized source pages. One raw file should map to one
source page, so the provenance stays clear.
wiki/concepts/, wiki/entities/, wiki/explorations/, and
wiki/comparisons/ are compiled knowledge pages.
wiki/memories/ is for durable memory: preferences, decisions, project facts,
and notes the user explicitly wants future agents to recall.
This separation is important. A raw source is not automatically a memory. A
source can support many pages. A memory should be tighter and more intentional.
For example, one meeting note might become:
- a source page for the meeting
- a concept page for the feature
- a memory saying the project chose option B
- backlinks between the project, people, feature, and decision
That is much more useful than dumping the original meeting note into every
future prompt.
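The `_backlinks.json` file in the layout above is a derived index, not hand-written. Link's actual link syntax and index format are its own; as an illustrative sketch only, here is one way backlinks could be derived from `[[wiki-link]]` style references (the link syntax and dedup rule here are assumptions, not Link's implementation):

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches [[target]] and [[target|label]] style wiki links (assumed syntax).
WIKI_LINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")

def build_backlinks(wiki_dir: str) -> dict[str, list[str]]:
    """Scan every Markdown page and invert its outgoing wiki links."""
    backlinks: dict[str, list[str]] = defaultdict(list)
    for page in sorted(Path(wiki_dir).rglob("*.md")):
        source = page.stem
        for match in WIKI_LINK.finditer(page.read_text(encoding="utf-8")):
            target = match.group(1).strip()
            if target != source and source not in backlinks[target]:
                backlinks[target].append(source)
    return dict(backlinks)
```

The useful property is that the inverted index can always be rebuilt from the pages, so it is safe to treat as a cache rather than as a second source of truth.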
"Remember This" Is Not Just A Write
One thing I underestimated early: memory needs a lifecycle.
The naive version is:
user says remember -> write text somewhere -> retrieve later
That is not enough.
Real memory needs something closer to:
capture -> propose -> approve -> recall -> explain -> update -> archive/forget
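That arrow chain can be made concrete as a small state machine. This is an illustrative sketch of the idea, not Link's internal code; the state names simply mirror the lifecycle above:

```python
from enum import Enum

class MemoryState(Enum):
    CAPTURED = "captured"
    PROPOSED = "proposed"
    APPROVED = "approved"
    ARCHIVED = "archived"

# Which transitions are legal; anything else is rejected, not silently applied.
TRANSITIONS = {
    MemoryState.CAPTURED: {MemoryState.PROPOSED},
    MemoryState.PROPOSED: {MemoryState.APPROVED, MemoryState.ARCHIVED},
    MemoryState.APPROVED: {MemoryState.APPROVED, MemoryState.ARCHIVED},  # updates keep it approved
    MemoryState.ARCHIVED: set(),  # forgetting is terminal
}

def advance(current: MemoryState, target: MemoryState) -> MemoryState:
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

The point of modeling it this way is that "remember" stops being a single write and becomes a step in a reviewable process.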
In Link, a memory is a Markdown page with metadata:
- type: preference, decision, project, fact, or note
- scope: user, project, or global
- project key when relevant
- source/provenance
- review status
- active or archived lifecycle state
- tags and aliases
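To make that concrete, a memory page might look roughly like this. The exact frontmatter keys here are illustrative, not Link's real schema; check the docs for the actual field names:

```markdown
---
type: decision
scope: project
project: link
source: wiki/sources/meeting-note.md
review_status: approved
state: active
tags: [release, mcp]
---

The project chose option B for the feature.
```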
This lets an agent ask better questions:
- Is this memory reviewed?
- Is it project-specific?
- Does it conflict with another memory?
- Should this update an existing memory instead of creating a duplicate?
- Why does Link know this?
That last question matters. If an agent is going to use memory, I want to be
able to ask where it came from and whether it is still safe to rely on.
MCP Is The Interface Agents Needed
Markdown is good for storage, but agents need tools.
That is where MCP became useful. Link ships an MCP server called
link-mcp, listed on the MCP Registry as:
io.github.gowtham0992/link
The important tools are workflow-shaped, not just search-shaped:
- link_status: check whether Link is ready
- starter_prompts: tell a new user what to ask next
- ingest_status: inspect raw files and guided ingest state
- query_link: build an answer-ready context packet
- memory_brief: prime an agent before work
- remember_memory: save explicit memory with duplicate/conflict checks
- propose_memories: generate candidates without writing them
- memory_inbox: show memories needing review
- explain_memory: audit why a memory exists
- validate_wiki: validate generated pages
- get_graph_summary: inspect a bounded graph neighborhood
That means Codex, Kiro, Claude, Cursor, VS Code, Copilot, or any MCP-capable
client can use the same local memory.
The memory does not belong to one chat app. It lives on disk.
The Query Packet Was The Key Product Detail
One easy mistake is to treat memory retrieval as "find everything relevant and
stuff it into context."
That sounds reasonable until the wiki grows.
So Link's main retrieval path is query_link. It returns a budgeted packet:
- relevant memories
- ranked wiki pages
- graph-neighborhood context
- why each item was selected
- estimated packet size
- has_more indicators
- follow-up actions
From the CLI:
```
link query "what should I know before changing the MCP tools?" --budget small
```
Through MCP:
```json
{
  "tool": "query_link",
  "arguments": {
    "query": "what should I know before changing the MCP tools?",
    "budget": "small",
    "project": "link"
  }
}
```
The agent gets enough context to move, but not so much that the answer becomes a
giant context dump. If it needs more, the packet includes follow-up actions.
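The budgeting step is the interesting part: rank candidates, cut at a size budget, and report what was left out instead of silently dropping it. A hedged sketch of that shape (the scoring, field names, and follow-up actions here are assumptions, not `query_link`'s actual internals):

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    title: str
    text: str
    score: float  # relevance from search / graph proximity

def build_packet(candidates: list[Candidate], budget_chars: int) -> dict:
    """Greedily fill a context packet up to a character budget, best-first."""
    picked, used = [], 0
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    for cand in ranked:
        if used + len(cand.text) > budget_chars:
            break
        picked.append({"title": cand.title, "text": cand.text,
                       "why": f"score={cand.score:.2f}"})
        used += len(cand.text)
    truncated = len(picked) < len(ranked)
    return {
        "items": picked,
        "estimated_size": used,
        "has_more": truncated,
        "follow_up": ["raise the budget", "ask a narrower question"] if truncated else [],
    }
```

Returning `has_more` and follow-up actions is what lets the agent decide to dig deeper, rather than forcing the retrieval layer to guess how much context is enough.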
That design decision made Link feel more like an agent memory layer and less
like a folder search wrapper.
Trust And Safety Are Part Of Memory
If memory is wrong, stale, or mysterious, it becomes dangerous.
So Link tries to keep the trust model practical:
- no telemetry
- no hosted backend
- no required cloud account
- raw files stay local
- generated pages are plain Markdown
- memory writes check duplicates and conflicts
- secret-looking values are detected before ingest/capture
- validation checks frontmatter, required sections, dead links, and backlinks
- backups exclude raw sources by default unless the user explicitly includes them
- the local web server binds to 127.0.0.1 and is not meant to be exposed publicly
That last point is intentional. Link's web UI is a local viewer, not a hosted
team app. It has no login system. That is fine for loopback personal use. It
would not be fine as a public server.
I would rather state that boundary clearly than pretend the tool is something it
is not.
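One item in the list above, detecting secret-looking values before ingest, is cheap to approximate with pattern matching. A minimal sketch (these patterns are illustrative examples, not Link's actual rule set):

```python
import re

# A few common credential shapes; real scanners use much larger rule sets.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                        # AWS access key id
    re.compile(r"ghp_[A-Za-z0-9]{36}"),                     # GitHub personal access token
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    re.compile(r"(?i)(?:api[_-]?key|secret|token)\s*[:=]\s*['\"]?[A-Za-z0-9_\-]{16,}"),
]

def looks_secret(text: str) -> list[str]:
    """Return matched fragments so the user can decide before ingesting."""
    hits = []
    for pattern in SECRET_PATTERNS:
        hits.extend(m.group(0) for m in pattern.finditer(text))
    return hits
```

The design choice that matters is surfacing the match to the user rather than silently redacting: a memory system should never quietly rewrite what it was given.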
Performance: Good Enough For Personal Memory, Not Magic
The current version is fast for the scale I am targeting right now: personal and
project memory.
Link uses:
- SQLite FTS-backed search when available
- token-index fallback when SQLite FTS is unavailable
- persistent parsed-page cache
- incremental cache reuse
- bounded graph summaries
- large-wiki smoke tests
- link benchmark for local performance checks
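The FTS-with-fallback idea can be sketched with the stdlib sqlite3 module; whether FTS5 is compiled in varies by Python build, which is exactly why a fallback path exists. This is an illustrative sketch, not Link's code:

```python
import sqlite3

def make_index(pages: dict[str, str]):
    """Try SQLite FTS5; fall back to a plain in-memory token index."""
    try:
        db = sqlite3.connect(":memory:")
        db.execute("CREATE VIRTUAL TABLE page USING fts5(name, body)")
        db.executemany("INSERT INTO page VALUES (?, ?)", pages.items())

        def search(term: str) -> list[str]:
            rows = db.execute(
                "SELECT name FROM page WHERE page MATCH ? ORDER BY rank", (term,))
            return [r[0] for r in rows]
    except sqlite3.OperationalError:
        # FTS5 unavailable: build a {token -> page names} map by hand.
        index: dict[str, set[str]] = {}
        for name, body in pages.items():
            for token in body.lower().split():
                index.setdefault(token, set()).add(name)

        def search(term: str) -> list[str]:
            return sorted(index.get(term.lower(), set()))
    return search
```

Both paths expose the same `search` callable, so the rest of the system does not need to care which index is underneath.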
The test suite includes a synthetic 1,000-page wiki smoke test. On my machine,
interactive operations are fast at that size.
But I do not want to overclaim. Link is not designed today for a 100,000-page
enterprise corpus. Some indexes are still materialized in memory. For local
personal/project use, that tradeoff is fine. For much larger wikis, I would want
more persistent and lazy indexing.
What Trying It Looks Like
On macOS:
```
brew install gowtham0992/link/link
link demo
link serve link-demo
```
Then open:
http://127.0.0.1:3000
From source:
```
git clone https://github.com/gowtham0992/link.git
cd link
python3 link.py demo
python3 link.py serve link-demo
```
The demo creates a finished local wiki with raw sources, wiki pages, one starter
memory, backlinks, graph data, and query packets.
After that, the CLI loop looks like:
```
link ingest-status
link brief "working on Link release"
link query "what should I know before changing the graph view?" --budget small
link remember "Use short release notes for public updates." --type preference --scope project --project link
link validate
```
Inside an MCP-enabled agent, it can be more natural:
is Link ready?
brief me from Link before we continue
ingest raw/notes.md into Link
remember that I prefer short release notes
query Link for the release process
what does Link remember about local personal memory?
The user should not have to memorize all the paths. The agent should learn the
workflow and use the tools.
What I Learned
The main lesson: agent memory is less about storing text and more about
discipline.
It is easy to save something.
It is harder to know:
- should this be remembered?
- who approved it?
- what source supports it?
- is it project-specific?
- is it stale?
- does it conflict with something older?
- can the agent explain why it knows this?
- can the user archive or delete it later?
Those questions pushed Link toward simple primitives: files, frontmatter,
backlinks, validation, review state, and MCP tools.
That may not be the flashiest design. But I think boring is underrated here.
If an AI agent is going to remember things about me and my work, I want that
memory to be visible, source-backed, correctable, and mine.
If you try it, I'd love feedback on the memory workflow more than anything else: what should be remembered, what should stay as source-backed wiki knowledge, and where the review step feels too heavy.
Links
- GitHub: https://github.com/gowtham0992/link
- Docs: https://gowtham0992.github.io/link/
- MCP Registry: https://registry.modelcontextprotocol.io/?q=io.github.gowtham0992%2Flink
- PyPI: https://pypi.org/project/link-mcp/
- Karpathy's LLM Wiki gist: https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f