A single 61 MB binary indexes your codebase into a local KB and exposes 11 LSP-shaped + Graph-RAG tools to any MCP-aware agent. No ports, no auth, no Docker — just init, wire it into .mcp.json, ask questions.
Install, index, wire it in. Claude Code now has 11 tools that operate over your code as graph + vectors + symbols, not as raw text.
# 1. Install (one-time)
curl -L https://github.com/dimensigon/heliosdb-codekb-mcp/releases/download/v0.1.0/heliosdb-codekb-mcp-linux-x86_64 \
-o /usr/local/bin/heliosdb-codekb-mcp && chmod +x /usr/local/bin/heliosdb-codekb-mcp
# 2. Index the repo (~30 s for ~1k files; resumable)
heliosdb-codekb-mcp init --source ~/code/myrepo --mode co-located --ingest
# 3. Wire into Claude Code (.mcp.json in the repo root)
{
"mcpServers": {
"helios": {
"command": "/usr/local/bin/heliosdb-codekb-mcp",
"args": ["serve", "--source", "/home/me/code/myrepo"]
}
}
}
The MCP binary owns transport and per-source KB ergonomics. The tool surface comes from the embedded engine. So the catalog is keyed on the engine version the binary embeds.
| Tool | Purpose |
|---|---|
helios_lsp_definition | Go-to-definition for a symbol at (file, line, col) |
helios_lsp_references | Find all references with caller/callee context |
helios_lsp_hover | Type signature + doc-comment for a symbol |
helios_lsp_document_symbols | Outline of a single file (functions, types, modules) |
helios_lsp_call_hierarchy | Up/down call graph from a symbol, N hops |
helios_lsp_rename_preview | Dry-run rename with diff across all references |
helios_lsp_rename_apply | Commit a rename atomically across the KB |
helios_lsp_body_diff | Body-text diff for a symbol between two KB snapshots |
helios_lsp_references_diff | Reference-set diff between two snapshots (find new callers) |
helios_ast_diff | Tree-sitter AST diff between two file versions |
| Tool | Purpose |
|---|---|
helios_graphrag_search | Hybrid retrieval: BM25 + (optional) vector rerank + N-hop graph expansion |
Coming in v0.2.0 (engine target: Nano v3.23.x): helios_lsp_workspace_symbols, helios_lsp_implementations, helios_graphrag_explain (citation chain). Tracked in the milestone.
| Mode | KB location | Use when |
|---|---|---|
co-located | <source>/.helios-kb (auto-.gitignored) | Single repo; KB travels with the code. |
global | ${XDG_DATA_HOME}/helios-kb/<slug> | Many independent projects on one machine. |
hybrid | Explicit --kb <PATH>; multiple sources register against one KB | Multi-repo aggregate (e.g. monorepo + sister repos). |
Picked at init --ingest or ingest time. Pilot fixture: 666 files / 18 k symbols / 115 k refs (Helios Nano source).
| Tier | Flag | Wall time | What lights up |
|---|---|---|---|
| fast (default) | (none) | 26 s | BM25 + hop-distance ranking on helios_graphrag_search |
| quality (blocking) | --with-embeddings | 3 m 15 s | Adds in-process FastEmbedder body vectors → paraphrase queries hit |
| background-quality | --background-quality | 26 s parent + ~2 m 50 s detached child | User-wait stays at 26 s; paraphrase quality lifts when child finishes |
Recommended for repos > ~1k files: --background-quality.
| Format | Backend |
|---|---|
.rs .py .ts .tsx .js .go .sql | tree-sitter (engine code-graph) |
.md .txt | tree-sitter Markdown / generic text |
.pdf (born-digital) | pdf-extract |
.docx | docx-rs |
.xlsx | calamine |
--features docling — requires a docling-serve endpoint| Format | Backend |
|---|---|
| Scanned / OCR PDFs | Docling layout + OCR |
| Scientific PDFs | Docling table & formula extraction |
.pptx / complex Office | Docling Office pipeline |
Audio (.mp3 .wav .m4a) | Docling Whisper pipeline |
Images (.png .jpg .tiff) | Docling OCR |
The same binary serves all of them; transport is a flag on serve.
| Agent | Config file | Transport |
|---|---|---|
| Claude Code | .mcp.json (project) or ~/.claude/mcp_servers.json (user) | stdio |
| Codex CLI | ~/.codex/config.toml [mcp_servers.helios] | stdio |
| Cursor | Settings → MCP Servers (UI) | HTTP (--http 127.0.0.1:8765) |
| Continue | ~/.continue/config.json | HTTP |
| Aider | aider --mcp-server <cmd> | stdio |
HeliosDB-Nano is a generic database. Baking one specific MCP transport into its binary would adapt the engine to one consumer. This crate keeps the boundary clean.
Publishable to crates.io, consumable by any downstream. No PostgreSQL wire protocol baggage, no MCP coupling, no transport opinions.
MCP packaging, transport choice, per-source KB-location ergonomics, ingest scheduling. You get the engine's tool surface without dragging in vector store DDL or distributed coordination — none of which a coding agent needs.
ingest writes <kb_dir>/.ingest-state.json at each phase transition (walk → code_index → graph_rag). Killed mid-flight (Ctrl-C, OOM, reboot)? Next ingest reads the file at startup and skips already-completed phases. Per-file resume inside code_index is the engine's content-hash gate. Surfaced by status --source X as ingest resume : interrupted at phase = … until cleared.
Two layers, both honored during the walk:
@generated content-marker scan — first 4 KiB (Facebook / Google / Bazel / Go convention).<root>/.gitattributes linguist-generated globs — patterns flagged with linguist-generated, linguist-generated=true, or linguist-generated=set. Long-tail coverage of *.pb.rs, codegen/**, vendored bundles.Apache 2.0, single-file install, embeds HeliosDB-Nano v3.22.2. Linux x86_64 today, macOS x86_64 next.