EXPLAIN and Optimization Tutorial
EXPLAIN and Optimization Tutorial
Available since: v3.6.0 (base EXPLAIN), expanded in v3.12.0 (Top-K, storage-level OFFSET pushdown), refined through v3.19.1.
Build: default — all features in this tutorial run on the standard heliosdb-nano binary.
Surface: PostgreSQL-compatible EXPLAIN plus HeliosDB extensions (STORAGE, AI, WHY_NOT, INDEXES, STATISTICS).
UVP
PostgreSQL EXPLAIN tells you a tree. HeliosDB Nano EXPLAIN tells you the tree and which storage features fired (bloom filters, zone maps, RocksDB prefix seek, ART index point-lookup), why optimisations weren’t applied (WHY_NOT), and what indexes you should add (INDEXES). All four output formats — TEXT, JSON, XML, YAML — are wire-compatible with PostgreSQL clients, so existing tooling (DataGrip, pgcli, dbeaver) renders them without changes. One EXPLAIN, three layers visible: logical plan, physical operators, storage engine.
When the plan rewrites — Top-K instead of full sort, storage-level OFFSET pushdown instead of materialise-and-skip — EXPLAIN ANALYZE shows you the actual rows touched, not the row count after the operator. That’s the difference between a 30 µs query and a 30 ms one on the same SQL.
Prerequisites
- HeliosDB Nano v3.6+ (v3.12+ for Top-K and storage-level OFFSET pushdown)
- A PostgreSQL client
- ~20 minutes
1. The EXPLAIN Surface
-- Core PostgreSQL formsEXPLAIN SELECT * FROM users;EXPLAIN ANALYZE SELECT * FROM users WHERE id = 1;EXPLAIN (ANALYZE, VERBOSE) SELECT * FROM users;EXPLAIN (FORMAT JSON) SELECT * FROM users;EXPLAIN (FORMAT XML) SELECT * FROM users;EXPLAIN (FORMAT YAML) SELECT * FROM users;
-- HeliosDB extensionsEXPLAIN (STORAGE) SELECT * FROM orders;EXPLAIN (INDEXES) SELECT * FROM orders WHERE status = 'open';EXPLAIN (WHY_NOT) SELECT * FROM orders WHERE LOWER(name) = 'a';EXPLAIN (STATISTICS) SELECT * FROM orders;EXPLAIN (AI) SELECT * FROM orders; -- natural-language explainerEXPLAIN (ANALYZE, STORAGE, WHY_NOT) SELECT * FROM orders;The full option set lives in src/sql/explain_options.rs:
| Option | What it does |
|---|---|
ANALYZE | Execute the query and report actual rows + timing |
VERBOSE | Output column lists, source tables, full expression trees |
FORMAT {TEXT, JSON, XML, YAML, TREE} | Output encoding |
COSTS | Show cost estimates (default ON) |
BUFFERS | Storage I/O counters (requires ANALYZE) |
TIMING | Per-operator timing (default ON when ANALYZE is set) |
SUMMARY | Summary block at end |
STORAGE | Bloom filter usage, zone maps, compression codec, prefix seek |
AI | Natural-language explanation of the plan |
WHY_NOT | Why the planner did NOT use a faster path (e.g., function in WHERE blocks index) |
INDEXES | Used vs unused indexes; recommendations |
STATISTICS | Row counts, NDV, histogram availability |
2. Setting Up a Workload
CREATE TABLE users ( id SERIAL PRIMARY KEY, email TEXT UNIQUE, name TEXT, status TEXT, created_at TIMESTAMP DEFAULT now());
CREATE TABLE orders ( id SERIAL PRIMARY KEY, user_id INTEGER REFERENCES users(id), total NUMERIC(10,2), status TEXT, created_at TIMESTAMP DEFAULT now());
INSERT INTO users (email, name, status)SELECT 'u' || g || '@x.com', 'User ' || g, CASE WHEN g % 4 = 0 THEN 'inactive' ELSE 'active' ENDFROM generate_series(1, 10000) AS g;
INSERT INTO orders (user_id, total, status)SELECT (g % 10000) + 1, (random() * 1000)::numeric(10,2), CASE WHEN g % 5 = 0 THEN 'cancelled' ELSE 'open' ENDFROM generate_series(1, 50000) AS g;3. Plan #1 — Simple SELECT
EXPLAIN SELECT * FROM users WHERE id = 42; QUERY PLAN------------------------------------------------------------- Index Scan using users_pkey on users (cost=0.15..8.17 rows=1 width=64) Index Cond: (id = 42)The PK lookup is served by the ART index (zero-copy pk_index_lookup, see src/storage/). EXPLAIN ANALYZE confirms it took microseconds, not milliseconds:
EXPLAIN ANALYZE SELECT * FROM users WHERE id = 42; QUERY PLAN------------------------------------------------------------- Index Scan using users_pkey on users (cost=0.15..8.17 rows=1 width=64) Index Cond: (id = 42) Planning Time: 0.123 ms Execution Time: 0.041 msWhen the index is not used
EXPLAIN (WHY_NOT) SELECT * FROM users WHERE LOWER(email) = 'u42@x.com';WHY_NOT reports: function call on the indexed column blocks the index. Fix: WHERE email = LOWER('u42@x.com') or a functional index.
4. Plan #2 — JOIN
EXPLAIN SELECT u.name, o.totalFROM users uJOIN orders o ON o.user_id = u.idWHERE u.status = 'active' AND o.total > 500; QUERY PLAN--------------------------------------------------------------------- Hash Join (cost=312.80..1410.50 rows=3000 width=72) Hash Cond: (o.user_id = u.id) -> Seq Scan on orders o (cost=0.00..930.00 rows=10000 width=12) Filter: (total > 500) -> Hash (cost=270.00..270.00 rows=7500 width=64) -> Seq Scan on users u (cost=0.00..270.00 rows=7500 width=64) Filter: (status = 'active')Hash join is the planner’s default for inner equi-joins; the build side is the smaller cardinality. Note JoinKey uses cross-type numeric coercion (Int4 ↔ Int8) since v3.8.2.
5. Plan #3 — GROUP BY
EXPLAIN SELECT user_id, COUNT(*) AS n, SUM(total) AS revenueFROM ordersWHERE status = 'open'GROUP BY user_idHAVING COUNT(*) > 5; QUERY PLAN--------------------------------------------------------------------- HashAggregate (cost=1230.00..1280.00 rows=2000 width=24) Group Key: user_id Filter: (count(*) > 5) -> Seq Scan on orders (cost=0.00..930.00 rows=40000 width=12) Filter: (status = 'open')6. Plan #4 — HNSW Vector Search
CREATE TABLE chunks ( id SERIAL PRIMARY KEY, embedding VECTOR(384));CREATE INDEX chunks_emb_hnsw ON chunks USING hnsw (embedding vector_cosine_ops);
EXPLAIN SELECT idFROM chunksORDER BY embedding <=> '[0.1, 0.2, ...]'::vectorLIMIT 10; QUERY PLAN------------------------------------------------------------- Limit (cost=0.00..1.50 rows=10 width=8) -> Index Scan using chunks_emb_hnsw on chunks (cost=0.00..15000.00 rows=N width=8) Order By: (embedding <=> '[…]'::vector)The HNSW operator is invoked directly — no full sort, the heap is bounded at k = LIMIT.
7. Plan #5 — BM25 Full-Text
EXPLAIN SELECT id, title, ts_rank_cd(to_tsvector(body), to_tsquery('helios')) AS rankFROM articlesWHERE to_tsvector(body) @@ to_tsquery('helios')ORDER BY rank DESCLIMIT 10; QUERY PLAN------------------------------------------------------------- Limit (cost=… rows=10) -> Sort Sort Key: rank DESC -> Filter: (to_tsvector(body) @@ to_tsquery('helios')) -> Seq Scan on articlesUSING gin is accepted DDL but doesn’t surface as an Index Scan here — see FULL_TEXT_SEARCH_TUTORIAL for the rationale. The Sort collapses to Top-K automatically (next section).
8. Plan #6 — Hybrid (BM25 + Vector)
EXPLAIN SELECT id, text, 0.7 * (1.0 - (embedding <=> $1::vector)) + 0.3 * ts_rank_cd(to_tsvector(text), plainto_tsquery($2)) AS scoreFROM chunksWHERE tenant_id = $3 AND (embedding <=> $1::vector) < 0.8ORDER BY score DESCLIMIT 10;The plan composes: tenant filter → vector proximity cut (HNSW operator) → score expression → Top-K. The HNSW prefilter wrapper (added v3.19.0, #190) over-fetches k * over_fetch_multiplier, applies the row-level prefilter, then re-scores.
9. Top-K Rewrite (v3.12.0)
When the plan is Limit(Sort(…)) or Limit(Project(Sort(…))), TopKOperator (src/sql/executor/topk.rs) replaces the full sort with a bounded max-heap of size k = limit + offset:
EXPLAIN ANALYZESELECT * FROM orders ORDER BY total DESC LIMIT 5; QUERY PLAN------------------------------------------------------------- Limit (Top-K) (cost=… rows=5 width=24) -> TopK k=5 Sort Key: total DESC -> Seq Scan on orders (rows=50000)Complexity drops from O(N log N) to O(N log k) and memory from O(N) to O(k). The rewrite kicks in automatically whenever LIMIT has a concrete bound — no hint required.
10. Storage-Level OFFSET Pushdown (v3.12.0)
EXPLAIN ANALYZESELECT * FROM orders ORDER BY id LIMIT 5 OFFSET 990;Pre-3.12, this materialised 995+ rows then dropped 990. Since v3.12,
storage::EmbeddedStorage::scan_table_with_offset_limit skips offset rows at the RocksDB iterator level — no bincode decode, no decrypt, no dictionary/CAS resolve. The plan reads:
QUERY PLAN------------------------------------------------------------- Limit (cost=… rows=5 width=24) -> Seq Scan (offset-pushdown) on orders actual rows=5 OFFSET-skipped: 990actual rows=5 (not 995) is the diagnostic: the physical scan only deserialised 5 rows. For deeper pagination, prefer keyset — see KEYSET_PAGINATION_QUICKREF.
11. EXPLAIN (FORMAT JSON | XML | YAML)
EXPLAIN (FORMAT JSON) SELECT * FROM users WHERE id = 42;[ { "Plan": { "Node Type": "Index Scan", "Relation Name": "users", "Index Name": "users_pkey", "Index Cond": "(id = 42)", "Total Cost": 8.17, "Plan Rows": 1 } }]JSON is what you parse from a CI gate; YAML is what you diff in code review; XML is what legacy tooling expects. All three round-trip identically.
12. Storage Layer Visibility
EXPLAIN (STORAGE) SELECT * FROM orders WHERE id = 42; QUERY PLAN------------------------------------------------------------- Index Scan using orders_pkey on orders Index Cond: (id = 42) Storage: ART index point-lookup : yes (zero-copy) RocksDB bloom filter : 14-bit (whole-key) Block size : 16 KB Prefix extractor : table-id (8 bytes) Compression : LZ4 Full-text search (tsvector/@@/ts_rank_cd) : yes (unstemmed, no phrase)This reveals the engine choices — useful for “why is my query 10× slower in prod than dev” investigations.
13. Index Recommender
EXPLAIN (INDEXES) SELECT * FROM orders WHERE status = 'open'; INDEX ANALYSIS Used: orders_pkey (no — sequential scan was cheaper) Recommended: CREATE INDEX orders_status_idx ON orders (status) Estimated speedup: 12.4× Reason: WHERE on low-NDV column, 80% rows excluded ROI score: 78The recommender (src/sql/index_recommender.rs) calculates a ROI score from estimated speedup, affected query count, and storage cost. Apply, re-EXPLAIN, verify.
14. Workflow — Use Plans to Drive Index Design
The end-to-end loop:
- Run
EXPLAIN (ANALYZE, INDEXES, WHY_NOT) <slow query>. - If
WHY_NOTreports a blocker (function call, type mismatch), rewrite the predicate. - If
INDEXESrecommends a CREATE, apply it. - Re-EXPLAIN — confirm
Index Scanappears. - Confirm
Execution Timedropped.
Concrete example — a query that started as WHERE LOWER(email) = $1:
-- Step 1EXPLAIN (ANALYZE, WHY_NOT) SELECT * FROM users WHERE LOWER(email) = 'u42@x.com';-- WHY_NOT: function call on indexed column blocks index
-- Step 2: rewrite (emails are stored canonical-cased)EXPLAIN (ANALYZE) SELECT * FROM users WHERE email = 'u42@x.com';-- Index Scan using users_email_key
-- Or: keep the LOWER() and add a functional indexCREATE INDEX users_email_lower ON users (LOWER(email));15. Keyset Pagination Plans (v3.12+)
EXPLAIN ANALYZESELECT id, created_at, nameFROM usersWHERE (created_at, id) < ($1, $2)ORDER BY created_at DESC, id DESCLIMIT 20;The row-constructor predicate is recognised and evaluated lexicographically (new LogicalExpr::Tuple, v3.12). Plan reads:
Limit (Top-K) k=20 -> Sort Sort Key: created_at DESC, id DESC -> Filter: ((created_at, id) < (…, …)) -> Seq Scan on usersIf (created_at, id) is a btree index, the filter pushes through to Index Scan — see KEYSET_PAGINATION_QUICKREF.
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
EXPLAIN ANALYZE shows actual rows=N >> plan rows | Stats stale | Run ANALYZE <table> to refresh |
Top-K doesn’t fire | LIMIT with non-constant bound | Use a literal or bound parameter |
| OFFSET pushdown not visible | Mixed predicates block the iterator | Move filters into the table scan or use keyset |
Index Scan not chosen on small table | Planner picks Seq Scan for tiny relations | Expected — Seq Scan wins under ~1000 rows |
WHY_NOT says “function on column” | LOWER/UPPER/CAST in WHERE | Rewrite predicate or add functional index |
Where Next
- KEYSET_PAGINATION_QUICKREF — the pagination shape EXPLAIN tells you to use.
- WINDOW_FUNCTIONS_QUICKREF — see how window plans look.
- FULL_TEXT_SEARCH_TUTORIAL — BM25 plans and hybrid retrieval.
- VECTOR_SEARCH_TUTORIAL — HNSW operator surface.
CHANGELOG references: v3.6.0 (fast paths + bench wins vs PG13), v3.12.0 (Top-K + storage-level OFFSET pushdown + row-constructor compare), v3.19.0 (centrality-biased HNSW prefilter wrapper), v3.19.1 (current verified release).