Skip to content

EXPLAIN and Optimization Tutorial

EXPLAIN and Optimization Tutorial

Available since: v3.6.0 (base EXPLAIN), expanded in v3.12.0 (Top-K, storage-level OFFSET pushdown), refined through v3.19.1. Build: default — all features in this tutorial run on the standard heliosdb-nano binary. Surface: PostgreSQL-compatible EXPLAIN plus HeliosDB extensions (STORAGE, AI, WHY_NOT, INDEXES, STATISTICS).


UVP

PostgreSQL EXPLAIN tells you a tree. HeliosDB Nano EXPLAIN tells you the tree and which storage features fired (bloom filters, zone maps, RocksDB prefix seek, ART index point-lookup), why optimisations weren’t applied (WHY_NOT), and what indexes you should add (INDEXES). All four output formats — TEXT, JSON, XML, YAML — are wire-compatible with PostgreSQL clients, so existing tooling (DataGrip, pgcli, dbeaver) renders them without changes. One EXPLAIN, three layers visible: logical plan, physical operators, storage engine.

When the plan rewrites — Top-K instead of full sort, storage-level OFFSET pushdown instead of materialise-and-skip — EXPLAIN ANALYZE shows you the actual rows touched, not the row count after the operator. That’s the difference between a 30 µs query and a 30 ms one on the same SQL.


Prerequisites

  • HeliosDB Nano v3.6+ (v3.12+ for Top-K and storage-level OFFSET pushdown)
  • A PostgreSQL client
  • ~20 minutes

1. The EXPLAIN Surface

-- Core PostgreSQL forms
EXPLAIN SELECT * FROM users;
EXPLAIN ANALYZE SELECT * FROM users WHERE id = 1;
EXPLAIN (ANALYZE, VERBOSE) SELECT * FROM users;
EXPLAIN (FORMAT JSON) SELECT * FROM users;
EXPLAIN (FORMAT XML) SELECT * FROM users;
EXPLAIN (FORMAT YAML) SELECT * FROM users;
-- HeliosDB extensions
EXPLAIN (STORAGE) SELECT * FROM orders;
EXPLAIN (INDEXES) SELECT * FROM orders WHERE status = 'open';
EXPLAIN (WHY_NOT) SELECT * FROM orders WHERE LOWER(name) = 'a';
EXPLAIN (STATISTICS) SELECT * FROM orders;
EXPLAIN (AI) SELECT * FROM orders; -- natural-language explainer
EXPLAIN (ANALYZE, STORAGE, WHY_NOT) SELECT * FROM orders;

The full option set lives in src/sql/explain_options.rs:

OptionWhat it does
ANALYZEExecute the query and report actual rows + timing
VERBOSEOutput column lists, source tables, full expression trees
FORMAT {TEXT, JSON, XML, YAML, TREE}Output encoding
COSTSShow cost estimates (default ON)
BUFFERSStorage I/O counters (requires ANALYZE)
TIMINGPer-operator timing (default ON when ANALYZE is set)
SUMMARYSummary block at end
STORAGEBloom filter usage, zone maps, compression codec, prefix seek
AINatural-language explanation of the plan
WHY_NOTWhy the planner did NOT use a faster path (e.g., function in WHERE blocks index)
INDEXESUsed vs unused indexes; recommendations
STATISTICSRow counts, NDV, histogram availability

2. Setting Up a Workload

CREATE TABLE users (
id SERIAL PRIMARY KEY,
email TEXT UNIQUE,
name TEXT,
status TEXT,
created_at TIMESTAMP DEFAULT now()
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INTEGER REFERENCES users(id),
total NUMERIC(10,2),
status TEXT,
created_at TIMESTAMP DEFAULT now()
);
INSERT INTO users (email, name, status)
SELECT 'u' || g || '@x.com', 'User ' || g, CASE WHEN g % 4 = 0 THEN 'inactive' ELSE 'active' END
FROM generate_series(1, 10000) AS g;
INSERT INTO orders (user_id, total, status)
SELECT (g % 10000) + 1, (random() * 1000)::numeric(10,2),
CASE WHEN g % 5 = 0 THEN 'cancelled' ELSE 'open' END
FROM generate_series(1, 50000) AS g;

3. Plan #1 — Simple SELECT

EXPLAIN SELECT * FROM users WHERE id = 42;
QUERY PLAN
-------------------------------------------------------------
Index Scan using users_pkey on users (cost=0.15..8.17 rows=1 width=64)
Index Cond: (id = 42)

The PK lookup is served by the ART index (zero-copy pk_index_lookup, see src/storage/). EXPLAIN ANALYZE confirms it took microseconds, not milliseconds:

EXPLAIN ANALYZE SELECT * FROM users WHERE id = 42;
QUERY PLAN
-------------------------------------------------------------
Index Scan using users_pkey on users (cost=0.15..8.17 rows=1 width=64)
Index Cond: (id = 42)
Planning Time: 0.123 ms
Execution Time: 0.041 ms

When the index is not used

EXPLAIN (WHY_NOT) SELECT * FROM users WHERE LOWER(email) = 'u42@x.com';

WHY_NOT reports: function call on the indexed column blocks the index. Fix: WHERE email = LOWER('u42@x.com') or a functional index.


4. Plan #2 — JOIN

EXPLAIN SELECT u.name, o.total
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.status = 'active' AND o.total > 500;
QUERY PLAN
---------------------------------------------------------------------
Hash Join (cost=312.80..1410.50 rows=3000 width=72)
Hash Cond: (o.user_id = u.id)
-> Seq Scan on orders o (cost=0.00..930.00 rows=10000 width=12)
Filter: (total > 500)
-> Hash (cost=270.00..270.00 rows=7500 width=64)
-> Seq Scan on users u (cost=0.00..270.00 rows=7500 width=64)
Filter: (status = 'active')

Hash join is the planner’s default for inner equi-joins; the build side is the smaller cardinality. Note JoinKey uses cross-type numeric coercion (Int4 ↔ Int8) since v3.8.2.


5. Plan #3 — GROUP BY

EXPLAIN SELECT user_id, COUNT(*) AS n, SUM(total) AS revenue
FROM orders
WHERE status = 'open'
GROUP BY user_id
HAVING COUNT(*) > 5;
QUERY PLAN
---------------------------------------------------------------------
HashAggregate (cost=1230.00..1280.00 rows=2000 width=24)
Group Key: user_id
Filter: (count(*) > 5)
-> Seq Scan on orders (cost=0.00..930.00 rows=40000 width=12)
Filter: (status = 'open')

CREATE TABLE chunks (
id SERIAL PRIMARY KEY,
embedding VECTOR(384)
);
CREATE INDEX chunks_emb_hnsw ON chunks USING hnsw (embedding vector_cosine_ops);
EXPLAIN SELECT id
FROM chunks
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;
QUERY PLAN
-------------------------------------------------------------
Limit (cost=0.00..1.50 rows=10 width=8)
-> Index Scan using chunks_emb_hnsw on chunks (cost=0.00..15000.00 rows=N width=8)
Order By: (embedding <=> '[…]'::vector)

The HNSW operator is invoked directly — no full sort, the heap is bounded at k = LIMIT.


7. Plan #5 — BM25 Full-Text

EXPLAIN SELECT id, title,
ts_rank_cd(to_tsvector(body), to_tsquery('helios')) AS rank
FROM articles
WHERE to_tsvector(body) @@ to_tsquery('helios')
ORDER BY rank DESC
LIMIT 10;
QUERY PLAN
-------------------------------------------------------------
Limit (cost=… rows=10)
-> Sort
Sort Key: rank DESC
-> Filter: (to_tsvector(body) @@ to_tsquery('helios'))
-> Seq Scan on articles

USING gin is accepted DDL but doesn’t surface as an Index Scan here — see FULL_TEXT_SEARCH_TUTORIAL for the rationale. The Sort collapses to Top-K automatically (next section).


8. Plan #6 — Hybrid (BM25 + Vector)

EXPLAIN SELECT id, text,
0.7 * (1.0 - (embedding <=> $1::vector))
+ 0.3 * ts_rank_cd(to_tsvector(text), plainto_tsquery($2)) AS score
FROM chunks
WHERE tenant_id = $3
AND (embedding <=> $1::vector) < 0.8
ORDER BY score DESC
LIMIT 10;

The plan composes: tenant filter → vector proximity cut (HNSW operator) → score expression → Top-K. The HNSW prefilter wrapper (added v3.19.0, #190) over-fetches k * over_fetch_multiplier, applies the row-level prefilter, then re-scores.


9. Top-K Rewrite (v3.12.0)

When the plan is Limit(Sort(…)) or Limit(Project(Sort(…))), TopKOperator (src/sql/executor/topk.rs) replaces the full sort with a bounded max-heap of size k = limit + offset:

EXPLAIN ANALYZE
SELECT * FROM orders ORDER BY total DESC LIMIT 5;
QUERY PLAN
-------------------------------------------------------------
Limit (Top-K) (cost=… rows=5 width=24)
-> TopK k=5
Sort Key: total DESC
-> Seq Scan on orders (rows=50000)

Complexity drops from O(N log N) to O(N log k) and memory from O(N) to O(k). The rewrite kicks in automatically whenever LIMIT has a concrete bound — no hint required.


10. Storage-Level OFFSET Pushdown (v3.12.0)

EXPLAIN ANALYZE
SELECT * FROM orders ORDER BY id LIMIT 5 OFFSET 990;

Pre-3.12, this materialised 995+ rows then dropped 990. Since v3.12, storage::EmbeddedStorage::scan_table_with_offset_limit skips offset rows at the RocksDB iterator level — no bincode decode, no decrypt, no dictionary/CAS resolve. The plan reads:

QUERY PLAN
-------------------------------------------------------------
Limit (cost=… rows=5 width=24)
-> Seq Scan (offset-pushdown) on orders actual rows=5
OFFSET-skipped: 990

actual rows=5 (not 995) is the diagnostic: the physical scan only deserialised 5 rows. For deeper pagination, prefer keyset — see KEYSET_PAGINATION_QUICKREF.


11. EXPLAIN (FORMAT JSON | XML | YAML)

EXPLAIN (FORMAT JSON) SELECT * FROM users WHERE id = 42;
[
{
"Plan": {
"Node Type": "Index Scan",
"Relation Name": "users",
"Index Name": "users_pkey",
"Index Cond": "(id = 42)",
"Total Cost": 8.17,
"Plan Rows": 1
}
}
]

JSON is what you parse from a CI gate; YAML is what you diff in code review; XML is what legacy tooling expects. All three round-trip identically.


12. Storage Layer Visibility

EXPLAIN (STORAGE) SELECT * FROM orders WHERE id = 42;
QUERY PLAN
-------------------------------------------------------------
Index Scan using orders_pkey on orders
Index Cond: (id = 42)
Storage:
ART index point-lookup : yes (zero-copy)
RocksDB bloom filter : 14-bit (whole-key)
Block size : 16 KB
Prefix extractor : table-id (8 bytes)
Compression : LZ4
Full-text search (tsvector/@@/ts_rank_cd) : yes (unstemmed, no phrase)

This reveals the engine choices — useful for “why is my query 10× slower in prod than dev” investigations.


13. Index Recommender

EXPLAIN (INDEXES) SELECT * FROM orders WHERE status = 'open';
INDEX ANALYSIS
Used: orders_pkey (no — sequential scan was cheaper)
Recommended: CREATE INDEX orders_status_idx ON orders (status)
Estimated speedup: 12.4×
Reason: WHERE on low-NDV column, 80% rows excluded
ROI score: 78

The recommender (src/sql/index_recommender.rs) calculates a ROI score from estimated speedup, affected query count, and storage cost. Apply, re-EXPLAIN, verify.


14. Workflow — Use Plans to Drive Index Design

The end-to-end loop:

  1. Run EXPLAIN (ANALYZE, INDEXES, WHY_NOT) <slow query>.
  2. If WHY_NOT reports a blocker (function call, type mismatch), rewrite the predicate.
  3. If INDEXES recommends a CREATE, apply it.
  4. Re-EXPLAIN — confirm Index Scan appears.
  5. Confirm Execution Time dropped.

Concrete example — a query that started as WHERE LOWER(email) = $1:

-- Step 1
EXPLAIN (ANALYZE, WHY_NOT) SELECT * FROM users WHERE LOWER(email) = 'u42@x.com';
-- WHY_NOT: function call on indexed column blocks index
-- Step 2: rewrite (emails are stored canonical-cased)
EXPLAIN (ANALYZE) SELECT * FROM users WHERE email = 'u42@x.com';
-- Index Scan using users_email_key
-- Or: keep the LOWER() and add a functional index
CREATE INDEX users_email_lower ON users (LOWER(email));

15. Keyset Pagination Plans (v3.12+)

EXPLAIN ANALYZE
SELECT id, created_at, name
FROM users
WHERE (created_at, id) < ($1, $2)
ORDER BY created_at DESC, id DESC
LIMIT 20;

The row-constructor predicate is recognised and evaluated lexicographically (new LogicalExpr::Tuple, v3.12). Plan reads:

Limit (Top-K) k=20
-> Sort Sort Key: created_at DESC, id DESC
-> Filter: ((created_at, id) < (…, …))
-> Seq Scan on users

If (created_at, id) is a btree index, the filter pushes through to Index Scan — see KEYSET_PAGINATION_QUICKREF.


Troubleshooting

SymptomCauseFix
EXPLAIN ANALYZE shows actual rows=N >> plan rowsStats staleRun ANALYZE <table> to refresh
Top-K doesn’t fireLIMIT with non-constant boundUse a literal or bound parameter
OFFSET pushdown not visibleMixed predicates block the iteratorMove filters into the table scan or use keyset
Index Scan not chosen on small tablePlanner picks Seq Scan for tiny relationsExpected — Seq Scan wins under ~1000 rows
WHY_NOT says “function on column”LOWER/UPPER/CAST in WHERERewrite predicate or add functional index

Where Next


CHANGELOG references: v3.6.0 (fast paths + bench wins vs PG13), v3.12.0 (Top-K + storage-level OFFSET pushdown + row-constructor compare), v3.19.0 (centrality-biased HNSW prefilter wrapper), v3.19.1 (current verified release).