EXPLAIN and Optimization Tutorial

Available since: v3.6.0 (base EXPLAIN), expanded in v3.12.0 (Top-K, storage-level OFFSET pushdown), refined through v3.19.1. Build: default — all features in this tutorial run on the standard heliosdb-nano binary. Surface: PostgreSQL-compatible EXPLAIN plus HeliosDB extensions (STORAGE, AI, WHY_NOT, INDEXES, STATISTICS).

UVP

PostgreSQL EXPLAIN tells you a tree. HeliosDB Nano EXPLAIN tells you the tree and which storage features fired (bloom filters, zone maps, RocksDB prefix seek, ART index point-lookup), why optimisations weren’t applied (WHY_NOT), and what indexes you should add (INDEXES). All four output formats — TEXT, JSON, XML, YAML — are wire-compatible with PostgreSQL clients, so existing tooling (DataGrip, pgcli, dbeaver) renders them without changes. One EXPLAIN, three layers visible: logical plan, physical operators, storage engine.

When the plan rewrites — Top-K instead of full sort, storage-level OFFSET pushdown instead of materialise-and-skip — EXPLAIN ANALYZE shows you the actual rows touched, not the row count after the operator. That’s the difference between a 30 µs query and a 30 ms one on the same SQL.

Prerequisites

HeliosDB Nano v3.6+ (v3.12+ for Top-K and storage-level OFFSET pushdown)
A PostgreSQL client
~20 minutes

1. The EXPLAIN Surface

-- Core PostgreSQL forms
EXPLAIN SELECT * FROM users;
EXPLAIN ANALYZE SELECT * FROM users WHERE id = 1;
EXPLAIN (ANALYZE, VERBOSE) SELECT * FROM users;
EXPLAIN (FORMAT JSON) SELECT * FROM users;
EXPLAIN (FORMAT XML)  SELECT * FROM users;
EXPLAIN (FORMAT YAML) SELECT * FROM users;

-- HeliosDB extensions
EXPLAIN (STORAGE)             SELECT * FROM orders;
EXPLAIN (INDEXES)             SELECT * FROM orders WHERE status = 'open';
EXPLAIN (WHY_NOT)             SELECT * FROM orders WHERE LOWER(name) = 'a';
EXPLAIN (STATISTICS)          SELECT * FROM orders;
EXPLAIN (AI)                  SELECT * FROM orders;     -- natural-language explainer
EXPLAIN (ANALYZE, STORAGE, WHY_NOT) SELECT * FROM orders;

The full option set lives in src/sql/explain_options.rs:

Option	What it does
`ANALYZE`	Execute the query and report actual rows + timing
`VERBOSE`	Output column lists, source tables, full expression trees
`FORMAT {TEXT, JSON, XML, YAML, TREE}`	Output encoding
`COSTS`	Show cost estimates (default ON)
`BUFFERS`	Storage I/O counters (requires ANALYZE)
`TIMING`	Per-operator timing (default ON when ANALYZE is set)
`SUMMARY`	Summary block at end
`STORAGE`	Bloom filter usage, zone maps, compression codec, prefix seek
`AI`	Natural-language explanation of the plan
`WHY_NOT`	Why the planner did NOT use a faster path (e.g., function in WHERE blocks index)
`INDEXES`	Used vs unused indexes; recommendations
`STATISTICS`	Row counts, NDV, histogram availability

2. Setting Up a Workload

CREATE TABLE users (
    id          SERIAL PRIMARY KEY,
    email       TEXT UNIQUE,
    name        TEXT,
    status      TEXT,
    created_at  TIMESTAMP DEFAULT now()
);

CREATE TABLE orders (
    id          SERIAL PRIMARY KEY,
    user_id     INTEGER REFERENCES users(id),
    total       NUMERIC(10,2),
    status      TEXT,
    created_at  TIMESTAMP DEFAULT now()
);

INSERT INTO users (email, name, status)
SELECT 'u' || g || '@x.com', 'User ' || g, CASE WHEN g % 4 = 0 THEN 'inactive' ELSE 'active' END
FROM generate_series(1, 10000) AS g;

INSERT INTO orders (user_id, total, status)
SELECT (g % 10000) + 1, (random() * 1000)::numeric(10,2),
       CASE WHEN g % 5 = 0 THEN 'cancelled' ELSE 'open' END
FROM generate_series(1, 50000) AS g;

3. Plan #1 — Simple SELECT

EXPLAIN SELECT * FROM users WHERE id = 42;

                          QUERY PLAN
-------------------------------------------------------------
 Index Scan using users_pkey on users  (cost=0.15..8.17 rows=1 width=64)
   Index Cond: (id = 42)

The PK lookup is served by the ART index (zero-copy pk_index_lookup, see src/storage/). EXPLAIN ANALYZE confirms it took microseconds, not milliseconds:

EXPLAIN ANALYZE SELECT * FROM users WHERE id = 42;

                          QUERY PLAN
-------------------------------------------------------------
 Index Scan using users_pkey on users  (cost=0.15..8.17 rows=1 width=64)
   Index Cond: (id = 42)
   Planning Time: 0.123 ms
   Execution Time: 0.041 ms

When the index is not used

EXPLAIN (WHY_NOT) SELECT * FROM users WHERE LOWER(email) = 'u42@x.com';

WHY_NOT reports: function call on the indexed column blocks the index. Fix: WHERE email = LOWER('u42@x.com') or a functional index.

4. Plan #2 — JOIN

EXPLAIN SELECT u.name, o.total
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.status = 'active' AND o.total > 500;

                              QUERY PLAN
---------------------------------------------------------------------
 Hash Join  (cost=312.80..1410.50 rows=3000 width=72)
   Hash Cond: (o.user_id = u.id)
   ->  Seq Scan on orders o  (cost=0.00..930.00 rows=10000 width=12)
         Filter: (total > 500)
   ->  Hash  (cost=270.00..270.00 rows=7500 width=64)
         ->  Seq Scan on users u  (cost=0.00..270.00 rows=7500 width=64)
               Filter: (status = 'active')

Hash join is the planner’s default for inner equi-joins; the build side is the smaller cardinality. Note JoinKey uses cross-type numeric coercion (Int4 ↔ Int8) since v3.8.2.

5. Plan #3 — GROUP BY

EXPLAIN SELECT user_id, COUNT(*) AS n, SUM(total) AS revenue
FROM orders
WHERE status = 'open'
GROUP BY user_id
HAVING COUNT(*) > 5;

                              QUERY PLAN
---------------------------------------------------------------------
 HashAggregate  (cost=1230.00..1280.00 rows=2000 width=24)
   Group Key: user_id
   Filter: (count(*) > 5)
   ->  Seq Scan on orders  (cost=0.00..930.00 rows=40000 width=12)
         Filter: (status = 'open')

6. Plan #4 — HNSW Vector Search

CREATE TABLE chunks (
    id        SERIAL PRIMARY KEY,
    embedding VECTOR(384)
);
CREATE INDEX chunks_emb_hnsw ON chunks USING hnsw (embedding vector_cosine_ops);

EXPLAIN SELECT id
FROM chunks
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;

                          QUERY PLAN
-------------------------------------------------------------
 Limit  (cost=0.00..1.50 rows=10 width=8)
   ->  Index Scan using chunks_emb_hnsw on chunks  (cost=0.00..15000.00 rows=N width=8)
         Order By: (embedding <=> '[…]'::vector)

The HNSW operator is invoked directly — no full sort, the heap is bounded at k = LIMIT.

7. Plan #5 — BM25 Full-Text

EXPLAIN SELECT id, title,
       ts_rank_cd(to_tsvector(body), to_tsquery('helios')) AS rank
FROM articles
WHERE to_tsvector(body) @@ to_tsquery('helios')
ORDER BY rank DESC
LIMIT 10;

                          QUERY PLAN
-------------------------------------------------------------
 Limit  (cost=… rows=10)
   ->  Sort
         Sort Key: rank DESC
         ->  Filter: (to_tsvector(body) @@ to_tsquery('helios'))
               ->  Seq Scan on articles

USING gin is accepted DDL but doesn’t surface as an Index Scan here — see FULL_TEXT_SEARCH_TUTORIAL for the rationale. The Sort collapses to Top-K automatically (next section).

8. Plan #6 — Hybrid (BM25 + Vector)

EXPLAIN SELECT id, text,
       0.7 * (1.0 - (embedding <=> $1::vector))
     + 0.3 * ts_rank_cd(to_tsvector(text), plainto_tsquery($2)) AS score
FROM chunks
WHERE tenant_id = $3
  AND (embedding <=> $1::vector) < 0.8
ORDER BY score DESC
LIMIT 10;

The plan composes: tenant filter → vector proximity cut (HNSW operator) → score expression → Top-K. The HNSW prefilter wrapper (added v3.19.0, #190) over-fetches k * over_fetch_multiplier, applies the row-level prefilter, then re-scores.

9. Top-K Rewrite

When the plan is Limit(Sort(…)) or Limit(Project(Sort(…))), TopKOperator (src/sql/executor/topk.rs) replaces the full sort with a bounded max-heap of size k = limit + offset:

EXPLAIN ANALYZE
SELECT * FROM orders ORDER BY total DESC LIMIT 5;

                          QUERY PLAN
-------------------------------------------------------------
 Limit (Top-K)  (cost=… rows=5 width=24)
   ->  TopK  k=5
         Sort Key: total DESC
         ->  Seq Scan on orders  (rows=50000)

Complexity drops from O(N log N) to O(N log k) and memory from O(N) to O(k). The rewrite kicks in automatically whenever LIMIT has a concrete bound — no hint required.

10. Storage-Level OFFSET Pushdown

EXPLAIN ANALYZE
SELECT * FROM orders ORDER BY id LIMIT 5 OFFSET 990;

Pre-3.12, this materialised 995+ rows then dropped 990. Since v3.12, storage::EmbeddedStorage::scan_table_with_offset_limit skips offset rows at the RocksDB iterator level — no bincode decode, no decrypt, no dictionary/CAS resolve. The plan reads:

                          QUERY PLAN
-------------------------------------------------------------
 Limit  (cost=… rows=5 width=24)
   ->  Seq Scan (offset-pushdown)  on orders   actual rows=5
         OFFSET-skipped: 990

actual rows=5 (not 995) is the diagnostic: the physical scan only deserialised 5 rows. For deeper pagination, prefer keyset — see KEYSET_PAGINATION_QUICKREF.

11. EXPLAIN (FORMAT JSON | XML | YAML)

EXPLAIN (FORMAT JSON) SELECT * FROM users WHERE id = 42;

[
  {
    "Plan": {
      "Node Type": "Index Scan",
      "Relation Name": "users",
      "Index Name": "users_pkey",
      "Index Cond": "(id = 42)",
      "Total Cost": 8.17,
      "Plan Rows": 1
    }
  }
]

JSON is what you parse from a CI gate; YAML is what you diff in code review; XML is what legacy tooling expects. All three round-trip identically.

12. Storage Layer Visibility

EXPLAIN (STORAGE) SELECT * FROM orders WHERE id = 42;

                          QUERY PLAN
-------------------------------------------------------------
 Index Scan using orders_pkey on orders
   Index Cond: (id = 42)
   Storage:
     ART index point-lookup        : yes (zero-copy)
     RocksDB bloom filter          : 14-bit (whole-key)
     Block size                    : 16 KB
     Prefix extractor              : table-id (8 bytes)
     Compression                   : LZ4
     Full-text search (tsvector/@@/ts_rank_cd)  : yes (unstemmed, no phrase)

This reveals the engine choices — useful for “why is my query 10× slower in prod than dev” investigations.

13. Index Recommender

EXPLAIN (INDEXES) SELECT * FROM orders WHERE status = 'open';

 INDEX ANALYSIS
   Used: orders_pkey  (no — sequential scan was cheaper)
   Recommended: CREATE INDEX orders_status_idx ON orders (status)
     Estimated speedup: 12.4×
     Reason: WHERE on low-NDV column, 80% rows excluded
     ROI score: 78

The recommender (src/sql/index_recommender.rs) calculates a ROI score from estimated speedup, affected query count, and storage cost. Apply, re-EXPLAIN, verify.

14. Workflow — Use Plans to Drive Index Design

The end-to-end loop:

Run EXPLAIN (ANALYZE, INDEXES, WHY_NOT) <slow query>.
If WHY_NOT reports a blocker (function call, type mismatch), rewrite the predicate.
If INDEXES recommends a CREATE, apply it.
Re-EXPLAIN — confirm Index Scan appears.
Confirm Execution Time dropped.

Concrete example — a query that started as WHERE LOWER(email) = $1:

-- Step 1
EXPLAIN (ANALYZE, WHY_NOT) SELECT * FROM users WHERE LOWER(email) = 'u42@x.com';
-- WHY_NOT: function call on indexed column blocks index

-- Step 2: rewrite (emails are stored canonical-cased)
EXPLAIN (ANALYZE) SELECT * FROM users WHERE email = 'u42@x.com';
-- Index Scan using users_email_key

-- Or: keep the LOWER() and add a functional index
CREATE INDEX users_email_lower ON users (LOWER(email));

15. Keyset Pagination Plans (v3.12+)

EXPLAIN ANALYZE
SELECT id, created_at, name
FROM users
WHERE (created_at, id) < ($1, $2)
ORDER BY created_at DESC, id DESC
LIMIT 20;

The row-constructor predicate is recognised and evaluated lexicographically (new LogicalExpr::Tuple, v3.12). Plan reads:

 Limit (Top-K)  k=20
   ->  Sort   Sort Key: created_at DESC, id DESC
         ->  Filter: ((created_at, id) < (…, …))
               ->  Seq Scan on users

If (created_at, id) is a btree index, the filter pushes through to Index Scan — see KEYSET_PAGINATION_QUICKREF.

Troubleshooting

Symptom	Cause	Fix
`EXPLAIN ANALYZE` shows `actual rows=N` >> plan rows	Stats stale	Run `ANALYZE <table>` to refresh
`Top-K` doesn’t fire	`LIMIT` with non-constant bound	Use a literal or bound parameter
OFFSET pushdown not visible	Mixed predicates block the iterator	Move filters into the table scan or use keyset
`Index Scan` not chosen on small table	Planner picks `Seq Scan` for tiny relations	Expected — `Seq Scan` wins under ~1000 rows
`WHY_NOT` says “function on column”	LOWER/UPPER/CAST in WHERE	Rewrite predicate or add functional index

Where Next

KEYSET_PAGINATION_QUICKREF — the pagination shape EXPLAIN tells you to use.
WINDOW_FUNCTIONS_QUICKREF — see how window plans look.
FULL_TEXT_SEARCH_TUTORIAL — BM25 plans and hybrid retrieval.
VECTOR_SEARCH_TUTORIAL — HNSW operator surface.

CHANGELOG references: v3.6.0 (fast paths + bench wins vs PG13), v3.12.0 (Top-K + storage-level OFFSET pushdown + row-constructor compare), v3.19.0 (centrality-biased HNSW prefilter wrapper), v3.19.1 (current verified release).