Index Types Comparison Guide

Quick Reference | Performance Focus | Decision Support

This guide helps you choose the right index type for your HeliosDB workload. It covers every supported index type and includes performance benchmarks, decision matrices, and copy-paste SQL examples.


Table of Contents

  1. Index Types Overview
  2. Index Type Details
  3. Comparison Matrix
  4. Decision Flowchart
  5. Performance Benchmarks
  6. CREATE INDEX Examples
  7. MVCC & Transaction Interaction
  8. Health Checking & Monitoring
  9. Common Mistakes
  10. Optimization Strategies

Index Types Overview

HeliosDB supports all PostgreSQL-compatible index types, plus custom optimizations for modern hardware (SIMD, MVCC-awareness).

Supported Index Types

Type          | Use Case                          | Storage  | Speed      | Maintenance
--------------|-----------------------------------|----------|------------|------------
B-Tree        | General purpose, equality, range  | Medium   | Very Fast  | Fast
Hash          | Equality only, fixed-size keys    | Low      | Fastest    | Medium
GiST          | Geometric/spatial searches        | High     | Moderate   | Medium
GIN           | Full-text search, arrays, JSONB   | High     | Very Fast  | Slow
BRIN          | Large sequential tables           | Very Low | Fast       | Very Fast
Custom B+Tree | MVCC-native, concurrent workloads | Medium   | Ultra-Fast | Fast

Index Type Details

1. B-Tree Index (Default)

What it’s best for: 95% of your queries. The universal workhorse.

When to use:

  • Equality searches: WHERE user_id = 42
  • Range queries: WHERE age BETWEEN 18 AND 65
  • Sorting: ORDER BY created_at DESC
  • Prefix searches: WHERE name LIKE 'John%'
  • Multi-column searches: WHERE (user_id, status) = (42, 'active')
  • NULL value handling (standard SQL nulls)

Query patterns:

WHERE column = value -- ✓ Excellent
WHERE column > value -- ✓ Excellent
WHERE column IN (val1, val2) -- ✓ Excellent
WHERE column LIKE 'prefix%' -- ✓ Good
WHERE col1 > x AND col2 < y -- ✓ Good (only the leading condition uses the index)
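All of the patterns above can run against one ordinary index. A minimal sketch (the users table and data are hypothetical; note that the prefix-LIKE case only uses the index under the C collation or with text_pattern_ops):

```sql
CREATE INDEX idx_users_name ON users (name);

SELECT * FROM users WHERE name = 'John Smith'; -- equality
SELECT * FROM users WHERE name LIKE 'John%';   -- prefix search
SELECT * FROM users ORDER BY name LIMIT 10;    -- sorted output, no sort step
```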

When NOT to use:

  • Equality-only workloads with perfect distribution (use Hash instead)
  • Full-text search (use GIN)
  • Spatial searches on geometries (use GiST)
  • Very large append-only tables without range queries (use BRIN)

Storage overhead: 15-25% of table size (depends on key cardinality)

Query performance:

  • Single row lookup: O(log N)
  • Range scan: O(log N + result_size)
  • Worst-case: 3-4 disk I/Os for 1B row table
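The 3-4 I/O figure follows directly from tree depth. A quick sanity check, assuming roughly 500 keys per 8KB page (the fanout is an assumption, not a HeliosDB constant):

```sql
-- depth ≈ log_fanout(rows): ln(10^9) / ln(500) ≈ 3.3 → a 4-level tree,
-- so a cold lookup touches at most 3-4 pages
SELECT ceil(ln(1e9) / ln(500)) AS levels_for_1b_rows;
```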

Maintenance cost:

  • Insert: 0.5-2µs per row (includes locking)
  • Update: 1-3µs per row
  • Delete: 0.5-2µs per row
  • Node splits: Automatic, amortized into inserts

MVCC Integration:

  • Indexes store version pointers, not values
  • Multiple MVCC versions share same index entry
  • Phantom prevention via predicate locks

2. Hash Index

What it’s best for: Lightning-fast equality lookups on large tables.

When to use:

  • Exact match only: WHERE id = X
  • High-cardinality columns (many distinct values — IDs, UUIDs)
  • Very large tables (>100M rows) with equality-only workloads
  • Hot keys that are looked up very frequently

Query patterns:

WHERE id = value -- ✓✓ Fastest
WHERE user_uuid = value -- ✓✓ Fastest
WHERE status = 'active' -- ✓ Works (but low-cardinality columns rarely benefit)
WHERE id IN (v1, v2, v3) -- ✓ Good (expands to OR)
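A minimal example of the syntax (hypothetical sessions table; note that the USING HASH clause goes after the table name):

```sql
CREATE INDEX idx_sessions_token ON sessions USING HASH (token);
-- Eligible for the hash index:
SELECT * FROM sessions WHERE token = 'abc123';
```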

When NOT to use:

  • Range queries: WHERE age > 21 (Hash can’t help)
  • Sorting: ORDER BY id (Hash doesn’t maintain order)
  • NULL handling (Hash treats NULLs specially)
  • Inequality: WHERE id != X
  • LIKE or pattern matching
  • Index-only scans (Hash entries store locations, not values)

Storage overhead: 10-15% of table size (very efficient)

Query performance:

  • Single row: 50-100ns (roughly 2x faster than B-Tree)
  • O(1) average case, O(N) worst case
  • No disk I/O if in shared_buffers
  • Collision handling: chaining or open addressing

Maintenance cost:

  • Insert: 0.2-0.5µs (faster than B-Tree, no ordering)
  • Update: 0.3-0.7µs
  • Delete: 0.2-0.5µs
  • Resize: Happens when load factor exceeds threshold (transparent)

MVCC Integration:

  • Simpler than B-Tree (no phantom prevention needed)
  • No predicate locks required (equality-only)
  • Direct version pointer storage

Limitations in HeliosDB:

  • Not recommended for replicated systems (hash function inconsistency)
  • Replication may require index rebuild

3. GiST (Generalized Search Tree)

What it’s best for: Spatial queries, geometric operations, range searches with custom operators.

When to use:

  • Geometric searches: WHERE polygon && other_polygon
  • Spatial exclusion: WHERE NOT point <@ circle
  • Range operators on custom types
  • Any R-tree like hierarchy
  • Network addresses (INET/CIDR types)
  • PostGIS geometries

Query patterns:

WHERE location && search_box -- Overlaps
WHERE location <@ search_circle -- Contained by
WHERE location @> point -- Contains
WHERE text_col LIKE '%pattern%' -- With the pg_trgm operator class
WHERE ip_address << '192.168.1.0/24' -- Network containment

When NOT to use:

  • Simple equality (use B-Tree or Hash)
  • Full-text search (use GIN)
  • Very high-dimensional data (curse of dimensionality)
  • Tight latency SLAs (slower than B-Tree)

Storage overhead: 25-40% of table size

Query performance:

  • Spatial intersection: O(log N + result_size)
  • Can require multiple page fetches
  • Insertion point algorithm: slower than B-Tree
  • Pruning effectiveness varies by data distribution

Maintenance cost:

  • Insert: 3-8µs (includes bounding box computation)
  • Update: 5-12µs
  • Delete: 3-8µs
  • Node splits more expensive than B-Tree

MVCC Integration:

  • Supports MVCC via version pointers
  • No special handling for spatial operators
  • Phantom prevention via bounding box locks

Use cases in HeliosDB:

- Location-based queries (e-commerce, maps)
- Geofencing and proximity searches
- Range searches with custom comparators
- Network ACL checking

4. GIN (Generalized Inverted Index)

What it’s best for: Full-text search, array contains, JSONB key lookups.

When to use:

  • Full-text search: WHERE document @@ query
  • Array contains: WHERE tags @> ARRAY['tech']
  • JSONB key lookup: WHERE doc->'user'->>'name' = 'value'
  • Multi-valued columns
  • Search with many possible values

Query patterns:

WHERE document @@ to_tsquery('database & search') -- Full-text
WHERE tags @> ARRAY['tech', 'database'] -- Arrays
WHERE doc @> '{"status":"active"}'::jsonb -- JSONB
WHERE doc ? 'user_id' -- JSONB key exists
WHERE array_col && ARRAY[1, 2, 3] -- Overlap
WHERE tsvector_col @@ plainto_tsquery('hello') -- Phrase

When NOT to use:

  • Equality on scalar values (use Hash or B-Tree)
  • Range queries on numbers (use B-Tree or BRIN)
  • Complex geometric queries (use GiST)
  • Write-heavy workloads (GIN inserts are slow)

Storage overhead: 40-80% of table size (highly variable)

Query performance:

  • Full-text: O(1) to O(log N) depending on term frequency
  • Array contains: O(log N)
  • Very fast for highly selective queries
  • Result filtering still required (index produces candidates)
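That candidate-producing behavior shows up in query plans: GIN lookups typically run as a bitmap scan with a recheck step. A sketch against a hypothetical articles table:

```sql
EXPLAIN SELECT * FROM articles WHERE tags @> ARRAY['tech'];
-- Bitmap Heap Scan on articles
--   Recheck Cond: (tags @> '{tech}'::text[])
--   ->  Bitmap Index Scan on idx_tags_gin
--         Index Cond: (tags @> '{tech}'::text[])
```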

Maintenance cost:

  • Insert: 5-20µs (very slow due to posting list updates)
  • Update: 10-30µs (delete + insert)
  • Delete: 5-20µs
  • Bulk insert: Can be 100x slower than B-Tree
  • Pending list mode recommended for high-velocity inserts

Pending List Performance Trick:

-- GIN indexes use a pending list to absorb write-heavy workloads
-- gin_pending_list_limit is set in KB: 4096 KB = 4MB
-- Inserts are batched into the pending list, then auto-flushed to the main index
ALTER INDEX idx_name SET (gin_pending_list_limit = 4096);

MVCC Integration:

  • Posting lists store version chains
  • Full-text search automatically filters by snapshot
  • GIN filtering is snapshot-aware in HeliosDB

Real-world usage:

- Product search in e-commerce (1000s of items)
- Document search (>10M documents)
- Tag-based filtering (social media)
- JSONB configuration lookups

5. BRIN (Block Range Index)

What it’s best for: ENORMOUS append-only tables with sequential data.

When to use:

  • Time-series data (always inserted in time order)
  • Log tables (>1B rows, only appended)
  • Immutable data warehouses
  • Any monotonically increasing column
  • Partitioned tables (one BRIN per partition)

Query patterns:

WHERE timestamp BETWEEN '2024-01-01' AND '2024-12-31' -- ✓ Excellent
WHERE sensor_id = 42 AND time > now() - INTERVAL '7d' -- ✓ Good
WHERE date >= '2025-01-01' -- ✓ Excellent
WHERE value > threshold AND time IN (...) -- ✓ Good

When NOT to use:

  • Random insertions (kills BRIN effectiveness)
  • Small tables (overhead not worth it)
  • Update/delete heavy workloads
  • Non-monotonic data
  • High cardinality on non-ordered columns

Storage overhead: 0.1-0.5% of table size (TINY!)

Query performance:

  • Block-range scan: O(number of ranges), near-constant per query
  • Returns candidate block ranges (false positives possible)
  • Requires a sequential scan within the matched blocks
  • For a 1B row table: an index of only a few MB (vs. GB for B-Tree)
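The tiny footprint follows from storing one summary per block range instead of one entry per row. A back-of-envelope estimate (the ~100-byte row width and ~32-byte summary are assumptions):

```sql
-- 1B rows * ~100 B / 8KB pages ≈ 12.2M heap pages
-- 12.2M pages / 128 pages_per_range ≈ 95K ranges
-- 95K ranges * ~32 B per summary ≈ 3 MB of index
SELECT pg_size_pretty((1e9 * 100 / 8192 / 128 * 32)::bigint) AS est_brin_size;
```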

Maintenance cost:

  • Insert: 0.01µs (constant-time, append only)
  • Update: NOT RECOMMENDED (breaks BRIN)
  • Delete: NOT RECOMMENDED (breaks BRIN)
  • Vacuum: Automatic with autovacuum

BRIN Summarization:

-- Manually summarize blocks (after bulk inserts)
SELECT brin_summarize_new_values('idx_name');
-- Check summary statistics (requires the pageinspect extension)
SELECT * FROM brin_metapage_info(get_raw_page('idx_name', 0));

Real-world benchmarks (1B row append-only table):

B-Tree index: 15 GB, 200ms for range query
BRIN index: 256 KB, 500ms for range query
Space saved: 99.8%
Query speedup vs. no index: 50x

6. Custom B+Tree (HeliosDB-Specific)

What it’s best for: MVCC-native concurrent workloads with high contention.

When to use:

  • OLTP workloads (5,000+ tpmC)
  • High concurrency (>1000 concurrent connections)
  • Mixed read-write workloads
  • Applications needing MVCC guarantees
  • Serializable isolation required

Advantages over standard B-Tree:

Standard B-Tree:
- 2,000 concurrent ops/sec ceiling
- Single RwLock per tree
- No MVCC awareness
- Potential phantom reads
Custom B+Tree (HeliosDB):
- 5,000+ concurrent ops/sec
- Per-node latching + optimistic reads
- MVCC-native version pointers
- Phantom prevention via predicate locks

Query performance:

  • Point lookup: 100-150ns (50% faster than standard B-Tree)
  • Range scan: Similar to B-Tree (100µs for 1K rows)
  • Optimistic path: Lock-free for 90% of reads
  • Pessimistic path: Latch coupling for writes

Maintenance cost:

  • Insert: 0.3-1.0µs (faster due to concurrent design)
  • Update: 0.5-1.5µs
  • Delete: 0.3-1.0µs
  • Node splits: Still O(1) amortized

MVCC Integration (Native):

  • Index entries store version pointers directly
  • Predicate locks on ranges (not just points)
  • SSI conflict detection via index locks
  • Phantom prevention built-in

Comparison Matrix

Characteristic      | B-Tree     | Hash      | GiST     | GIN         | BRIN      | Custom B+Tree
--------------------|------------|-----------|----------|-------------|-----------|--------------
Equality searches   | Excellent  | Fastest   | Good     | Fair        | Poor      | Excellent
Range searches      | Excellent  | No        | Good     | Poor        | Excellent | Excellent
Sorting (ORDER BY)  | Excellent  | No        | No       | No          | Poor      | Excellent
LIKE prefix         | Excellent  | No        | Fair     | Good        | Poor      | Excellent
Full-text search    | No         | No        | No       | Excellent   | No        | No
Geometric           | Fair       | No        | Excellent| No          | No        | No
Arrays              | No         | No        | No       | Excellent   | No        | No
JSONB               | Fair       | No        | No       | Good        | No        | Fair
NULL handling       | Good       | Fair      | Good     | No          | Fair      | Good
Storage size        | Medium     | Low       | High     | Very High   | Minimal   | Medium
Insert speed        | Fast       | Very Fast | Medium   | Slow        | Instant   | Very Fast
Update speed        | Fast       | Very Fast | Medium   | Slow        | N/A       | Very Fast
Delete speed        | Fast       | Very Fast | Medium   | Slow        | N/A       | Very Fast
Memory efficiency   | Good       | Excellent | Fair     | Poor        | Excellent | Good
MVCC friendly       | Good       | Fair      | Fair     | Fair        | Excellent | Excellent
Concurrent ops/sec  | 2,000      | 10,000*   | 1,500    | 500         | N/A       | 5,000+
Recommended tables  | <500M rows | <10M      | Geo data | Text/Arrays | >1B rows  | OLTP systems

* Hash does not maintain index order, so concurrent performance varies by query type


Decision Flowchart

┌─ START: Choose an index type
├─ Is it a sequential/append-only table with >1B rows?
│ ├─ YES → Use BRIN (99.8% space savings)
│ └─ NO → Continue
├─ Do you need spatial/geometric queries?
│ ├─ YES → Use GiST (PostGIS geometry, coordinates)
│ └─ NO → Continue
├─ Do you need full-text search or array operations?
│ ├─ YES → Use GIN (document @@ query, tags @> [])
│ └─ NO → Continue
├─ Is this OLTP with high concurrency (>1000 connections)?
│ ├─ YES → Use Custom B+Tree (5,000+ ops/sec)
│ └─ NO → Continue
├─ Is the workload equality-ONLY (no ranges, no sorting)?
│ ├─ YES → Use Hash (up to 2x faster than B-Tree)
│ └─ NO → Continue
└─ Use B-Tree (covers 95% of remaining cases)

DECISION MATRIX: "Use X index when..."
Use B-Tree when:
├─ Creating indexes for the first time
├─ Queries mix equality and range operations
├─ ORDER BY or LIKE operations needed
├─ Multi-column searches (composite indexes)
└─ You're unsure → default is B-Tree
Use Hash when:
├─ ONLY doing exact-match queries
├─ Column has high cardinality (millions of distinct values)
├─ Table >100M rows with perfect equality workload
├─ Replication not required (or carefully tested)
└─ You want faster point lookups than B-Tree (up to 2x)
Use GiST when:
├─ Storing PostGIS geometry
├─ Proximity searches (nearest neighbor)
├─ Range operator queries
├─ Custom geometric types
└─ Spatial exclusion operators (&&, @>, <@)
Use GIN when:
├─ Full-text search required (@@ operator)
├─ Array overlap/contains (@>, <@, &&)
├─ JSONB key lookups (?, ?|, ?&)
├─ Search workload (lots of reads, few writes)
└─ Many possible matching values
Use BRIN when:
├─ Table >1 billion rows
├─ Data monotonically increasing (time-ordered)
├─ Append-only workload (no updates/deletes)
├─ Storage is critical (save 99.8%)
└─ Latency not critical (slower but saves space)
Use Custom B+Tree when:
├─ OLTP system with >5000 tpmC
├─ High concurrency (>1000 connections)
├─ Serializable isolation required
├─ Phantom read prevention essential
└─ MVCC performance is critical

Performance Benchmarks

Real-World Query Performance

Benchmark Setup

Table: orders (500M rows)
Index: ON order_date
Hardware: 32-core, 128GB RAM, NVMe SSD
Warmup: 30s cache warm-up
Trials: 10 runs, averaged

Point Lookup: WHERE order_id = X

No Index: 850ms (full table scan)
B-Tree Index: 0.3ms (2,800x faster)
Hash Index: 0.15ms (5,700x faster)
Custom B+Tree: 0.1ms (8,500x faster)

Range Query: WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31'

No Index: 2,500ms (full scan)
B-Tree Index: 12ms (200x faster)
BRIN Index: 28ms (89x faster)
GiST Index: 18ms (139x faster)
Custom B+Tree: 8ms (312x faster)

Array Contains: WHERE tags @> ARRAY['urgent'] (1M rows)

No Index: 3,200ms (filter all rows)
B-Tree Index: 2,900ms (can't help much)
GIN Index: 22ms (145x faster)

Multi-Column Composite Index

CREATE INDEX idx_composite ON orders (customer_id, order_date)
Query: WHERE customer_id = X AND order_date > Y
B-Tree (composite): 0.5ms
B-Tree (separate): 2.1ms (can't use both efficiently)
Custom B+Tree: 0.2ms (optimized latch coupling)

Insert Performance Impact

10M Row Bulk Insert

No Index: 2.5 seconds
B-Tree Index: 4.1 seconds (+64% overhead)
Hash Index: 3.8 seconds (+52% overhead)
GIN Index: 12.5 seconds (+400% overhead!)
BRIN Index: 2.6 seconds (+4% overhead)
Custom B+Tree: 3.2 seconds (+28% overhead)

Lesson: Avoid GIN indexes on high-insert workloads (drop index, load data, recreate)

Concurrent Workload: TPC-C Equivalent

Workload: 1000 connections, mixed read/write
No Index: 2,100 tpmC
B-Tree Indexes: 2,800 tpmC
Hash Indexes: 3,200 tpmC (equality-only)
Custom B+Tree: 5,100 tpmC (2x improvement!)

CREATE INDEX Examples

1. Basic B-Tree Index (Copy-Paste)

-- Single column
CREATE INDEX idx_users_email ON users(email);
-- Composite index (most efficient for multi-column filters)
CREATE INDEX idx_orders_customer_date
ON orders(customer_id, order_date DESC);
-- Partial index (exclude inactive records)
CREATE INDEX idx_active_orders
ON orders(order_date)
WHERE status = 'active';
-- Covering index (includes extra columns for index-only scans)
CREATE INDEX idx_orders_covering
ON orders(customer_id, order_date)
INCLUDE (total_amount);

2. Hash Index (Copy-Paste)

-- Fast equality lookups
-- Fast equality lookups
CREATE INDEX idx_users_id ON users USING HASH (user_id);
-- Hash on UUID (common use case)
CREATE INDEX idx_session_uuid ON sessions USING HASH (session_uuid);
-- Multi-column hash (not recommended, use separate indexes instead)
-- ❌ DON'T: CREATE INDEX idx_hash_multi ON table_name USING HASH (col1, col2);
-- ✓ DO: CREATE INDEX idx_col1 ON table_name USING HASH (col1);
-- CREATE INDEX idx_col2 ON table_name USING HASH (col2);

3. GiST Index for Spatial Queries (Copy-Paste)

-- PostGIS geometry
CREATE INDEX idx_locations_geo ON stores USING GIST(location);
-- Query: Find stores within a radius
SELECT * FROM stores
WHERE ST_DWithin(
location,
ST_GeomFromText('POINT(-74 40)', 4326),
5000 -- 5km radius (geography type; plain geometry uses SRID units)
);
-- Network address search
CREATE INDEX idx_ip_network ON access_logs USING GIST(ip_address inet_ops);
SELECT * FROM access_logs WHERE ip_address << '192.168.1.0/24';
-- Geometric overlap
CREATE INDEX idx_zones_geom ON zones USING GIST(geom);
SELECT * FROM zones WHERE geom && other_zone_geom;

4. GIN Index for Full-Text Search (Copy-Paste)

-- Full-text search index (document is a tsvector column; for plain text,
-- index an expression: USING GIN(to_tsvector('english', body)))
CREATE INDEX idx_documents_fts ON documents USING GIN(document);
-- Query full-text search
SELECT * FROM documents
WHERE document @@ to_tsquery('database & search')
ORDER BY ts_rank(document, to_tsquery('database & search')) DESC
LIMIT 10;
-- Array contains
CREATE INDEX idx_tags_gin ON articles USING GIN(tags);
SELECT * FROM articles WHERE tags @> ARRAY['tech', 'database'];
-- JSONB search
CREATE INDEX idx_metadata_gin ON products USING GIN(metadata);
SELECT * FROM products
WHERE metadata @> '{"color":"red","size":"large"}'::jsonb;
-- JSONB key path
CREATE INDEX idx_jsonb_path ON logs USING GIN(data);
SELECT * FROM logs WHERE data ? 'user_id';

5. BRIN Index for Time-Series (Copy-Paste)

-- Time-series append-only table
CREATE TABLE events (
id BIGSERIAL,
timestamp TIMESTAMP NOT NULL,
sensor_id INT,
value FLOAT
);
-- Create BRIN on timestamp (append-only)
CREATE INDEX idx_events_time ON events USING BRIN(timestamp)
WITH (pages_per_range = 128); -- Adjust for your data size
-- Query 7 days of data
SELECT COUNT(*) FROM events
WHERE timestamp > now() - INTERVAL '7 days'
AND sensor_id = 42;
-- BRIN with custom pages_per_range
-- pages_per_range = 64 → finer granularity, slightly larger index
-- pages_per_range = 256 → coarser, smaller index, might scan more blocks
-- Default is 128

6. Composite Index (Most Important) - Copy-Paste

-- Query: SELECT * FROM orders WHERE customer_id = X AND order_date > Y
--
-- ✓ GOOD: Composite index
CREATE INDEX idx_orders_composite
ON orders(customer_id, order_date);
-- ❌ BAD: Separate indexes (planner picks one, other is wasted)
-- CREATE INDEX idx_customer ON orders(customer_id);
-- CREATE INDEX idx_date ON orders(order_date);
-- Advanced: Covering index (PostgreSQL 11+)
CREATE INDEX idx_orders_covering
ON orders(customer_id, order_date)
INCLUDE (total_amount, status); -- Non-key columns for index-only scans
-- Query can be answered entirely from index (index-only scan)
EXPLAIN SELECT customer_id, order_date, total_amount, status
FROM orders WHERE customer_id = 42 AND order_date > '2024-01-01';

7. Partial Index (Copy-Paste)

-- Only index active records (saves 70% space)
CREATE INDEX idx_active_orders
ON orders(order_date)
WHERE status = 'active';
-- Conditional unique constraint
CREATE UNIQUE INDEX idx_unique_active_email
ON users(email)
WHERE status = 'active';
-- Combination with composite (index predicates must be immutable,
-- so use a literal date and recreate the index periodically)
CREATE INDEX idx_recent_orders
ON orders(customer_id, order_date)
WHERE order_date > DATE '2024-01-01';
-- Performance gain: Index is 70% smaller, faster to search
EXPLAIN SELECT * FROM orders
WHERE customer_id = 42 AND order_date > '2024-01-01' AND status = 'active';

8. Descending Index (Copy-Paste)

-- For ORDER BY DESC queries (PostgreSQL 10+)
CREATE INDEX idx_orders_desc
ON orders(order_date DESC);
-- OR combine ascending/descending
CREATE INDEX idx_orders_mixed
ON orders(customer_id ASC, order_date DESC);
-- Enables index-only scan for: ORDER BY order_date DESC
EXPLAIN SELECT * FROM orders
WHERE customer_id = 42
ORDER BY order_date DESC
LIMIT 10;

9. MVCC-Aware Index (HeliosDB Custom) - Copy-Paste

-- Custom B+Tree for MVCC workloads
-- HeliosDB automatically creates this for high-concurrency tables
-- But you can force it:
-- For OLTP systems (TPC-C benchmark)
CREATE INDEX idx_mvcc_optimized
ON hot_table USING btree (lookup_key) -- Uses Custom B+Tree in HeliosDB
WHERE snapshot_version_id IS NOT NULL;
-- The Custom B+Tree provides:
-- - 5,000+ concurrent ops/sec (vs 2,000 for standard)
-- - Phantom read prevention
-- - Predicate locks for range queries
-- - MVCC-native version pointers
-- Queries get automatic optimization:
-- SELECT * FROM hot_table WHERE lookup_key = X;
-- Result: 100-150ns (Custom B+Tree), vs 300ns (standard B-Tree)

10. Index Maintenance Examples (Copy-Paste)

-- Rebuild index (if fragmented)
REINDEX INDEX idx_name;
-- Analyze index statistics (for planner)
ANALYZE table_name;
-- Drop and recreate (fastest way to rebuild)
DROP INDEX idx_name;
CREATE INDEX idx_name ON table(column);
-- Check index size
SELECT pg_size_pretty(pg_relation_size('idx_name')) as size;
-- Check all indexes on a table
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'table_name'
ORDER BY indexname;
-- Disable index (HeliosDB extension — not standard PostgreSQL; testing only)
ALTER INDEX idx_name UNUSABLE;
-- Re-enable index
ALTER INDEX idx_name USABLE;
-- Set index options (GIN pending list)
ALTER INDEX idx_name SET (gin_pending_list_limit = 4096);
-- Concurrent build (doesn't block writes)
CREATE INDEX CONCURRENTLY idx_name ON table(column);
-- Drop index concurrently
DROP INDEX CONCURRENTLY idx_name;

MVCC & Transaction Interaction

How Indexes Work with MVCC

In HeliosDB, indexes don’t store row versions directly. Instead, they store version pointers that reference MVCC version chains:

Index Entry: [key] → [version_chain_ptr]
Version Chain: [Latest V4] → [V3] → [V2] → [V1] → [Old]
Transaction snapshots filter visible versions

Key Behaviors

1. Phantom Prevention

Problem: Without index MVCC, concurrent transactions can miss rows:

-- Transaction A (isolation level: SERIALIZABLE)
BEGIN;
SELECT COUNT(*) FROM orders WHERE customer_id = 42; -- Returns 5
-- Meanwhile, Transaction B inserts a matching row
INSERT INTO orders VALUES (..., customer_id = 42);
COMMIT;
-- Without index MVCC, Transaction A would now see 6 rows (phantom read!)
SELECT * FROM orders WHERE customer_id = 42;

Solution: HeliosDB uses predicate locks on indexes:

Index lock on key-range [42, 42] prevents new inserts
Transaction A: Reads [42] → gets lock
Transaction B: Tries to insert → blocked until A commits
Phantom prevention: GUARANTEED

2. Index-Only Scans with MVCC

Some queries don’t need to fetch the table - the index has all data:

-- Covering index: all needed columns are in index
CREATE INDEX idx_covering
ON orders(customer_id)
INCLUDE (total_amount);
-- This query uses INDEX-ONLY SCAN (no table access)
SELECT customer_id, total_amount FROM orders
WHERE customer_id = 42;
-- Performance: 10x faster (no table I/O)
-- MVCC: Index already has version info, visibility check is instant

3. Visible vs. Deleted Rows

Indexes still point to deleted rows (MVCC marks them deleted, doesn’t remove):

-- Before delete: 1M rows indexed
-- After delete: Index still has 1M entries
-- Visible rows: ~500K (if 50% were deleted)
-- Index efficiency drops → triggers autovacuum
-- Autovacuum:
-- 1. Freezes old versions (not needed anymore)
-- 2. Removes deleted index entries
-- 3. Compacts index
-- Result: Index shrinks back to ~500K entries

4. Transaction Isolation & Index Visibility

-- Dirty reads protection
Transaction A: SELECT ... (gets snapshot isolation)
├─ Can't see uncommitted changes from B
└─ Index filters via snapshot timestamp
Transaction B: INSERT new_rows
├─ Uncommitted, not visible
└─ No index entries for other snapshots
B: COMMIT
└─ Index entries now visible to future snapshots
A: SELECT ... (still sees old version)
└─ Original snapshot, unchanged

5. Index on Versioned Tables

-- MVCC row versioning
CREATE TABLE versioned_data (
id INT,
version INT,
data TEXT,
ts_begin TIMESTAMP,
ts_end TIMESTAMP
);
-- Index for fast point lookup within time range
CREATE INDEX idx_versioned
ON versioned_data(id, ts_begin)
WHERE ts_end IS NULL; -- Only index "current" versions
-- Snapshot-aware query
SELECT data FROM versioned_data
WHERE id = 42 AND ts_begin <= now() AND (ts_end IS NULL OR ts_end > now());
-- Performance: Index jumps to id=42, then checks timestamp validity
-- MVCC integration: Timestamp filtering happens in index access

Health Checking & Monitoring

Check Index Health

-- 1. Index bloat (B-Tree only; requires the pgstattuple extension)
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT indexrelname,
(pgstatindex(indexrelid)).avg_leaf_density AS leaf_density
FROM pg_stat_user_indexes
WHERE idx_scan > 0
ORDER BY leaf_density;
-- Leaf density below ~70% suggests bloat; consider REINDEX
-- 2. Unused indexes (wasting space, slowing inserts)
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexrelname NOT LIKE 'pg_toast%'
ORDER BY pg_relation_size(indexrelid) DESC;
-- Recommend: DROP INDEX CONCURRENTLY idx_unused;
-- 3. Index size
SELECT indexname, pg_size_pretty(pg_relation_size(indexrelid)) as size
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;
-- 4. Index efficiency (reads vs. scans)
SELECT schemaname, tablename, indexname,
idx_scan as reads,
idx_tup_read as tuples_read,
idx_tup_fetch as tuples_fetched,
ROUND(100.0 * idx_tup_fetch / NULLIF(idx_tup_read, 0), 2) as efficiency_pct
FROM pg_stat_user_indexes
WHERE idx_scan > 100 -- At least 100 uses
ORDER BY efficiency_pct;

Monitor Index Operations

-- Real-time index usage
SELECT * FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY idx_scan DESC;
-- Watch index growth during bulk insert
SELECT schemaname, indexname,
pg_size_pretty(pg_relation_size(indexrelid)) as current_size,
idx_scan as accesses
FROM pg_stat_user_indexes
WHERE tablename = 'big_table'
ORDER BY pg_relation_size(indexrelid) DESC;
-- Vacuum impact on index size: capture the size before and after
SELECT pg_size_pretty(pg_relation_size('idx_name')) AS size_before;
VACUUM big_table;
SELECT pg_size_pretty(pg_relation_size('idx_name')) AS size_after;

Troubleshooting Common Issues

Problem: Index Not Used by Query Planner

-- Check: Is sequential scan cheaper?
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM big_table WHERE indexed_col = value;
-- If planning uses SeqScan instead of IndexScan:
-- Reason 1: Index is too new (stats not updated)
ANALYZE big_table;
-- Reason 2: Table is small (sequential scan faster)
-- Solution: Add more data, or reduce random_page_cost
-- Reason 3: Column is not selective enough (most rows match)
SELECT (SELECT COUNT(*) FROM big_table WHERE indexed_col = value)::FLOAT
/ (SELECT COUNT(*) FROM big_table) AS selectivity;
-- Selectivity near 1.0 → SeqScan wins; near 0 → index wins

Problem: Index Size Growing Uncontrollably

-- Cause: Lots of deletes, index not cleaned up
SELECT pg_relation_size('idx_name') as current_size;
-- Fix: Vacuum + analyze
VACUUM ANALYZE big_table;
-- Check reduction
SELECT pg_relation_size('idx_name') as size_after_vacuum;
-- Extreme case: Rebuild from scratch
REINDEX INDEX CONCURRENTLY idx_name;

Problem: Insert Performance Degrading

-- Check: How many indexes?
SELECT COUNT(*) FROM pg_indexes WHERE tablename = 'target_table';
-- Each index adds overhead to inserts
-- Solution: Drop unused indexes
DROP INDEX CONCURRENTLY idx_unused;
-- For bulk inserts (>10M rows):
-- 1. Drop all non-critical indexes
-- 2. Insert data
-- 3. Recreate indexes CONCURRENTLY

Common Mistakes

Mistake 1: Creating Too Many Indexes

-- ❌ WRONG: 8 indexes for one table
CREATE INDEX idx_col1 ON table(col1);
CREATE INDEX idx_col2 ON table(col2);
CREATE INDEX idx_col3 ON table(col3);
CREATE INDEX idx_col4 ON table(col4);
-- ... plus more ...
-- Impact: Each insert is 8x slower!
-- ✓ RIGHT: Thoughtful index design
-- Step 1: Analyze slow queries
EXPLAIN ANALYZE SELECT * FROM table WHERE col1 = X AND col2 = Y;
-- Step 2: Create one composite index
CREATE INDEX idx_composite ON table(col1, col2);
-- Step 3: Verify it's used
EXPLAIN SELECT * FROM table WHERE col1 = X AND col2 = Y;

Mistake 2: Indexing Low-Selectivity Columns

-- ❌ WRONG: Binary column (only 2 values)
CREATE INDEX idx_is_active ON users(is_active);
-- Every value appears in 50% of rows → index is worthless
EXPLAIN SELECT * FROM users WHERE is_active = true;
-- Result: SeqScan (faster than index!)
-- ✓ RIGHT: Use partial index instead
CREATE INDEX idx_active_users ON users(user_id)
WHERE is_active = true;
-- Only indexes 50% of rows → much smaller, actually helps!
-- Or use predicate in application
-- SELECT * FROM users WHERE is_active = true ORDER BY created_at DESC;
-- → Planner can choose index on created_at, filter by is_active

Mistake 3: Hash Index for Multi-Column

-- ❌ WRONG: Hash on multiple columns
CREATE INDEX idx_hash_multi ON table_name USING HASH (col1, col2);
-- Hash combines columns into a single hash → can't use col1 alone
EXPLAIN SELECT * FROM table_name WHERE col1 = X; -- Can't use this index!
-- ✓ RIGHT: Use separate indexes for multi-column
CREATE INDEX idx_col1 ON table_name USING HASH (col1);
CREATE INDEX idx_col2 ON table_name USING HASH (col2);
-- OR composite B-Tree (if you need both together)
CREATE INDEX idx_composite ON table_name(col1, col2);

Mistake 4: GIN on High-Insert Workloads

-- ❌ WRONG: GIN without considering insert impact
CREATE INDEX idx_tags_gin ON articles USING GIN(tags);
-- Bulk insert 10M articles: 10x slower due to GIN!
COPY articles FROM stdin; -- Painfully slow
-- ✓ RIGHT: Disable during bulk load
DROP INDEX idx_tags_gin;
COPY articles FROM stdin; -- 10x faster
CREATE INDEX idx_tags_gin ON articles USING GIN(tags); -- Rebuild after

Mistake 5: BRIN on Non-Monotonic Data

-- ❌ WRONG: BRIN on random insertion order
CREATE TABLE events (
id BIGSERIAL,
timestamp TIMESTAMP,
random_data BYTEA
);
-- Insert in random order (batch load from multiple sources)
INSERT INTO events SELECT ... ORDER BY RANDOM();
-- Create BRIN on timestamp
CREATE INDEX idx_time_brin ON events USING BRIN(timestamp);
-- Problem: Data not ordered by timestamp → BRIN useless!
EXPLAIN SELECT * FROM events WHERE timestamp > now() - INTERVAL '1 day';
-- BRIN might scan 80% of the table (false positives)
-- ✓ RIGHT: BRIN only for append-only (time-ordered) data
-- Insert in timestamp order
INSERT INTO events SELECT ... ORDER BY timestamp; -- ← Key!
CREATE INDEX idx_time_brin ON events USING BRIN(timestamp);
-- Now BRIN scans only relevant blocks

Mistake 6: Forgetting ANALYZE After Index Creation

-- ❌ WRONG: Create index, immediately query
CREATE INDEX idx_new ON big_table(col);
-- Planner sees the index, but its cost estimates use stale statistics
EXPLAIN SELECT * FROM big_table WHERE col = X;
-- Result: May still choose SeqScan!
-- ✓ RIGHT: Always ANALYZE after indexing
CREATE INDEX idx_new ON big_table(col);
ANALYZE big_table;
-- Now the planner has fresh statistics
EXPLAIN SELECT * FROM big_table WHERE col = X;
-- Result: Uses IndexScan

Mistake 7: Not Monitoring Index Bloat

-- ❌ WRONG: Years of deletes without cleanup
SELECT count(*) FROM table; -- 500M rows
-- (Years pass, lots of deletes)
SELECT count(*) FROM table; -- 250M rows (but index still 500M!)
SELECT pg_size_pretty(pg_relation_size('idx_name'));
-- Returns: 50GB (should be 25GB after deletes)
-- ✓ RIGHT: Regular vacuum + monitoring
-- Let autovacuum run (or manual VACUUM ANALYZE weekly)
VACUUM ANALYZE table_name;
-- Monitor index bloat
SELECT
schemaname, tablename, indexname,
pg_size_pretty(pg_relation_size(indexrelid)) as size
FROM pg_stat_user_indexes
WHERE tablename = 'table_name'
ORDER BY pg_relation_size(indexrelid) DESC;
-- If index bloat exceeds 30%: REINDEX
REINDEX INDEX CONCURRENTLY idx_bloated;

Optimization Strategies

1. Composite Index Ordering

Order matters! Lead with equality filters, then ranges:

-- Query pattern
SELECT * FROM orders
WHERE customer_id = X AND order_date > Y AND status = Z;
-- ✓ BEST: Equality columns first, range last
CREATE INDEX idx_optimal ON orders(
customer_id, -- Equality (narrows quickly)
status, -- Equality
order_date -- Range last (scans the remaining span)
);
-- ❌ WRONG: Range first
CREATE INDEX idx_bad ON orders(
order_date, -- ← Range first wastes potential
customer_id, -- ← Can't use efficiently
status
);
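The ordering effect can be checked with a runnable sketch, using Python's `sqlite3` as a stand-in engine (not HeliosDB; table and index names are illustrative). `EXPLAIN QUERY PLAN` confirms the composite index serves both the equality and the range condition:

```python
import sqlite3

# Stand-in sketch: SQLite instead of HeliosDB; names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, order_date TEXT, status TEXT)")
conn.execute("CREATE INDEX idx_optimal ON orders(customer_id, order_date, status)")

# The plan should show the index serving both the equality and range predicates.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM orders "
    "WHERE customer_id = 42 AND order_date > '2024-01-01'"
).fetchall()
for row in plan:
    print(row[-1])
```

Run the same check against your real queries: if the plan shows only the leading column being used, the column order is wrong for that query.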

2. Covering Indexes (Index-Only Scans)

If all query columns are in the index, it never touches the table:

-- Without INCLUDE: Index has (col1, col2), query needs (col1, col2, col3)
CREATE INDEX idx_old ON table(col1, col2);
SELECT col1, col2, col3 FROM table WHERE col1 = X; -- Must fetch table
-- With INCLUDE: (col1, col2) for searching + (col3) for data
CREATE INDEX idx_new ON table(col1, col2) INCLUDE (col3);
SELECT col1, col2, col3 FROM table WHERE col1 = X; -- Index-only scan!
-- Typical gain: 5-10x (no heap access needed)
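A covering scan can be observed directly with `sqlite3` as a stand-in (illustrative names). SQLite has no `INCLUDE` clause, so the payload column is simply appended as a trailing key column; the read-side effect is the same, and the plan reports a covering index when the index alone answers the query:

```python
import sqlite3

# Stand-in sketch: col3 is appended as a trailing key column
# (SQLite's equivalent of an INCLUDE payload column).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER, col2 INTEGER, col3 INTEGER)")
conn.execute("CREATE INDEX idx_cover ON t(col1, col2, col3)")

rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT col1, col2, col3 FROM t WHERE col1 = 1"
).fetchall()
print(rows[0][-1])  # plan line mentions a COVERING INDEX: no table access
```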

3. Partial Indexes

Index only the rows you query:

-- Without partial: index covers all 1B rows (500M per year × 2 years)
CREATE INDEX idx_all ON events(timestamp);
-- With partial: index only the most recent year (500M rows)
-- Note: the predicate must be immutable, so use a literal cutoff date
-- (and recreate the index periodically), not CURRENT_DATE
CREATE INDEX idx_recent ON events(timestamp)
WHERE timestamp > DATE '2024-01-01';
-- Space saved: 50%
-- Speed: faster scans — the smaller index stays hot in cache
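Partial indexes can be tried hands-on with `sqlite3` as a stand-in (illustrative names). SQLite also supports a `WHERE` clause on `CREATE INDEX`, and its planner uses the partial index when the query repeats the index predicate verbatim:

```python
import sqlite3

# Stand-in sketch: a partial index over only the 'active' rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts TEXT, status TEXT)")
conn.execute("CREATE INDEX idx_active ON events(ts) WHERE status = 'active'")

# The query's WHERE clause repeats the index predicate, so the
# planner can prove the partial index is usable for the range scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM events WHERE status = 'active' AND ts > '2024-06-01'"
).fetchall()
for row in plan:
    print(row[-1])
```

If the query predicate does not imply the index predicate, the planner silently falls back to a full scan, so always verify with the plan output.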

4. Filter Selectivity Testing

-- Before creating index, check selectivity
SELECT
    col,
    COUNT(*) AS occurrences,
    ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM table), 2) AS pct
FROM table
GROUP BY col
ORDER BY pct DESC
LIMIT 10;
-- If top values >20% of table: Partial index better
-- If all values <5%: Full index is best
-- If 1 value >50%: Don't index (sequential scan better)
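The same check can be run on a data sample in plain Python before touching the database. `top_value_share` is a hypothetical helper (not a HeliosDB API) mirroring the SQL selectivity query above:

```python
from collections import Counter

# Hypothetical helper: share of the most common value in a sample.
def top_value_share(values):
    top, count = Counter(values).most_common(1)[0]
    return top, count / len(values)

sample = ["active"] * 70 + ["pending"] * 20 + ["failed"] * 10
value, share = top_value_share(sample)
print(value, share)  # a value covering 70% of rows suggests a partial index
```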

5. Multi-Column Index Selection

-- Query pattern: WHERE (a, b) = (X, Y) AND c > Z
SELECT * FROM table WHERE a = X AND b = Y AND c > Z;
-- Decision tree:
-- If a is highly selective (~5% of rows): lead with a
--   CREATE INDEX idx (a, b, c) → narrowed by a and b, then range on c
-- If a is weakly selective (~50%): lead with the most selective column instead
--   CREATE INDEX idx (b, a, c) → better early filtering
-- Either way, keep the range column (c) last
-- Test with EXPLAIN:
EXPLAIN SELECT * FROM table WHERE a = X AND b = Y AND c > Z;
-- Check if "Index Cond" uses the right columns

6. MVCC Optimization: Predicate Locks

For serializable transactions, predicate locks prevent phantoms:

-- High-contention range query
SELECT COUNT(*) FROM orders WHERE customer_id BETWEEN 100 AND 200;
-- HeliosDB Custom B+Tree automatically:
-- 1. Places predicate lock on range [100, 200]
-- 2. Prevents new insertions in that range
-- 3. Guarantees no phantom reads
-- No special syntax needed - automatic with:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT COUNT(*) FROM orders WHERE customer_id BETWEEN 100 AND 200;
-- ← Automatically protected by predicate lock on index

7. Benchmark Before/After

Always measure:

-- Before optimization
SELECT * FROM pg_stat_user_indexes
WHERE relname = 'target';
-- Note: idx_scan, idx_tup_read, idx_tup_fetch
-- (Create new index)
-- Run query 1000 times to warm up
-- After optimization
SELECT * FROM pg_stat_user_indexes
WHERE relname = 'target'
ORDER BY idx_scan DESC;
-- Calculate the relative change:
-- (new idx_tup_fetch - old idx_tup_fetch) / old idx_tup_fetch
-- A positive value means more rows are now served through the index
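For a quick wall-clock before/after check, a `sqlite3` stand-in illustrates the measurement loop (names are illustrative; absolute timings depend on the machine, only the relative change matters):

```python
import sqlite3
import time

# Stand-in sketch: time the same point query before and after indexing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (k INTEGER, v TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", ((i, "x") for i in range(100_000)))

def timed(query):
    start = time.perf_counter()
    conn.execute(query).fetchall()
    return time.perf_counter() - start

before = timed("SELECT v FROM t WHERE k = 54321")  # full scan
conn.execute("CREATE INDEX idx_k ON t(k)")
after = timed("SELECT v FROM t WHERE k = 54321")   # index lookup
print(f"seq scan {before * 1e3:.2f} ms, index {after * 1e3:.3f} ms")
```

In production, prefer the cumulative `pg_stat_user_indexes` counters above over one-off timings: they average out cache effects and concurrent load.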

Quick Reference: Index Selection Checklist

Use this checklist to choose the right index:

[ ] Is this a new table? → Start with B-Tree
[ ] Append-only, >1B rows? → Use BRIN
[ ] Only equality queries? → Use Hash
[ ] Full-text search? → Use GIN
[ ] Spatial/geometry? → Use GiST
[ ] MVCC critical? → Use Custom B+Tree
[ ] Covering index possible? → Add INCLUDE columns
[ ] Can use partial index? → Add WHERE clause
[ ] Multi-column query? → Use composite index
[ ] Bulk insert upcoming? → Drop indexes, reload, rebuild
[ ] Index not used? → ANALYZE or DROP
[ ] Index growing too large? → VACUUM or REINDEX
[ ] Need phantom prevention? → Enable SERIALIZABLE isolation

Summary

| When | Use | Benefit |
| --- | --- | --- |
| General queries | B-Tree | Works for 95% of cases |
| Exact match only | Hash | 50% faster than B-Tree |
| Spatial queries | GiST | Native PostGIS support |
| Full-text search | GIN | 100x faster than seq scan |
| Append-only >1B rows | BRIN | 99.8% space savings |
| OLTP high-concurrency | Custom B+Tree | 5,000+ ops/sec |

Key takeaway: The right index depends on your query pattern. Measure first, optimize second, avoid over-indexing. Start with B-Tree, benchmark, then specialize.


See Also