Skip to content

Index MVCC Version Tracking - Design Complete

Index MVCC Version Tracking - Design Complete

Status: Design Complete - Ready for Week 8 Implementation Created: November 28, 2025 Production Blocker: #5 Phase: Week 8-15 (Months 2-4)


Quick Summary

Problem: Indexes don’t track MVCC versions β†’ phantom reads violate snapshot isolation

Solution: Version-aware index entries + predicate locks + GC coordination

Deliverables:

  1. Complete Specification (25K tokens) - /docs/planning/BLOCKER5_INDEX_MVCC_VERSION_TRACKING_SPECIFICATION.md
  2. 4 Rust Module Templates (~3,350 LOC) - /heliosdb-storage/src/index/
  • index_version_entry.rs (400 LOC)
  • snapshot_index_scan.rs (600 LOC)
  • index_predicate_lock.rs (500 LOC)
  • index_garbage_collector.rs (400 LOC)

πŸ“‹ Implementation Roadmap

WeekTaskEngineersHoursCost
Week 8Versioned Entry Design1 Senior40$4,800
Week 9-10Snapshot Scans1 Senior80$9,600
Week 11-12Predicate Locks1 Senior80$9,600
Week 13-14GC Integration1 Senior + 1 Mid80$8,800
Week 15SSI Integration2 Senior40$9,600
Total8 weeks2-3320$42,400

Buffer (20%): $8,500 Total Investment: $51,000


πŸ— Architecture Overview

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Index MVCC Manager β”‚
β”‚ - Coordinate version tracking across all index types β”‚
β”‚ - Manage predicate locks for range queries β”‚
β”‚ - Integrate with SSI for conflict detection β”‚
β”‚ - Coordinate GC with MVCC Store β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β–Ό β–Ό β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Version Entry β”‚ β”‚ Snapshot β”‚ β”‚ Predicate β”‚
β”‚ (16-32 bytes) β”‚ β”‚ Scanner β”‚ β”‚ Lock Mgr β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚ β”‚ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ MVCC Version Store β”‚
β”‚ - Manages version chains β”‚
β”‚ - Garbage collection β”‚
β”‚ - Snapshot management β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸ”‘ Key Design Decisions

1. Version Pointers (Not Inline Versions)

  • Choice: Store 8-byte pointer to external version chain
  • Alternative: Inline version chains in index nodes
  • Rationale:
    • Memory efficiency: 8 bytes vs. hundreds of bytes
    • Clean separation: Index handles ordering, MVCC handles versions
    • Shared chains: Multiple indexes can reference same version chain

2. Predicate Locks for Phantom Prevention

  • Choice: Interval tree with range-based locks
  • Alternative: Next-key locking (like PostgreSQL)
  • Rationale:
    • More flexible: Supports arbitrary ranges
    • Lower overhead: No false conflicts on adjacent keys
    • SSI-compatible: Natural fit with serializable snapshot isolation

3. Two-Phase GC Coordination

  • Choice: Index GC β†’ MVCC reclaim β†’ Index cleanup
  • Alternative: MVCC-only GC with weak references
  • Rationale:
    • Safety: No dangling pointers
    • Correctness: Guaranteed consistency
    • Metrics: Full visibility into GC effectiveness

Performance Targets

MetricWithout VersioningWith VersioningOverhead
Point Lookup150ns180ns+20%
Range Scan (1K)100Β΅s120Β΅s+20%
Insert1Β΅s1.2Β΅s+20%
Phantom Prevention0% (broken)100%∞ improvement
SSI CorrectnessViolatedGuaranteedCritical fix

Target: <25% overhead for 100% correctness


πŸ§ͺ Testing Strategy

Correctness Tests

  • Unit tests for all modules (1,450 LOC)
  • Phantom read prevention tests
  • SSI serializability tests
  • GC coordination tests

Performance Tests

  • Microbenchmarks (point lookup, range scan)
  • TPC-C workload (phantom-prone)
  • Concurrent stress tests

Integration Tests

  • MVCC store integration
  • Custom B+Tree integration (Blocker #1)
  • SSI integration (Blocker #3)

πŸ”— Integration Points

1. MVCC Store (/heliosdb-multi-model/src/mvcc.rs)

Extensions Required:

  • allocate_version_for_index() - Allocate version chain pointer
  • get_version_chain() - Retrieve version chain by pointer
  • scan_inserted_after() - Check for phantoms in range

2. Custom B+Tree (Blocker #1)

Modifications Required:

  • Update LeafNode to use VersionedIndexEntry
  • Replace (key, value_ptr) with (key, version_chain_ptr)
  • Integrate snapshot-aware scans

3. Transaction Manager (/heliosdb-multi-model/src/transaction.rs)

Extensions Required:

  • Add predicate_locks: Vec<PredicateLock> to Transaction
  • Add register_predicate_lock() method
  • Extend commit() to validate predicate locks

πŸ“ File Locations

Specification

  • Main Spec: /home/claude/HeliosDB/docs/planning/BLOCKER5_INDEX_MVCC_VERSION_TRACKING_SPECIFICATION.md

Implementation Modules

  • Entry: /home/claude/HeliosDB/heliosdb-storage/src/index/index_version_entry.rs
  • Scan: /home/claude/HeliosDB/heliosdb-storage/src/index/snapshot_index_scan.rs
  • Lock: /home/claude/HeliosDB/heliosdb-storage/src/index/index_predicate_lock.rs
  • GC: /home/claude/HeliosDB/heliosdb-storage/src/index/index_garbage_collector.rs
  • Mod: /home/claude/HeliosDB/heliosdb-storage/src/index/mod.rs

Next Steps

Week 8 (Starting Point)

  1. Review specification with engineering team
  2. Set up development environment
  3. Begin index_version_entry.rs implementation
  4. Write unit tests for versioned entries

Weeks 9-15 (Execution)

  1. Implement snapshot-aware scans
  2. Build predicate lock manager
  3. Integrate GC coordination
  4. Full SSI integration and testing

Dependencies

  • MVCC Store exists (heliosdb-multi-model/src/mvcc.rs)
  • ⏳ Custom B+Tree (Blocker #1, Weeks 5-9)
  • ⏳ SSI Implementation (Blocker #3, Weeks 1-15)

πŸ“– References

Academic Papers

  1. Cahill et al. (2008): β€œSerializable isolation for snapshot databases” - SSI foundation
  2. Graefe & Zwilling (2004): β€œTransaction support for indexed views” - Versioned indexes
  3. Neumann & MΓΌhlbauer (2011): β€œFast serializable MVCC for main-memory databases”

Implementation References

  • PostgreSQL SSI: src/backend/storage/lmgr/predicate.c
  • CockroachDB: MVCC indexes
  • FoundationDB: Redwood B+Tree with MVCC

Completion Checklist

  • Problem analysis and requirements
  • Architecture design and ADR
  • Data structure design (32-byte entries)
  • Algorithm design (snapshot scans, predicate locks)
  • GC coordination protocol
  • Performance analysis and targets
  • Testing strategy
  • Integration contracts
  • Rust module templates (3,350 LOC)
  • Complete specification document (25K tokens)
  • Engineering team review (Week 8)
  • Implementation (Weeks 8-15)
  • Testing and validation
  • Production deployment

Status: Design phase complete. Ready for Week 8 implementation kick-off.

Owner: Engineering Team (2-3 engineers) Timeline: 8 weeks (320 hours) Investment: $51,000 Impact: Fixes Production Blocker #5, enables SSI, prevents phantom reads