Index MVCC Version Tracking - Design Complete
Status: Design Complete - Ready for Week 8 Implementation
Created: November 28, 2025
Production Blocker: #5
Phase: Week 8-15 (Months 2-4)
Quick Summary
Problem: Indexes don't track MVCC versions, so index scans can return phantom rows and violate snapshot isolation
Solution: Version-aware index entries + predicate locks + GC coordination
Deliverables:
- Complete Specification (25K tokens): `/docs/planning/BLOCKER5_INDEX_MVCC_VERSION_TRACKING_SPECIFICATION.md`
- 4 Rust Module Templates (~3,350 LOC) in `/heliosdb-storage/src/index/`:
  - `index_version_entry.rs` (400 LOC)
  - `snapshot_index_scan.rs` (600 LOC)
  - `index_predicate_lock.rs` (500 LOC)
  - `index_garbage_collector.rs` (400 LOC)
Implementation Roadmap
| Week | Task | Engineers | Hours | Cost |
|---|---|---|---|---|
| Week 8 | Versioned Entry Design | 1 Senior | 40 | $4,800 |
| Week 9-10 | Snapshot Scans | 1 Senior | 80 | $9,600 |
| Week 11-12 | Predicate Locks | 1 Senior | 80 | $9,600 |
| Week 13-14 | GC Integration | 1 Senior + 1 Mid | 80 | $8,800 |
| Week 15 | SSI Integration | 2 Senior | 40 | $9,600 |
| Total | 8 weeks | 2-3 | 320 | $42,400 |
Buffer (20%): $8,500
Total Investment: $51,000
Architecture Overview
Index MVCC Manager
- Coordinate version tracking across all index types
- Manage predicate locks for range queries
- Integrate with SSI for conflict detection
- Coordinate GC with MVCC Store

The manager drives three components, each of which sits on top of the MVCC Version Store:
- Version Entry (16-32 bytes)
- Snapshot Scanner
- Predicate Lock Mgr

MVCC Version Store
- Manages version chains
- Garbage collection
- Snapshot management

Key Design Decisions
1. Version Pointers (Not Inline Versions)
- Choice: Store 8-byte pointer to external version chain
- Alternative: Inline version chains in index nodes
- Rationale:
- Memory efficiency: 8 bytes vs. hundreds of bytes
- Clean separation: Index handles ordering, MVCC handles versions
- Shared chains: Multiple indexes can reference same version chain
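A minimal sketch of what a pointer-based entry could look like, to make the size argument concrete. The field names, the `VersionChainPtr` newtype, and the "0 means live" encoding are assumptions for illustration and are not taken from the specification; only the 8-byte pointer and the 32-byte entry size come from this document.

```rust
use std::mem::size_of;

/// Hypothetical 8-byte handle into the external MVCC version store.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct VersionChainPtr(pub u64);

/// Hypothetical fixed-size index entry. The index keeps ordering data only;
/// the version history lives behind `chain`.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct VersionedIndexEntry {
    /// Fixed-width key prefix used for ordering inside a leaf (assumption).
    pub key_prefix: u64,
    /// Pointer to the version chain, shareable across indexes on the same row.
    pub chain: VersionChainPtr,
    /// Transaction that created the entry (assumed field).
    pub created_by: u64,
    /// Transaction that deleted the entry; 0 means "still live" (assumed encoding).
    pub deleted_by: u64,
}

impl VersionedIndexEntry {
    pub fn new(key_prefix: u64, chain: VersionChainPtr, created_by: u64) -> Self {
        Self { key_prefix, chain, created_by, deleted_by: 0 }
    }

    /// True while no transaction has marked the entry deleted.
    pub fn is_live(&self) -> bool {
        self.deleted_by == 0
    }
}

/// Size of one entry in bytes: four 8-byte fields.
pub fn entry_size() -> usize {
    size_of::<VersionedIndexEntry>()
}
```

With this layout a primary and a secondary index entry for the same row can carry the same `chain` value, so the version history is stored once regardless of how many indexes reference it.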
2. Predicate Locks for Phantom Prevention
- Choice: Interval tree with range-based locks
- Alternative: Next-key locking (as in MySQL InnoDB)
- Rationale:
- More flexible: Supports arbitrary ranges
- Lower overhead: No false conflicts on adjacent keys
- SSI-compatible: Natural fit with serializable snapshot isolation
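The range-lock idea can be sketched as follows. This is a simplified stand-in, assuming `u64` keys and inclusive ranges: it uses a linear scan over a `Vec` rather than the interval tree chosen above, and `PredicateLockTable` and its methods are hypothetical names.

```rust
/// Inclusive key range read by a transaction (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PredicateLock {
    pub txn_id: u64,
    pub lo: u64,
    pub hi: u64,
}

/// Linear-scan stand-in for the interval tree; same semantics, O(n) lookup.
#[derive(Default)]
pub struct PredicateLockTable {
    locks: Vec<PredicateLock>,
}

impl PredicateLockTable {
    /// Record that `txn_id` has read every key in `[lo, hi]`.
    pub fn register(&mut self, txn_id: u64, lo: u64, hi: u64) {
        assert!(lo <= hi, "range must not be inverted");
        self.locks.push(PredicateLock { txn_id, lo, hi });
    }

    /// Transactions (other than the writer) whose locked range covers `key`.
    pub fn conflicting_readers(&self, writer: u64, key: u64) -> Vec<u64> {
        self.locks
            .iter()
            .filter(|l| l.txn_id != writer && l.lo <= key && key <= l.hi)
            .map(|l| l.txn_id)
            .collect()
    }

    /// Drop all locks held by a finished transaction.
    pub fn release(&mut self, txn_id: u64) {
        self.locks.retain(|l| l.txn_id != txn_id);
    }
}
```

An insert at key 21 does not conflict with a reader of `[10, 20]`, which is the "no false conflicts on adjacent keys" property listed in the rationale.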
3. Two-Phase GC Coordination
- Choice: Index GC → MVCC reclaim → Index cleanup
- Alternative: MVCC-only GC with weak references
- Rationale:
- Safety: No dangling pointers
- Correctness: Guaranteed consistency
- Metrics: Full visibility into GC effectiveness
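One possible reading of the three-step ordering is sketched below: the index first hides dead entries, the version store then reclaims their chains, and the index finally removes the hidden entries. The toy `Index` and `VersionStore` types, the tombstone set, and the watermark parameter are assumptions made for illustration; the actual protocol is defined in the specification.

```rust
use std::collections::{BTreeMap, HashMap, HashSet};

/// Toy version store: chain pointer -> timestamp at which the row was deleted
/// (`None` means the row is still live).
pub struct VersionStore {
    pub chains: HashMap<u64, Option<u64>>,
}

/// Toy index: key -> chain pointer, plus keys hidden from lookups pending removal.
pub struct Index {
    pub entries: BTreeMap<u64, u64>,
    pub tombstoned: HashSet<u64>,
}

impl Index {
    /// Lookups skip tombstoned keys, so no reader follows a pointer that is about to be reclaimed.
    pub fn lookup(&self, key: u64) -> Option<u64> {
        if self.tombstoned.contains(&key) {
            None
        } else {
            self.entries.get(&key).copied()
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub struct GcStats {
    pub marked: usize,
    pub reclaimed: usize,
    pub removed: usize,
}

/// `watermark` is the oldest snapshot timestamp still in use (assumed input).
pub fn run_gc(index: &mut Index, store: &mut VersionStore, watermark: u64) -> GcStats {
    // Step 1 (index GC): hide entries whose rows were deleted before the watermark.
    for (key, ptr) in &index.entries {
        if matches!(store.chains.get(ptr), Some(Some(ts)) if *ts < watermark) {
            index.tombstoned.insert(*key);
        }
    }
    let marked = index.tombstoned.len();

    // Step 2 (MVCC reclaim): free the chains behind the hidden entries.
    let mut reclaimed = 0;
    for key in &index.tombstoned {
        if let Some(ptr) = index.entries.get(key) {
            if store.chains.remove(ptr).is_some() {
                reclaimed += 1;
            }
        }
    }

    // Step 3 (index cleanup): physically remove the hidden entries.
    let before = index.entries.len();
    let tombstoned = std::mem::take(&mut index.tombstoned);
    index.entries.retain(|key, _| !tombstoned.contains(key));

    GcStats { marked, reclaimed, removed: before - index.entries.len() }
}
```

The returned `GcStats` corresponds to the "Metrics" rationale: each step reports how many entries it touched, so a mismatch between the counters would indicate a coordination bug.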
Performance Targets
| Metric | Without Versioning | With Versioning | Overhead |
|---|---|---|---|
| Point Lookup | 150ns | 180ns | +20% |
| Range Scan (1K) | 100 µs | 120 µs | +20% |
| Insert | 1 µs | 1.2 µs | +20% |
| Phantom Prevention | 0% (broken) | 100% | From broken to fully prevented |
| SSI Correctness | Violated | Guaranteed | Critical fix |
Target: <25% overhead for 100% correctness
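The overhead column can be checked against the latency columns with a small helper. This is only an arithmetic sketch; `overhead_percent` and `within_budget` are hypothetical names, and latencies are expressed in whole nanoseconds to avoid floating-point rounding.

```rust
/// Relative overhead of the versioned path in whole percent (rounded down).
pub fn overhead_percent(baseline_ns: u64, versioned_ns: u64) -> u64 {
    assert!(baseline_ns > 0, "baseline latency must be non-zero");
    versioned_ns.saturating_sub(baseline_ns) * 100 / baseline_ns
}

/// True when the overhead stays strictly below the budget (25 for the target above).
pub fn within_budget(baseline_ns: u64, versioned_ns: u64, budget_percent: u64) -> bool {
    overhead_percent(baseline_ns, versioned_ns) < budget_percent
}
```

A check like this could gate the microbenchmarks: for the point lookup row, `within_budget(150, 180, 25)` holds, while a regression to 190 ns would exceed the budget.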
Testing Strategy
Correctness Tests
- Unit tests for all modules (1,450 LOC)
- Phantom read prevention tests
- SSI serializability tests
- GC coordination tests
Performance Tests
- Microbenchmarks (point lookup, range scan)
- TPC-C workload (phantom-prone)
- Concurrent stress tests
Integration Tests
- MVCC store integration
- Custom B+Tree integration (Blocker #1)
- SSI integration (Blocker #3)
Integration Points
1. MVCC Store (/heliosdb-multi-model/src/mvcc.rs)
Extensions Required:
- `allocate_version_for_index()` - Allocate version chain pointer
- `get_version_chain()` - Retrieve version chain by pointer
- `scan_inserted_after()` - Check for phantoms in range
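The three extensions could be grouped behind a trait along these lines. Only the method names come from the list above; the parameter lists, return types, the `Version` struct, and the in-memory implementation are assumptions for illustration and do not reflect the existing `mvcc.rs` API.

```rust
/// Hypothetical handle and version types.
pub type VersionChainPtr = u64;

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Version {
    pub commit_ts: u64,
    pub value: Vec<u8>,
}

/// Assumed shape of the MVCC store extensions needed by the index layer.
pub trait IndexVersionStore {
    /// Create a chain for `key` and return the pointer the index will store.
    fn allocate_version_for_index(&mut self, key: u64, first: Version) -> VersionChainPtr;
    /// Resolve a pointer held by an index entry.
    fn get_version_chain(&self, ptr: VersionChainPtr) -> Option<&[Version]>;
    /// Keys in `[lo, hi]` first inserted after `snapshot_ts` (phantom candidates).
    fn scan_inserted_after(&self, lo: u64, hi: u64, snapshot_ts: u64) -> Vec<u64>;
}

/// Toy implementation: the pointer is the position of the chain in a vector.
#[derive(Default)]
pub struct InMemoryStore {
    chains: Vec<(u64, Vec<Version>)>,
}

impl IndexVersionStore for InMemoryStore {
    fn allocate_version_for_index(&mut self, key: u64, first: Version) -> VersionChainPtr {
        self.chains.push((key, vec![first]));
        (self.chains.len() - 1) as VersionChainPtr
    }

    fn get_version_chain(&self, ptr: VersionChainPtr) -> Option<&[Version]> {
        self.chains.get(ptr as usize).map(|(_, versions)| versions.as_slice())
    }

    fn scan_inserted_after(&self, lo: u64, hi: u64, snapshot_ts: u64) -> Vec<u64> {
        let mut keys: Vec<u64> = self
            .chains
            .iter()
            .filter(|(key, versions)| {
                (lo..=hi).contains(key)
                    && versions.first().map_or(false, |v| v.commit_ts > snapshot_ts)
            })
            .map(|(key, _)| *key)
            .collect();
        keys.sort_unstable();
        keys
    }
}
```

A non-empty result from `scan_inserted_after` for a range the transaction has already read signals a potential phantom.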
2. Custom B+Tree (Blocker #1)
Modifications Required:
- Update `LeafNode` to use `VersionedIndexEntry`
- Replace `(key, value_ptr)` with `(key, version_chain_ptr)`
- Integrate snapshot-aware scans
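A snapshot-aware leaf scan might look like the sketch below. It assumes `u64` keys, entries sorted by key, and creation/deletion timestamps stored directly on the entry; those fields and the `visible_at` rule are illustrative assumptions, not the layout from the specification.

```rust
/// Hypothetical leaf entry carrying just enough metadata for a visibility check.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct VersionedIndexEntry {
    pub key: u64,
    pub version_chain_ptr: u64,
    pub created_ts: u64,
    pub deleted_ts: Option<u64>,
}

impl VersionedIndexEntry {
    /// Visible if created at or before the snapshot and not yet deleted at that time.
    pub fn visible_at(&self, snapshot_ts: u64) -> bool {
        self.created_ts <= snapshot_ts && self.deleted_ts.map_or(true, |d| d > snapshot_ts)
    }
}

/// Leaf node holding entries sorted by key.
pub struct LeafNode {
    pub entries: Vec<VersionedIndexEntry>,
}

impl LeafNode {
    /// Chain pointers for keys in `[lo, hi]` that are visible to the snapshot.
    pub fn snapshot_scan(&self, lo: u64, hi: u64, snapshot_ts: u64) -> Vec<u64> {
        // Binary search for the first key >= lo, then walk forward while key <= hi.
        let start = self.entries.partition_point(|e| e.key < lo);
        self.entries[start..]
            .iter()
            .take_while(|e| e.key <= hi)
            .filter(|e| e.visible_at(snapshot_ts))
            .map(|e| e.version_chain_ptr)
            .collect()
    }
}
```

Two scans over the same range with different snapshot timestamps return different pointer sets, which is the behavior an unversioned `(key, value_ptr)` leaf cannot provide.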
3. Transaction Manager (/heliosdb-multi-model/src/transaction.rs)
Extensions Required:
- Add `predicate_locks: Vec<PredicateLock>` to `Transaction`
- Add `register_predicate_lock()` method
- Extend `commit()` to validate predicate locks
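The transaction-side changes could be sketched as follows. The `predicate_locks` field and the `register_predicate_lock()` and `commit()` names come from the list above; everything else (signatures, `CommitError`, passing committed inserts as a slice) is assumed. The validation here is deliberately conservative: it rejects any insert into a locked range that committed after the snapshot, whereas full SSI aborts only when it detects a dangerous structure of read-write dependencies.

```rust
/// Inclusive key range read by the transaction (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PredicateLock {
    pub lo: u64,
    pub hi: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CommitError {
    /// Another transaction inserted `key` into a range this transaction read.
    PhantomDetected { key: u64 },
}

pub struct Transaction {
    pub snapshot_ts: u64,
    pub predicate_locks: Vec<PredicateLock>,
}

impl Transaction {
    pub fn begin(snapshot_ts: u64) -> Self {
        Self { snapshot_ts, predicate_locks: Vec::new() }
    }

    /// Remember that this transaction scanned `[lo, hi]`.
    pub fn register_predicate_lock(&mut self, lo: u64, hi: u64) {
        self.predicate_locks.push(PredicateLock { lo, hi });
    }

    /// `committed_inserts` holds `(key, commit_ts)` pairs from other transactions;
    /// in a real system this would come from the MVCC store (assumption).
    pub fn commit(self, committed_inserts: &[(u64, u64)]) -> Result<(), CommitError> {
        for &(key, commit_ts) in committed_inserts {
            let in_locked_range = self
                .predicate_locks
                .iter()
                .any(|l| l.lo <= key && key <= l.hi);
            if commit_ts > self.snapshot_ts && in_locked_range {
                return Err(CommitError::PhantomDetected { key });
            }
        }
        Ok(())
    }
}
```

Inserts that committed before the snapshot, or that fall outside every registered range, do not block the commit.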
File Locations
Specification
- Main Spec: `/home/claude/HeliosDB/docs/planning/BLOCKER5_INDEX_MVCC_VERSION_TRACKING_SPECIFICATION.md`
Implementation Modules
- Entry: `/home/claude/HeliosDB/heliosdb-storage/src/index/index_version_entry.rs`
- Scan: `/home/claude/HeliosDB/heliosdb-storage/src/index/snapshot_index_scan.rs`
- Lock: `/home/claude/HeliosDB/heliosdb-storage/src/index/index_predicate_lock.rs`
- GC: `/home/claude/HeliosDB/heliosdb-storage/src/index/index_garbage_collector.rs`
- Mod: `/home/claude/HeliosDB/heliosdb-storage/src/index/mod.rs`
Next Steps
Week 8 (Starting Point)
- Review specification with engineering team
- Set up development environment
- Begin `index_version_entry.rs` implementation
- Write unit tests for versioned entries
Weeks 9-15 (Execution)
- Implement snapshot-aware scans
- Build predicate lock manager
- Integrate GC coordination
- Full SSI integration and testing
Dependencies
- Available: MVCC Store exists (`heliosdb-multi-model/src/mvcc.rs`)
- Pending: Custom B+Tree (Blocker #1, Weeks 5-9)
- Pending: SSI Implementation (Blocker #3, Weeks 1-15)
References
Academic Papers
- Cahill et al. (2008): "Serializable isolation for snapshot databases" - SSI foundation
- Graefe & Zwilling (2004): "Transaction support for indexed views" - Versioned indexes
- Neumann, Mühlbauer & Kemper (2015): "Fast serializable multi-version concurrency control for main-memory database systems"
Implementation References
- PostgreSQL SSI: `src/backend/storage/lmgr/predicate.c`
- CockroachDB: MVCC indexes
- FoundationDB: Redwood B+Tree with MVCC
Completion Checklist
- Problem analysis and requirements
- Architecture design and ADR
- Data structure design (32-byte entries)
- Algorithm design (snapshot scans, predicate locks)
- GC coordination protocol
- Performance analysis and targets
- Testing strategy
- Integration contracts
- Rust module templates (3,350 LOC)
- Complete specification document (25K tokens)
- Engineering team review (Week 8)
- Implementation (Weeks 8-15)
- Testing and validation
- Production deployment
Status: Design phase complete. Ready for Week 8 implementation kick-off.
Owner: Engineering Team (2-3 engineers)
Timeline: 8 weeks (320 hours)
Investment: $51,000
Impact: Fixes Production Blocker #5, enables SSI, prevents phantom reads