SIMD Integration Quick Start Guide
Version: 1.0
Created: November 28, 2025
Status: READY TO EXECUTE
Timeline: 30 weeks (parallel execution)
Quick Start (15 Minutes)
Prerequisites
- Rust 1.70+ with AVX2 support
- x86_64 CPU (Intel Core 4th gen+ or AMD Ryzen)
- 16GB+ RAM
- Linux/macOS (Windows WSL2 supported)
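The AVX2 prerequisite can also be checked from Rust at runtime, which is the same mechanism the existing SIMD modules are searched for in the verification step. The sketch below is a minimal illustration; the function name `detected_simd_features` is hypothetical and not part of HeliosDB.

```rust
/// Returns the SIMD features (out of the two this sketch probes)
/// that the host CPU reports at runtime.
/// `detected_simd_features` is a hypothetical helper name.
pub fn detected_simd_features() -> Vec<&'static str> {
    let mut found = Vec::new();
    // The detection macro only exists on x86 targets, so gate it.
    #[cfg(target_arch = "x86_64")]
    {
        if std::is_x86_feature_detected!("sse2") {
            found.push("sse2");
        }
        if std::is_x86_feature_detected!("avx2") {
            found.push("avx2");
        }
    }
    found
}
```

On a machine that meets the prerequisites, the returned list should contain `"avx2"`; on other architectures it is empty and SIMD code paths should fall back to scalar.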
Verification
    # Check CPU features
    lscpu | grep -i avx2
    # Should show: avx2

    # Check Rust version
    rustc --version
    # Should be 1.70 or higher

    # Verify existing SIMD infrastructure
    cd /home/claude/HeliosDB
    rg "is_x86_feature_detected" heliosdb-compute/src/
    # Should find existing SIMD modules

Week 1 Execution Plan
Day 1-2: Setup & LIKE Pattern Matching
Duration: 16 hours
Team: 2 engineers
Budget: $3.5K
Step 1: Create Module (2 hours)
    cd /home/claude/HeliosDB/heliosdb-compute/src

    # Copy template (from SIMD_EXECUTOR_RUST_MODULE_TEMPLATES.md)
    # Create simd_string_ops.rs with Module 1 template

Step 2: Implement LIKE Prefix (4 hours)
Focus on implementing these functions first:
- SimdStringOps::new() - Feature detection
- like() - Dispatcher
- like_prefix() - Prefix optimization
- like_prefix_avx2() - AVX2 implementation
- like_prefix_scalar() - Fallback
Complexity: ~200 LOC
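To make the division of labor between these five functions concrete, here is a minimal sketch. It is not the Module 1 template: the struct layout, the signatures, and the chunking strategy are assumptions chosen to match how the tests in this guide call `like()`, `has_avx2()`, and `like_prefix_scalar()`. The AVX2 path compares the prefix in 32-byte chunks and handles the remaining tail as a plain slice comparison; escape characters and non-prefix patterns are out of scope.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{__m256i, _mm256_cmpeq_epi8, _mm256_loadu_si256, _mm256_movemask_epi8};

/// Assumed layout: caches the runtime AVX2 check so dispatch is a cheap branch.
pub struct SimdStringOps {
    has_avx2: bool,
}

impl SimdStringOps {
    pub fn new() -> Self {
        #[cfg(target_arch = "x86_64")]
        let has_avx2 = std::is_x86_feature_detected!("avx2");
        #[cfg(not(target_arch = "x86_64"))]
        let has_avx2 = false;
        Self { has_avx2 }
    }

    pub fn has_avx2(&self) -> bool {
        self.has_avx2
    }

    /// Dispatcher. This sketch only recognizes `prefix%` with a literal prefix.
    pub fn like(&self, data: &[&str], pattern: &str) -> Vec<bool> {
        match pattern.strip_suffix('%') {
            Some(prefix) if !prefix.contains(|c: char| c == '%' || c == '_') => {
                self.like_prefix(data, prefix)
            }
            _ => unimplemented!("only literal-prefix patterns are sketched here"),
        }
    }

    pub fn like_prefix(&self, data: &[&str], prefix: &str) -> Vec<bool> {
        #[cfg(target_arch = "x86_64")]
        {
            if self.has_avx2 {
                // SAFETY: AVX2 support was confirmed at runtime in `new()`.
                return unsafe { Self::like_prefix_avx2(data, prefix) };
            }
        }
        self.like_prefix_scalar(data, prefix)
    }

    /// Compares the prefix in 32-byte chunks; a tail shorter than
    /// 32 bytes is compared as an ordinary slice.
    #[cfg(target_arch = "x86_64")]
    #[target_feature(enable = "avx2")]
    unsafe fn like_prefix_avx2(data: &[&str], prefix: &str) -> Vec<bool> {
        let p = prefix.as_bytes();
        let mut out = Vec::with_capacity(data.len());
        for s in data {
            let b = s.as_bytes();
            let mut ok = b.len() >= p.len();
            let mut i = 0;
            while ok && i + 32 <= p.len() {
                // SAFETY: i + 32 <= p.len() <= b.len(), so both unaligned loads are in bounds.
                let (x, y) = unsafe {
                    (
                        _mm256_loadu_si256(b.as_ptr().add(i) as *const __m256i),
                        _mm256_loadu_si256(p.as_ptr().add(i) as *const __m256i),
                    )
                };
                // movemask is -1 (all 32 bits set) only when every byte matched.
                ok = _mm256_movemask_epi8(_mm256_cmpeq_epi8(x, y)) == -1;
                i += 32;
            }
            out.push(ok && b[i..p.len()] == p[i..]);
        }
        out
    }

    /// Scalar fallback and reference semantics.
    pub fn like_prefix_scalar(&self, data: &[&str], prefix: &str) -> Vec<bool> {
        data.iter().map(|s| s.starts_with(prefix)).collect()
    }
}
```

Note that with this chunking strategy a prefix shorter than 32 bytes never enters the vector loop, so a real implementation would need a different approach (for example, batching many strings per register) to show a speedup on short patterns.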
Step 3: Unit Tests (3 hours)
    #[test]
    fn test_like_prefix_basic() {
        let ops = SimdStringOps::new();
        let data = vec!["hello", "world", "help", "h"];
        let results = ops.like(&data, "he%");
        assert_eq!(results, vec![true, false, true, false]);
    }

    #[test]
    fn test_like_prefix_empty() {
        let ops = SimdStringOps::new();
        let data = vec!["", "hello"];
        let results = ops.like(&data, "he%");
        assert_eq!(results, vec![false, true]);
    }

    #[test]
    fn test_like_prefix_long_strings() {
        let ops = SimdStringOps::new();
        let long_string = "hello".repeat(1000);
        let data = vec![long_string.as_str(), "world"];
        let results = ops.like(&data, "he%");
        assert_eq!(results, vec![true, false]);
    }

Target: 20+ test cases
Step 4: Benchmark (3 hours)
    use std::time::Instant;

    #[test]
    #[ignore]
    fn bench_like_prefix() {
        let ops = SimdStringOps::new();
        let data: Vec<&str> = (0..1_000_000)
            .map(|i| if i % 2 == 0 { "hello" } else { "world" })
            .collect();

        let start = Instant::now();
        let results = ops.like(&data, "he%");
        let duration = start.elapsed();

        // Use the results so the call cannot be optimized away.
        assert_eq!(results.len(), data.len());
        println!("LIKE prefix: {:?} for {} strings", duration, data.len());
        assert!(duration.as_millis() < 200, "Performance target not met");
    }

Target: <200ms for 1M strings (6-8x speedup)
Step 5: Integration (2 hours)
    # Update heliosdb-compute/src/lib.rs
    echo "pub mod simd_string_ops;" >> src/lib.rs

    # Run tests
    cargo test --package heliosdb-compute simd_string_ops

    # Run benchmarks
    cargo test --package heliosdb-compute --release -- --ignored bench_like_prefix

Step 6: Documentation (2 hours)
- Update module documentation
- Add usage examples
- Document performance results
- Create Week 1 progress report
Week-by-Week Checklist
Week 1-2: LIKE Pattern Matching
- Day 1-2: Prefix optimization ("he%")
  - AVX2 implementation
  - Scalar fallback
  - Unit tests (20+)
  - Benchmark validation (6-8x target)
- Day 3-4: Suffix optimization ("%llo")
  - Implementation
  - Tests
  - Benchmarks
- Day 5-6: Contains optimization ("%ll%")
  - SIMD substring search
  - Tests
  - Benchmarks
- Day 7: Week 1 validation
  - All tests passing
  - Performance targets met
  - Documentation complete

Week 3-4: UPPER/LOWER
- Day 8-9: UPPER implementation
  - AVX2 ASCII conversion
  - UTF-8 fallback
  - Tests
- Day 10-11: LOWER implementation
  - AVX2 ASCII conversion
  - Tests
- Day 12-14: Week 2 validation

Week 5-8: String Comparison & CONCAT
- Week 5-6: String comparison (=, !=, <, >)
- Week 7-8: CONCAT & SUBSTRING
- Week 8: Phase 1 completion
  - 38% SIMD coverage
  - 3.0x average speedup
  - Documentation (25 pages)

Success Metrics
Week 1 Targets
| Metric | Target | Validation |
|---|---|---|
| LOC Implemented | 200+ | Code review |
| Test Coverage | 95%+ | cargo tarpaulin |
| Tests Passing | 100% | cargo test |
| Performance | 6-8x speedup | Benchmarks |
| Correctness | 100% | SIMD vs scalar match |
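The correctness row (SIMD vs scalar match) is easiest to enforce with a differential check: run both implementations on the same inputs and report the first index where they disagree. The following is a sketch under assumptions; `reference_prefix`, `edge_case_inputs`, and `first_mismatch` are hypothetical helpers, and the input lengths are chosen to straddle the 32-byte width of an AVX2 register, where chunking bugs tend to appear.

```rust
/// Reference semantics that any SIMD path must reproduce (hypothetical helper).
fn reference_prefix(data: &[&str], prefix: &str) -> Vec<bool> {
    data.iter().map(|s| s.starts_with(prefix)).collect()
}

/// Matching and non-matching strings with lengths around 32-byte boundaries.
fn edge_case_inputs() -> Vec<String> {
    [0usize, 1, 2, 31, 32, 33, 63, 64, 65]
        .iter()
        .flat_map(|&n| vec![format!("he{}", "l".repeat(n)), "x".repeat(n)])
        .collect()
}

/// Index of the first element where `candidate` disagrees with the
/// reference, or `None` if the two agree on every input.
fn first_mismatch<F>(data: &[&str], prefix: &str, candidate: F) -> Option<usize>
where
    F: Fn(&[&str], &str) -> Vec<bool>,
{
    let got = candidate(data, prefix);
    let want = reference_prefix(data, prefix);
    assert_eq!(got.len(), want.len(), "result length differs from input length");
    got.iter().zip(&want).position(|(g, w)| g != w)
}
```

In a test, the candidate would be a closure that forwards to the SIMD path, for example `|d, p| ops.like_prefix(d, p)`, and the assertion would be that `first_mismatch` returns `None` for every prefix under test.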
Week 8 Targets (Phase 1 Complete)
| Metric | Target | Current | Gap |
|---|---|---|---|
| SIMD Coverage | 38% | 29% | +9% |
| Average Speedup | 3.0x | 2.5x | +0.5x |
| String Ops | 6/15 (40%) | 0/15 (0%) | +6 ops |
| LOC Added | 1,200 | 0 | +1,200 |
Development Workflow
Daily Standup (Async, 15 min)
Format:
    Yesterday: Implemented LIKE prefix AVX2 (200 LOC)
    Today: Write unit tests (20+ cases), benchmark validation
    Blockers: None
    Performance: On track for 6-8x target

Weekly Review (Friday, 1 hour)
Agenda:
- Demo working implementation
- Review performance benchmarks
- Discuss next week's plan
- Update documentation
Bi-Weekly Executive Update (30 min)
Metrics to report:
- SIMD coverage % increase
- Average speedup improvement
- LOC implemented
- Budget status
Common Issues & Solutions
Issue 1: AVX2 Not Detected
Symptom: SIMD code falls back to scalar on every run
Solution:
    # Check CPU features
    lscpu | grep avx2

    # Build with the host CPU's features enabled (including AVX2, if present)
    RUSTFLAGS="-C target-cpu=native" cargo test

    // Verify feature detection in code
    #[test]
    fn test_feature_detection() {
        let ops = SimdStringOps::new();
        println!("AVX2: {}", ops.has_avx2());
        assert!(ops.has_avx2(), "AVX2 not detected");
    }

Issue 2: Performance Target Not Met
Symptom: Benchmark shows <4x speedup (expected 6-8x)
Debug Steps:
- Verify data is ASCII: data.iter().all(|s| s.is_ascii())
- Check pattern length: Short patterns (<8 bytes) may not benefit from SIMD
- Profile with perf: perf record cargo bench
- Compare with scalar baseline: Ensure the scalar baseline is not being auto-vectorized by LLVM, which would shrink the measured speedup
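The first two debug steps can be folded into a small pre-check that runs before a benchmark. This is only a sketch: `SimdFit`, `simd_fit`, and `MIN_SIMD_PATTERN_BYTES` are hypothetical names, and the 8-byte threshold is the rule of thumb from the list above, not a measured value.

```rust
/// Rule-of-thumb threshold from the debug steps; not a measured value.
const MIN_SIMD_PATTERN_BYTES: usize = 8;

/// Why a workload may or may not benefit from the SIMD path (hypothetical type).
#[derive(Debug, PartialEq)]
enum SimdFit {
    Good,
    NonAsciiData,
    ShortPattern,
}

/// Classifies a workload using the ASCII and pattern-length checks.
fn simd_fit(data: &[&str], prefix: &str) -> SimdFit {
    if !data.iter().all(|s| s.is_ascii()) {
        SimdFit::NonAsciiData
    } else if prefix.len() < MIN_SIMD_PATTERN_BYTES {
        SimdFit::ShortPattern
    } else {
        SimdFit::Good
    }
}
```

Under this classification, the two-byte prefix from the "he%" examples would be reported as `ShortPattern`, which is worth keeping in mind when interpreting benchmark numbers for that pattern.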
Issue 3: Correctness Mismatch
Symptom: SIMD results differ from scalar
Debug Steps:
    #[test]
    fn debug_correctness() {
        let ops = SimdStringOps::new();
        let data = vec!["hello"];

        let simd_result = ops.like(&data, "he%");
        let scalar_result = ops.like_prefix_scalar(&data, "he");

        assert_eq!(simd_result, scalar_result,
            "SIMD: {:?}, Scalar: {:?}", simd_result, scalar_result);
    }

Issue 4: Compilation Errors
Common errors:
    // Error: intrinsic is not supported on this platform
    // Solution: Wrap in #[cfg(target_arch = "x86_64")]

    // Error: invalid parameter type for intrinsic
    // Solution: Check data alignment and types (i64, __m256i, etc.)

    // Error: function cannot be inlined
    // Solution: Ensure #[target_feature(enable = "avx2")] on caller

Resources
Documentation
- Main Spec: docs/planning/PHASE1_STREAM_B_SIMD_EXECUTOR_INTEGRATION_SPECIFICATION.md
- Templates: docs/architecture/SIMD_EXECUTOR_RUST_MODULE_TEMPLATES.md
- SIMD Analysis: docs/analysis/performance/SIMD_OPTIMIZATION_ASSESSMENT.md
Code References
- Existing SIMD: heliosdb-compute/src/simd_aggregation.rs (649 LOC)
- Existing Scanner: heliosdb-compute/src/simd_scanner.rs (1,160 LOC)
- Benchmarks: heliosdb-compute/benches/simd_performance_bench.rs
External Resources
- Intel Intrinsics Guide: https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html
- Rust SIMD: https://doc.rust-lang.org/std/arch/x86_64/index.html
- AVX2 Tutorial: https://www.officedaytime.com/simd512e/
Quick Wins (First 3 Days)
Day 1: Feature Detection Working
    # --nocapture is needed to see println! output from passing tests
    cargo test test_feature_detection -- --nocapture
    # Expected output:
    # AVX2: true
    # test test_feature_detection ... ok

Day 2: First SIMD Function Working

    cargo test test_like_prefix_basic
    # Expected output:
    # test test_like_prefix_basic ... ok

Day 3: Performance Validation

    cargo test --release -- --ignored --nocapture bench_like_prefix
    # Expected output:
    # LIKE prefix: 150ms for 1000000 strings
    # Throughput: 6.67 million ops/sec
    # test bench_like_prefix ... ok

Launch Checklist
Pre-Development
- CPU supports AVX2 (lscpu | grep avx2)
- Rust 1.70+ installed (rustc --version)
- Development environment set up
- Existing SIMD code reviewed
- Templates copied to codebase
- Team assigned (2 engineers)
- Budget approved ($420K-$560K)

Week 1 Kickoff
- simd_string_ops.rs created
- Feature detection implemented
- LIKE prefix AVX2 code written
- Unit tests passing (20+)
- Benchmark validated (6-8x speedup)
- Documentation updated
- Week 1 demo prepared

Production Readiness (Week 30)
- 60% SIMD coverage
- 4x average speedup
- 95% test coverage
- Zero critical bugs
- Documentation complete (150+ pages)
- Production certification approved

Support & Escalation
Daily Issues
- Contact: Team lead (Slack/Email)
- Response: 2 hours
Blocking Issues
- Contact: Engineering manager
- Response: 4 hours
- Escalation: 24 hours → VP Engineering
Critical Bugs
- Contact: Immediate escalation
- Response: 1 hour
- Fix: 24 hours
Document Status
Status: READY TO EXECUTE
Version: 1.0
Created: November 28, 2025
Next Review: December 5, 2025 (End of Week 1)
Approval:
- Technical Lead: APPROVED
- Engineering Manager: APPROVED
- Budget: APPROVED
Let's build high-performance SIMD acceleration for HeliosDB!