# HeliosDB Test Execution Guide
## Quick Start
### Prerequisites
```bash
# Install Rust toolchain
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Install Python 3.11+
sudo apt install python3.11 python3.11-venv

# Install Docker
sudo apt install docker.io docker-compose

# Install test dependencies
pip install pytest pytest-asyncio chaostoolkit
```

### Running Tests Locally
```bash
# 1. Unit tests (fast, no external dependencies)
cargo test --workspace --lib

# 2. Integration tests (requires Docker)
docker-compose -f tests/docker-compose.test.yml up -d
cargo test --test integration
docker-compose -f tests/docker-compose.test.yml down

# 3. Protocol compatibility tests
pytest tests/protocol/ -v -m "protocol and p0"

# 4. All tests
cargo test --workspace && pytest tests/
```

## Test Categories
### 1. Unit Tests
- **What:** Component-level tests
- **Where:** `*/tests/` and `#[cfg(test)]` modules
- **Run:** `cargo test --lib`
- **Duration:** < 1 minute
### 2. Integration Tests
- **What:** Multi-component interaction tests
- **Where:** `tests/integration/`
- **Run:** `cargo test --test integration`
- **Duration:** 5-10 minutes
### 3. Protocol Compatibility Tests
- **What:** Client driver compatibility (psycopg2, pymysql, etc.)
- **Where:** `tests/protocol/`
- **Run:** `pytest tests/protocol/ -v`
- **Duration:** 10-15 minutes
### 4. Distributed Correctness Tests
- **What:** Raft, sharding, replication, failover
- **Where:** `tests/distributed/`
- **Run:** `cargo test --test distributed`
- **Duration:** 15-20 minutes
### 5. Data Integrity Tests
- **What:** LSM-tree, compaction, tombstones, gc_grace
- **Where:** `tests/data_integrity/`
- **Run:** `cargo test --test data_integrity`
- **Duration:** 10-15 minutes
### 6. Vector Search Tests
- **What:** HNSW, IVF accuracy, filtered search
- **Where:** `tests/vector/`
- **Run:** `cargo test --test vector`
- **Duration:** 15-20 minutes
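Vector-search accuracy is usually reported as recall@k: the fraction of the true k nearest neighbors that the index actually returned. As a rough illustration of how such an assertion might look (the `recall_at_k` helper below is hypothetical, not part of the HeliosDB test suite):

```python
def recall_at_k(ground_truth: list[int], retrieved: list[int], k: int) -> float:
    """Fraction of the true top-k neighbor IDs present in the retrieved top-k."""
    truth = set(ground_truth[:k])
    hits = sum(1 for item in retrieved[:k] if item in truth)
    return hits / k

# Example: the index found 8 of the true top-10 neighbors.
print(recall_at_k(list(range(10)), [0, 1, 2, 3, 4, 5, 6, 7, 90, 91], 10))  # 0.8
```

A typical accuracy test would assert `recall_at_k(...) >= 0.95` against brute-force ground truth.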
### 7. Chaos Engineering Tests
- **What:** Network partitions, node failures
- **Where:** `tests/chaos/`
- **Run:** `pytest tests/chaos/ -v -m chaos`
- **Duration:** 20-30 minutes
### 8. Performance Benchmarks
- **What:** Throughput, latency, resource usage
- **Where:** `benches/`
- **Run:** `cargo bench`
- **Duration:** 30-60 minutes
## CI/CD Integration
### GitHub Actions Workflow
```yaml
name: HeliosDB Test Suite

on:
  push:
    branches: [main, develop]
  pull_request:

jobs:
  unit-tests:
    name: Unit Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions-rs/toolchain@v1
        with:
          toolchain: stable
      - run: cargo test --workspace --lib
      - run: cargo test --workspace --doc

  protocol-tests:
    name: Protocol Compatibility
    runs-on: ubuntu-latest
    strategy:
      matrix:
        client: [postgresql, mysql, snowflake, databricks, pinecone]
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - run: pip install -r tests/protocol/requirements.txt
      - run: docker-compose -f tests/protocol/docker-compose.yml up -d
      - run: pytest tests/protocol/test_${{ matrix.client }}.py -v -m p0
      - if: failure()
        run: docker-compose -f tests/protocol/docker-compose.yml logs

  integration-tests:
    name: Integration Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions-rs/toolchain@v1
      - run: docker-compose -f tests/docker-compose.test.yml up -d
      - run: cargo test --test integration -- --test-threads=1
      - run: cargo test --test distributed -- --test-threads=1

  vector-tests:
    name: Vector Search Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions-rs/toolchain@v1
      - run: cargo test --test vector -- --test-threads=4

  chaos-tests:
    name: Chaos Engineering
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v3
      - run: pip install chaostoolkit chaostoolkit-kubernetes
      - run: pytest tests/chaos/ -v -m chaos

  coverage:
    name: Code Coverage
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions-rs/toolchain@v1
      - uses: actions-rs/tarpaulin@v0.1
        with:
          args: '--workspace --out Xml'
      - uses: codecov/codecov-action@v3
        with:
          files: ./cobertura.xml
```

## Quality Gates
### Pre-Commit Checks
```bash
#!/bin/bash
set -e

echo "Running pre-commit checks..."

# Format check
cargo fmt -- --check

# Clippy lints
cargo clippy -- -D warnings

# Fast unit tests
cargo test --lib --quiet
```

### Pull Request Requirements
- All unit tests pass
- Code coverage > 80%
- All P0 protocol tests pass
- No clippy warnings
- Formatted with `cargo fmt`
### Merge to Main Requirements
- All tests pass (unit + integration + protocol)
- Code review approved
- No performance regression (no benchmark more than 5% slower than baseline)
- Documentation updated
### Release Requirements
- All tests pass including chaos tests
- Benchmarks meet SLA targets
- Security audit complete
- Protocol compatibility verified for all clients
- Release notes prepared
## Test Data Management
### Fixtures and Datasets
```rust
/// Deterministic random vectors so test runs are reproducible.
pub fn generate_test_vectors(count: usize, dims: usize) -> Vec<Vec<f32>> {
    use rand::{Rng, SeedableRng}; // Rng is needed for gen_range
    let mut rng = rand::rngs::StdRng::seed_from_u64(42); // Fixed seed: reproducible
    (0..count)
        .map(|_| (0..dims).map(|_| rng.gen_range(-1.0..1.0)).collect())
        .collect()
}

pub async fn create_test_cluster(storage_nodes: usize, compute_nodes: usize) -> TestCluster {
    TestCluster::builder()
        .storage_nodes(storage_nodes)
        .compute_nodes(compute_nodes)
        .with_test_data(10_000)
        .build()
        .await
}
```

### Cleanup
```bash
# Clean test artifacts
cargo clean
docker system prune -f

# Remove test databases
rm -rf /tmp/heliosdb-test-*
```

## Debugging Failed Tests
### Enable Detailed Logging
```bash
# Rust tests
RUST_LOG=debug cargo test test_name -- --nocapture

# Python tests
pytest tests/protocol/test_postgresql.py::test_connect -vv -s
```

### Inspect Test Cluster State
```bash
# Docker logs
docker-compose -f tests/docker-compose.test.yml logs heliosdb

# Connect to test database
docker exec -it heliosdb-test heliosdb-cli status

# Check Raft state
docker exec -it heliosdb-metadata-0 heliosdb-cli raft-status
```

### Common Issues
- **Issue:** Protocol test fails with "connection refused"
  **Fix:** Ensure the HeliosDB cluster is running and its ports are exposed
- **Issue:** Distributed test hangs
  **Fix:** Check for deadlocks, increase the timeout, or run with `--test-threads=1`
- **Issue:** Vector test recall too low
  **Fix:** Increase the `ef_search` parameter or use more training data
- **Issue:** Compaction test fails
  **Fix:** Increase `gc_grace_seconds` or wait longer for background compaction
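On the last point: under the usual Cassandra-style semantics that `gc_grace_seconds` implies, a tombstone may only be purged by compaction once it is older than the grace period. A minimal sketch of that eligibility check (the `tombstone_purgeable` helper is illustrative, not HeliosDB's actual implementation):

```python
GC_GRACE_SECONDS_DEFAULT = 864_000  # 10 days, a common default for this setting

def tombstone_purgeable(deleted_at: float, now: float,
                        gc_grace_seconds: int = GC_GRACE_SECONDS_DEFAULT) -> bool:
    """A tombstone can be dropped only after the grace period has elapsed,
    giving down replicas time to learn about the deletion."""
    return (now - deleted_at) > gc_grace_seconds

print(tombstone_purgeable(0.0, 100.0))      # False: still within grace period
print(tombstone_purgeable(0.0, 864_001.0))  # True: safe to purge
```

This is why a compaction test that expects tombstones to disappear must either lower `gc_grace_seconds` for the test or wait past the grace window.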
## Performance Benchmarking
### Running Benchmarks
```bash
# All benchmarks
cargo bench

# Specific benchmark
cargo bench --bench lsm_write_throughput

# With profiling
cargo bench --bench query_latency -- --profile-time=10
```

### Benchmark Categories
- **Ingestion Throughput**
  - LSM write path
  - Bulk load (COPY)
  - Parallel ingestion
- **Query Latency**
  - Point queries (by primary key)
  - Range scans
  - Aggregations
  - Vector similarity search
- **Distributed Operations**
  - Cross-shard joins
  - Distributed transactions
  - Cache hit/miss ratio
- **Compaction**
  - Size-tiered vs. leveled
  - Write amplification
  - Space amplification
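For the compaction benchmarks, write amplification is the ratio of total bytes physically written to disk (including compaction rewrites) to bytes of user data ingested; space amplification is the analogous ratio for on-disk size versus live data size. A trivial helper for reducing benchmark counters to these ratios (names here are illustrative, not a HeliosDB API):

```python
def write_amplification(user_bytes: int, disk_bytes_written: int) -> float:
    """How many bytes hit disk per byte the user wrote."""
    return disk_bytes_written / user_bytes

def space_amplification(live_bytes: int, disk_bytes_used: int) -> float:
    """How much on-disk space is consumed per byte of live data."""
    return disk_bytes_used / live_bytes

# 1 MB ingested, 7.5 MB written across flush + compaction passes:
print(write_amplification(1_000_000, 7_500_000))  # 7.5
```

Leveled compaction typically trades higher write amplification for lower space amplification than size-tiered, which is what these benchmarks quantify.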
### Interpreting Results
```
lsm_write_throughput    time:   [45.2 ms 46.1 ms 47.3 ms]
                        thrpt:  [21.14 Kelem/s 21.69 Kelem/s 22.12 Kelem/s]
```

- **time:** 95% confidence interval for execution time (lower bound, estimate, upper bound)
- **thrpt:** throughput (here Kelem/s = thousands of elements per second)
- Compare against a saved baseline to detect regressions
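The baseline comparison can also be reproduced by hand from the reported estimates. A sketch of the regression rule, using the 5% tolerance this guide's merge gate specifies (the helper itself is hypothetical):

```python
def is_regression(baseline_ms: float, current_ms: float,
                  tolerance: float = 0.05) -> bool:
    """Flag a regression when the new time exceeds the baseline
    by more than the tolerated slowdown (default 5%)."""
    return current_ms > baseline_ms * (1 + tolerance)

print(is_regression(46.1, 49.0))  # True: ~6.3% slower than baseline
print(is_regression(46.1, 47.0))  # False: within the 5% budget
```

In practice, compare the midpoints of the confidence intervals, and treat overlapping intervals with caution before declaring a regression.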
## Continuous Performance Monitoring
### Automated Regression Detection
```yaml
- name: Run benchmarks
  run: cargo bench -- --save-baseline main

- name: Compare with previous
  run: |
    cargo bench -- --baseline main
    if [ $? -ne 0 ]; then
      echo "Performance regression detected!"
      exit 1
    fi
```

## Test Metrics Dashboard
### Collect Metrics
```rust
#[tokio::test]
async fn test_query_with_metrics() {
    let cluster = TestCluster::new().await;

    let metrics = cluster
        .query_with_metrics("SELECT * FROM data WHERE value > 500")
        .await;

    println!("Query metrics:");
    println!("  Rows scanned: {}", metrics.rows_scanned);
    println!("  Bytes transferred: {}", metrics.bytes_transferred);
    println!("  Cache hits: {}", metrics.cache_hits);
    println!("  Duration: {:?}", metrics.duration);

    // Export to Prometheus
    METRICS_REGISTRY.record("query_duration", metrics.duration);
}
```

### Visualize with Grafana
- Query latency percentiles (P50, P95, P99)
- Throughput (ops/sec)
- Resource utilization (CPU, memory, disk I/O)
- Test pass/fail rates
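The latency percentiles above (P50, P95, P99) can be sanity-checked outside Grafana with a simple nearest-rank computation over raw samples. A minimal sketch (not a HeliosDB utility):

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p% of the data is at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[max(rank - 1, 0)]

latencies_ms = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 50.0]
print(percentile(latencies_ms, 50))  # 5.0
print(percentile(latencies_ms, 99))  # 50.0  <- tail latency dominated by one slow query
```

Note how a single outlier leaves P50 untouched but drives P99, which is why SLAs below are stated in percentiles rather than averages.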
## Summary
### Daily Development Workflow
```bash
# 1. Before committing
cargo fmt
cargo clippy
cargo test --lib

# 2. Before creating PR
cargo test --workspace
pytest tests/protocol/ -m p0

# 3. Manual verification (optional)
cargo bench -- --baseline main
```

### Test Pyramid Distribution
- 70% Unit tests (fast, focused)
- 20% Integration tests (realistic scenarios)
- 10% E2E/Chaos tests (high-value, slow)
### Test Coverage Goals
- Overall: >80%
- Core modules (LSM, Raft, HNSW): >90%
- Protocol handlers: >85%
- Utilities: >70%
### Performance SLAs
- Point query P99: < 10ms
- Vector search P99: < 20ms
- Write throughput: > 100K ops/sec
- Failover RTO: < 10 seconds
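These targets can be enforced as a mechanical release gate. A hedged sketch of such a check, with metric names and the `sla_violations` helper invented for illustration (only the target values come from this guide):

```python
# Targets from the SLA list above. Latency/RTO metrics must stay at or below
# their target; throughput must meet or exceed it.
SLAS = {
    "point_query_p99_ms": 10.0,
    "vector_search_p99_ms": 20.0,
    "write_throughput_ops": 100_000.0,
    "failover_rto_s": 10.0,
}

def sla_violations(measured: dict[str, float]) -> list[str]:
    """Return the names of every SLA the measured metrics fail."""
    failed = []
    for name, target in SLAS.items():
        value = measured[name]
        ok = value >= target if "throughput" in name else value <= target
        if not ok:
            failed.append(name)
    return failed

print(sla_violations({"point_query_p99_ms": 8.2, "vector_search_p99_ms": 25.0,
                      "write_throughput_ops": 120_000, "failover_rto_s": 6.0}))
# ['vector_search_p99_ms']
```

Wiring a check like this into the release pipeline turns the SLA list from documentation into an executable quality gate.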