# HeliosDB Test Execution Guide
## Quick Start
### Prerequisites
```bash
# Install Rust toolchain
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Install Python 3.11+
sudo apt install python3.11 python3.11-venv

# Install Docker
sudo apt install docker.io docker-compose

# Install test dependencies
pip install pytest pytest-asyncio chaostoolkit
```

### Running Tests Locally
```bash
# 1. Unit tests (fast, no external dependencies)
cargo test --workspace --lib

# 2. Integration tests (requires Docker)
docker-compose -f tests/docker-compose.test.yml up -d
cargo test --test integration
docker-compose -f tests/docker-compose.test.yml down

# 3. Protocol compatibility tests
pytest tests/protocol/ -v -m "protocol and p0"

# 4. All tests
cargo test --workspace && pytest tests/
```

## Test Categories
### 1. Unit Tests
- **What:** Component-level tests
- **Where:** `*/tests/` and `#[cfg(test)]` modules
- **Run:** `cargo test --lib`
- **Duration:** < 1 minute
### 2. Integration Tests
- **What:** Multi-component interaction tests
- **Where:** `tests/integration/`
- **Run:** `cargo test --test integration`
- **Duration:** 5-10 minutes
### 3. Protocol Compatibility Tests
- **What:** Client driver compatibility (psycopg2, pymysql, etc.)
- **Where:** `tests/protocol/`
- **Run:** `pytest tests/protocol/ -v`
- **Duration:** 10-15 minutes
### 4. Distributed Correctness Tests
- **What:** Raft, sharding, replication, failover
- **Where:** `tests/distributed/`
- **Run:** `cargo test --test distributed`
- **Duration:** 15-20 minutes
### 5. Data Integrity Tests
- **What:** LSM-tree, compaction, tombstones, gc_grace
- **Where:** `tests/data_integrity/`
- **Run:** `cargo test --test data_integrity`
- **Duration:** 10-15 minutes
### 6. Vector Search Tests
- **What:** HNSW, IVF accuracy, filtered search
- **Where:** `tests/vector/`
- **Run:** `cargo test --test vector`
- **Duration:** 15-20 minutes
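Vector-search accuracy is usually reported as recall@k: the fraction of the true k nearest neighbors that the index actually returned. As a rough illustration of how such an assertion might look (the `recall_at_k` helper below is hypothetical, not part of the HeliosDB test suite):

```python
def recall_at_k(ground_truth: list[int], retrieved: list[int], k: int) -> float:
    """Fraction of the true top-k neighbor IDs present in the retrieved top-k."""
    truth = set(ground_truth[:k])
    hits = sum(1 for item in retrieved[:k] if item in truth)
    return hits / k

# Example: the index found 8 of the true top-10 neighbors.
print(recall_at_k(list(range(10)), [0, 1, 2, 3, 4, 5, 6, 7, 90, 91], 10))  # 0.8
```

A typical accuracy test would assert `recall_at_k(...) >= 0.95` against brute-force ground truth.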
### 7. Chaos Engineering Tests
- **What:** Network partitions, node failures
- **Where:** `tests/chaos/`
- **Run:** `pytest tests/chaos/ -v -m chaos`
- **Duration:** 20-30 minutes
### 8. Performance Benchmarks
- **What:** Throughput, latency, resource usage
- **Where:** `benches/`
- **Run:** `cargo bench`
- **Duration:** 30-60 minutes
## CI/CD Integration
### GitHub Actions Workflow
```yaml
name: HeliosDB Test Suite

on:
  push:
    branches: [main, develop]
  pull_request:

jobs:
  unit-tests:
    name: Unit Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions-rs/toolchain@v1
        with:
          toolchain: stable
      - run: cargo test --workspace --lib
      - run: cargo test --workspace --doc

  protocol-tests:
    name: Protocol Compatibility
    runs-on: ubuntu-latest
    strategy:
      matrix:
        client: [postgresql, mysql, snowflake, databricks, pinecone]
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - run: pip install -r tests/protocol/requirements.txt
      - run: docker-compose -f tests/protocol/docker-compose.yml up -d
      - run: pytest tests/protocol/test_${{ matrix.client }}.py -v -m p0
      - if: failure()
        run: docker-compose -f tests/protocol/docker-compose.yml logs

  integration-tests:
    name: Integration Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions-rs/toolchain@v1
      - run: docker-compose -f tests/docker-compose.test.yml up -d
      - run: cargo test --test integration -- --test-threads=1
      - run: cargo test --test distributed -- --test-threads=1

  vector-tests:
    name: Vector Search Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions-rs/toolchain@v1
      - run: cargo test --test vector -- --test-threads=4

  chaos-tests:
    name: Chaos Engineering
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v3
      - run: pip install chaostoolkit chaostoolkit-kubernetes
      - run: pytest tests/chaos/ -v -m chaos

  coverage:
    name: Code Coverage
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions-rs/toolchain@v1
      - uses: actions-rs/tarpaulin@v0.1
        with:
          args: '--workspace --out Xml'
      - uses: codecov/codecov-action@v3
        with:
          files: ./cobertura.xml
```

## Quality Gates
### Pre-Commit Checks
```bash
#!/bin/bash
set -e

echo "Running pre-commit checks..."

# Format check
cargo fmt -- --check

# Clippy lints
cargo clippy -- -D warnings

# Fast unit tests
cargo test --lib --quiet
```

### Pull Request Requirements
- All unit tests pass
- Code coverage > 80%
- All P0 protocol tests pass
- No clippy warnings
- Formatted with `cargo fmt`
### Merge to Main Requirements
- All tests pass (unit + integration + protocol)
- Code review approved
- No performance regression (no benchmark more than 5% slower than baseline)
- Documentation updated
### Release Requirements
- All tests pass including chaos tests
- Benchmarks meet SLA targets
- Security audit complete
- Protocol compatibility verified for all clients
- Release notes prepared
## Test Data Management
### Fixtures and Datasets
```rust
/// Deterministic random vectors so test runs are reproducible.
pub fn generate_test_vectors(count: usize, dims: usize) -> Vec<Vec<f32>> {
    use rand::{Rng, SeedableRng}; // Rng is needed for gen_range
    let mut rng = rand::rngs::StdRng::seed_from_u64(42); // Fixed seed: reproducible
    (0..count)
        .map(|_| (0..dims).map(|_| rng.gen_range(-1.0..1.0)).collect())
        .collect()
}

pub async fn create_test_cluster(storage_nodes: usize, compute_nodes: usize) -> TestCluster {
    TestCluster::builder()
        .storage_nodes(storage_nodes)
        .compute_nodes(compute_nodes)
        .with_test_data(10_000)
        .build()
        .await
}
```

### Cleanup
```bash
# Clean test artifacts
cargo clean
docker system prune -f

# Remove test databases
rm -rf /tmp/heliosdb-test-*
```

## Debugging Failed Tests
### Enable Detailed Logging
```bash
# Rust tests
RUST_LOG=debug cargo test test_name -- --nocapture

# Python tests
pytest tests/protocol/test_postgresql.py::test_connect -vv -s
```

### Inspect Test Cluster State
```bash
# Docker logs
docker-compose -f tests/docker-compose.test.yml logs heliosdb

# Connect to test database
docker exec -it heliosdb-test heliosdb-cli status

# Check Raft state
docker exec -it heliosdb-metadata-0 heliosdb-cli raft-status
```

### Common Issues
- **Issue:** Protocol test fails with "connection refused"
  **Fix:** Ensure the HeliosDB cluster is running and its ports are exposed
- **Issue:** Distributed test hangs
  **Fix:** Check for deadlocks, increase the timeout, or run with `--test-threads=1`
- **Issue:** Vector test recall too low
  **Fix:** Increase the `ef_search` parameter or use more training data
- **Issue:** Compaction test fails
  **Fix:** Increase `gc_grace_seconds` or wait longer for background compaction
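On the last point: under the usual Cassandra-style semantics that `gc_grace_seconds` implies, a tombstone may only be purged by compaction once it is older than the grace period. A minimal sketch of that eligibility check (the `tombstone_purgeable` helper is illustrative, not HeliosDB's actual implementation):

```python
GC_GRACE_SECONDS_DEFAULT = 864_000  # 10 days, a common default for this setting

def tombstone_purgeable(deleted_at: float, now: float,
                        gc_grace_seconds: int = GC_GRACE_SECONDS_DEFAULT) -> bool:
    """A tombstone can be dropped only after the grace period has elapsed,
    giving down replicas time to learn about the deletion."""
    return (now - deleted_at) > gc_grace_seconds

print(tombstone_purgeable(0.0, 100.0))      # False: still within grace period
print(tombstone_purgeable(0.0, 864_001.0))  # True: safe to purge
```

This is why a compaction test that expects tombstones to disappear must either lower `gc_grace_seconds` for the test or wait past the grace window.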
## Performance Benchmarking
### Running Benchmarks
```bash
# All benchmarks
cargo bench

# Specific benchmark
cargo bench --bench lsm_write_throughput

# With profiling
cargo bench --bench query_latency -- --profile-time=10
```

### Benchmark Categories
- **Ingestion Throughput**
  - LSM write path
  - Bulk load (COPY)
  - Parallel ingestion
- **Query Latency**
  - Point queries (by primary key)
  - Range scans
  - Aggregations
  - Vector similarity search
- **Distributed Operations**
  - Cross-shard joins
  - Distributed transactions
  - Cache hit/miss ratio
- **Compaction**
  - Size-tiered vs. leveled
  - Write amplification
  - Space amplification
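For the compaction benchmarks, write amplification is the ratio of total bytes physically written to disk (including compaction rewrites) to bytes of user data ingested; space amplification is the analogous ratio for on-disk size versus live data size. A trivial helper for reducing benchmark counters to these ratios (names here are illustrative, not a HeliosDB API):

```python
def write_amplification(user_bytes: int, disk_bytes_written: int) -> float:
    """How many bytes hit disk per byte the user wrote."""
    return disk_bytes_written / user_bytes

def space_amplification(live_bytes: int, disk_bytes_used: int) -> float:
    """How much on-disk space is consumed per byte of live data."""
    return disk_bytes_used / live_bytes

# 1 MB ingested, 7.5 MB written across flush + compaction passes:
print(write_amplification(1_000_000, 7_500_000))  # 7.5
```

Leveled compaction typically trades higher write amplification for lower space amplification than size-tiered, which is what these benchmarks quantify.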
### Interpreting Results
```
lsm_write_throughput    time:   [45.2 ms 46.1 ms 47.3 ms]
                        thrpt:  [21.14 Kelem/s 21.69 Kelem/s 22.12 Kelem/s]
```

- **time:** 95% confidence interval for execution time (lower bound, estimate, upper bound)
- **thrpt:** throughput (here Kelem/s = thousands of elements per second)
- Compare against a saved baseline to detect regressions
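The baseline comparison can also be reproduced by hand from the reported estimates. A sketch of the regression rule, using the 5% tolerance this guide's merge gate specifies (the helper itself is hypothetical):

```python
def is_regression(baseline_ms: float, current_ms: float,
                  tolerance: float = 0.05) -> bool:
    """Flag a regression when the new time exceeds the baseline
    by more than the tolerated slowdown (default 5%)."""
    return current_ms > baseline_ms * (1 + tolerance)

print(is_regression(46.1, 49.0))  # True: ~6.3% slower than baseline
print(is_regression(46.1, 47.0))  # False: within the 5% budget
```

In practice, compare the midpoints of the confidence intervals, and treat overlapping intervals with caution before declaring a regression.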
## Continuous Performance Monitoring
### Automated Regression Detection
```yaml
- name: Run benchmarks
  run: cargo bench -- --save-baseline main

- name: Compare with previous
  run: |
    cargo bench -- --baseline main
    if [ $? -ne 0 ]; then
      echo "Performance regression detected!"
      exit 1
    fi
```

## Test Metrics Dashboard
### Collect Metrics
```rust
#[tokio::test]
async fn test_query_with_metrics() {
    let cluster = TestCluster::new().await;

    let metrics = cluster
        .query_with_metrics("SELECT * FROM data WHERE value > 500")
        .await;

    println!("Query metrics:");
    println!("  Rows scanned: {}", metrics.rows_scanned);
    println!("  Bytes transferred: {}", metrics.bytes_transferred);
    println!("  Cache hits: {}", metrics.cache_hits);
    println!("  Duration: {:?}", metrics.duration);

    // Export to Prometheus
    METRICS_REGISTRY.record("query_duration", metrics.duration);
}
```

### Visualize with Grafana
- Query latency percentiles (P50, P95, P99)
- Throughput (ops/sec)
- Resource utilization (CPU, memory, disk I/O)
- Test pass/fail rates
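The latency percentiles above (P50, P95, P99) can be sanity-checked outside Grafana with a simple nearest-rank computation over raw samples. A minimal sketch (not a HeliosDB utility):

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p% of the data is at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[max(rank - 1, 0)]

latencies_ms = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 50.0]
print(percentile(latencies_ms, 50))  # 5.0
print(percentile(latencies_ms, 99))  # 50.0  <- tail latency dominated by one slow query
```

Note how a single outlier leaves P50 untouched but drives P99, which is why SLAs below are stated in percentiles rather than averages.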
## Summary
### Daily Development Workflow
```bash
# 1. Before committing
cargo fmt
cargo clippy
cargo test --lib

# 2. Before creating PR
cargo test --workspace
pytest tests/protocol/ -m p0

# 3. Manual verification (optional)
cargo bench -- --baseline main
```

### Test Pyramid Distribution
- 70% Unit tests (fast, focused)
- 20% Integration tests (realistic scenarios)
- 10% E2E/Chaos tests (high-value, slow)
### Test Coverage Goals
- Overall: >80%
- Core modules (LSM, Raft, HNSW): >90%
- Protocol handlers: >85%
- Utilities: >70%
### Performance SLAs
- Point query P99: < 10ms
- Vector search P99: < 20ms
- Write throughput: > 100K ops/sec
- Failover RTO: < 10 seconds
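These targets can be enforced as a mechanical release gate. A hedged sketch of such a check, with metric names and the `sla_violations` helper invented for illustration (only the target values come from this guide):

```python
# Targets from the SLA list above. Latency/RTO metrics must stay at or below
# their target; throughput must meet or exceed it.
SLAS = {
    "point_query_p99_ms": 10.0,
    "vector_search_p99_ms": 20.0,
    "write_throughput_ops": 100_000.0,
    "failover_rto_s": 10.0,
}

def sla_violations(measured: dict[str, float]) -> list[str]:
    """Return the names of every SLA the measured metrics fail."""
    failed = []
    for name, target in SLAS.items():
        value = measured[name]
        ok = value >= target if "throughput" in name else value <= target
        if not ok:
            failed.append(name)
    return failed

print(sla_violations({"point_query_p99_ms": 8.2, "vector_search_p99_ms": 25.0,
                      "write_throughput_ops": 120_000, "failover_rto_s": 6.0}))
# ['vector_search_p99_ms']
```

Wiring a check like this into the release pipeline turns the SLA list from documentation into an executable quality gate.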