Skip to content

HeliosCore Operations Guide

HeliosCore Operations Guide

This guide covers day-to-day operations for the HeliosCore Native Storage Engine, including monitoring, configuration, troubleshooting, and performance tuning.


Table of Contents

  1. Quick Reference
  2. System Views
  3. Filter Advisor Operations
  4. Storage Budget Management
  5. Performance Monitoring
  6. Configuration Reference
  7. What to Expect
  8. Troubleshooting
  9. Maintenance Operations

Quick Reference

Essential Commands

-- View filter status
SELECT * FROM helios_filters;
-- Check storage budget
SELECT * FROM helios_filter_budget;
-- View filter recommendations
SELECT * FROM helios_filter_recommendations;
-- Check build queue
SELECT * FROM helios_filter_build_queue;
-- Change filter budget (online, no restart)
SET helios.filter_budget_pct = 0.15;
-- View current settings
SHOW helios.filter_budget_pct;

Health Check Script

-- Quick system health check
SELECT
(SELECT COUNT(*) FROM helios_filters WHERE status = 'ready') as active_filters,
(SELECT COUNT(*) FROM helios_filters WHERE status = 'building') as building,
(SELECT COUNT(*) FROM helios_filters WHERE status = 'stale') as stale,
(SELECT usage_pct FROM helios_filter_budget) as budget_usage_pct;

System Views

helios_filters

Central view of all registered filters.

Schema:

ColumnTypeDescription
filter_idBIGINTUnique filter identifier
filter_typeTEXTbloom, cuckoo, xor, ribbon, mphf, etc.
target_typeTEXTtable, column, segment
target_tableTEXTTarget table name
target_columnTEXTTarget column name (if applicable)
statusTEXTpending, building, ready, rebuilding, stale, evicting, disabled
visibilityBOOLEANWhether visible to query optimizer
created_atTIMESTAMPWhen filter was created
last_usedTIMESTAMPLast query that used this filter
use_countBIGINTTotal times filter was used
hit_rateFLOATSuccessful lookups / total lookups
false_positive_rateFLOATObserved FPR
size_bytesBIGINTFilter size in bytes
build_cost_msBIGINTTime to build filter

Example Queries:

-- Find underutilized filters (candidates for eviction)
SELECT filter_id, filter_type, target_table, target_column,
use_count, hit_rate, size_bytes
FROM helios_filters
WHERE use_count < 10 AND created_at < NOW() - INTERVAL '7 days'
ORDER BY use_count ASC;
-- Find filters with high false positive rates
SELECT filter_id, filter_type, target_table, target_column,
false_positive_rate, status
FROM helios_filters
WHERE false_positive_rate > 0.05
ORDER BY false_positive_rate DESC;
-- Show filters by table
SELECT target_table,
COUNT(*) as filter_count,
SUM(size_bytes) / 1024 / 1024 as total_mb,
AVG(hit_rate) as avg_hit_rate
FROM helios_filters
GROUP BY target_table
ORDER BY filter_count DESC;

helios_filter_budget

Storage budget status for secondary filters.

Schema:

ColumnTypeDescription
budget_pctFLOATConfigured budget percentage (0-50%)
budget_bytesBIGINTAbsolute budget in bytes
usage_bytesBIGINTCurrent usage in bytes
usage_pctFLOATCurrent usage as percentage of budget
filter_countBIGINTTotal number of filters
total_evictionsBIGINTNumber of evictions performed
last_evictionTIMESTAMPWhen last eviction occurred
bytes_evictedBIGINTTotal bytes freed by evictions

Example:

-- Check if approaching budget limit
SELECT budget_pct,
usage_pct,
CASE
WHEN usage_pct > 90 THEN 'CRITICAL'
WHEN usage_pct > 70 THEN 'WARNING'
ELSE 'OK'
END as status,
(budget_bytes - usage_bytes) / 1024 / 1024 as remaining_mb
FROM helios_filter_budget;

helios_filter_recommendations

ML-based filter recommendations from the Filter Advisor.

Schema:

ColumnTypeDescription
target_tableTEXTRecommended table
target_columnTEXTRecommended column
recommended_typeTEXTSuggested filter type
priority_scoreFLOATPriority (higher = more important)
estimated_benefitFLOATExpected query improvement
estimated_cost_bytesBIGINTExpected storage cost
confidenceFLOATModel confidence (0-1)
reasonTEXTWhy this is recommended

Recommendation Reasons:

ReasonDescription
HighSelectivityColumnColumn frequently used in equality predicates
FrequentJoinColumnColumn often appears in JOIN conditions
HotAccessPatternVery frequently accessed data
DeleteHeavyTableTable with frequent deletes (Cuckoo recommended)
LargeStaticSegmentLarge, rarely-modified data (XOR recommended)
StringDictionaryCandidateLow-cardinality strings (MPHF recommended)
FrequencyAnalyticsColumnColumn used for frequency analysis
ColdDataOptimizationInfrequently accessed data

Example:

-- View top recommendations
SELECT target_table, target_column, recommended_type,
priority_score, reason
FROM helios_filter_recommendations
WHERE confidence > 0.7
ORDER BY priority_score DESC
LIMIT 10;

helios_filter_build_queue

Pending filter build tasks.

Schema:

ColumnTypeDescription
task_idBIGINTBuild task identifier
filter_idBIGINTFilter being built
filter_typeTEXTType of filter
target_tableTEXTTarget table
target_columnTEXTTarget column
priorityINTBuild priority (higher = sooner)
queued_atTIMESTAMPWhen task was queued
estimated_duration_msBIGINTEstimated build time
statusTEXTqueued, building, paused

Filter Advisor Operations

Automatic Filter Creation

The Filter Advisor automatically:

  1. Analyzes query patterns
  2. Identifies columns that would benefit from filters
  3. Recommends optimal filter types
  4. Creates filters within budget constraints

Enable/Disable per table:

-- Enable automatic filter creation
ALTER TABLE orders SET (helios.auto_filter = true);
-- Disable automatic filter creation
ALTER TABLE orders SET (helios.auto_filter = false);
-- Set preferred filter type
ALTER TABLE orders SET (helios.preferred_filter_type = 'cuckoo');
-- Set target false positive rate
ALTER TABLE orders SET (helios.filter_fpr = 0.01);

Manual Filter Operations

-- Request filter creation (advisory)
SELECT helios_request_filter('orders', 'customer_id', 'bloom');
-- Force rebuild of a filter
SELECT helios_rebuild_filter(filter_id);
-- Disable a filter (keeps data, stops usage)
SELECT helios_disable_filter(filter_id);
-- Enable a disabled filter
SELECT helios_enable_filter(filter_id);
-- Remove a filter permanently
SELECT helios_drop_filter(filter_id);

Filter Type Selection Guide

ScenarioRecommended FilterReason
Equality predicates, read-heavyBloomSimple, fast, space-efficient
Equality predicates with deletesCuckooSupports deletion
Static/archival dataXORSmallest footprint, fastest lookup
Very low FPR required (<0.1%)RibbonOptimal space at low FPR
Dictionary encodingMPHFPerfect hash, zero FPR
Frequency estimationCount-MinApproximate counting
Range predicatesZone MapMin/max bounds checking

Storage Budget Management

Online Reconfiguration

Budget can be changed without restart:

-- Increase budget to 15%
SET helios.filter_budget_pct = 0.15;
-- Reduce budget (may trigger evictions)
SET helios.filter_budget_pct = 0.05;
-- View current setting
SHOW helios.filter_budget_pct;

Eviction Behavior

When over budget, filters are evicted in this order:

PriorityCategoryCriteria
1 (first)Never Useduse_count=0, optimizer never selected
2StaleFPR >10%, needs rebuild
3RedundantAnother filter provides same coverage
4Low Impacthit_rate <50%, minimal improvement
5InfrequentNot used in 24+ hours
6InefficientLarge size, low benefit
7Budget PressureNormal filters when over budget
NEVERProtectedPrimary filters, user-protected

Monitor Eviction Activity

-- Recent evictions
SELECT filter_type, target_table, target_column,
eviction_reason, evicted_at, size_bytes
FROM helios_eviction_log
ORDER BY evicted_at DESC
LIMIT 20;
-- Eviction summary
SELECT eviction_reason,
COUNT(*) as count,
SUM(size_bytes) / 1024 / 1024 as total_mb
FROM helios_eviction_log
WHERE evicted_at > NOW() - INTERVAL '24 hours'
GROUP BY eviction_reason;

Performance Monitoring

Filter Performance Statistics

-- Filter performance summary
SELECT
filter_type,
COUNT(*) as count,
AVG(hit_rate) as avg_hit_rate,
AVG(false_positive_rate) as avg_fpr,
SUM(use_count) as total_uses,
SUM(size_bytes) / 1024 / 1024 as total_mb
FROM helios_filters
WHERE status = 'ready'
GROUP BY filter_type;
-- Identify best-performing filters
SELECT filter_id, filter_type, target_table, target_column,
hit_rate, use_count,
(hit_rate * use_count) / (size_bytes / 1024) as efficiency_score
FROM helios_filters
WHERE use_count > 100
ORDER BY efficiency_score DESC
LIMIT 10;

Query Optimizer Filter Usage

-- See filter usage in query plans
EXPLAIN (ANALYZE, FILTERS)
SELECT * FROM orders WHERE customer_id = 123;
-- Sample output shows:
-- Filter: bloom_filter_17 on orders.customer_id
-- Filter Stats: lookups=1, hits=1, fpr=0.001

Cost Model Metrics

-- View filter cost model decisions
SELECT * FROM helios_filter_cost_analysis
WHERE query_id = 'last';
-- Shows:
-- filter_type, filter_id, cpu_cost, io_savings,
-- net_benefit, recommended, explanation

Configuration Reference

Global Settings

SettingDefaultRangeDescription
helios.filter_budget_pct0.100.0-0.5Secondary filter storage budget
helios.auto_filtertrueboolEnable automatic filter creation
helios.filter_build_workersCPU/21-CPUParallel filter build workers
helios.filter_build_cpu_limit0.300.0-1.0CPU budget for builds
helios.adaptive_throttlingtrueboolPause builds during high load

Per-Table Settings

SettingDefaultDescription
helios.auto_filterinheritOverride global auto-filter
helios.preferred_filter_typeautoPreferred filter type
helios.filter_fpr0.01Target false positive rate
helios.filter_protectedfalseProtect filters from eviction

Build Coordinator Settings

SettingDefaultDescription
helios.max_concurrent_builds4Max parallel builds
helios.build_batch_size10000Rows per build batch
helios.group_commit_timeout_ms10Batching window for deltas

What to Expect

Normal Behaviors

Filter Build During High Load:

  • Build operations automatically throttle during high query load
  • Expect build times to increase 2-5x during peak hours
  • This is intentional to maintain query performance

Gradual Filter Population:

  • New tables start with no filters
  • Filter Advisor learns patterns over ~1 hour of queries
  • Optimal filter coverage typically achieved within 24 hours

Budget Pressure Evictions:

  • When approaching budget limit, low-value filters are evicted
  • Eviction is normal and expected behavior
  • High-value filters are protected

Rebuild Cycles:

  • Filters periodically rebuild to maintain low FPR
  • During rebuild, old version continues serving queries
  • Brief pause (<100ms) during atomic swap

Performance Expectations

MetricTargetAcceptableAlert
Filter hit rate>80%60-80%<60%
Avg false positive rate<2%2-5%>5%
Budget usage<70%70-90%>90%
Build queue depth0-55-20>20
Build latency (avg)<5s5-30s>30s

Storage Overhead

Typical filter storage overhead:

Filter TypeBits/Item10M Rows100M Rows
Bloom (1% FPR)~1012 MB120 MB
Cuckoo (1% FPR)~1215 MB150 MB
XOR (1% FPR)~911 MB110 MB
Ribbon (0.1% FPR)~810 MB100 MB
MPHF~34 MB40 MB

Troubleshooting

High False Positive Rate

Symptoms: Filter FPR >5%, queries slower than expected

Diagnosis:

SELECT filter_id, filter_type, target_table,
false_positive_rate, status, last_used
FROM helios_filters
WHERE false_positive_rate > 0.05;

Resolution:

  1. Check if filter is stale (needs rebuild)
  2. Verify data hasn’t changed significantly
  3. Request rebuild: SELECT helios_rebuild_filter(filter_id);

Build Queue Backlog

Symptoms: Large build queue, filters stuck in “pending”

Diagnosis:

SELECT COUNT(*), status
FROM helios_filter_build_queue
GROUP BY status;

Resolution:

  1. Check CPU usage - builds throttle during high load
  2. Increase build workers: SET helios.filter_build_workers = 4;
  3. Temporarily increase CPU budget: SET helios.filter_build_cpu_limit = 0.5;

Unexpected Evictions

Symptoms: Useful filters being evicted

Diagnosis:

SELECT * FROM helios_eviction_log
WHERE evicted_at > NOW() - INTERVAL '1 hour'
ORDER BY evicted_at DESC;

Resolution:

  1. Increase budget: SET helios.filter_budget_pct = 0.20;
  2. Protect important filters:
    ALTER TABLE important_table SET (helios.filter_protected = true);
  3. Review eviction reasons and adjust workload

Filters Not Being Used

Symptoms: Filters exist but optimizer doesn’t select them

Diagnosis:

-- Check filter visibility
SELECT filter_id, visibility, status
FROM helios_filters
WHERE target_table = 'my_table';
-- Check cost model decision
EXPLAIN (ANALYZE, FILTERS, COSTS)
SELECT * FROM my_table WHERE column = value;

Resolution:

  1. Ensure filter status is ‘ready’
  2. Ensure visibility is true
  3. Check selectivity - filters only help when <30% of rows match
  4. Verify predicate type is supported by filter

Memory Pressure

Symptoms: High memory usage from filter operations

Diagnosis:

SELECT SUM(size_bytes) / 1024 / 1024 as total_filter_mb
FROM helios_filters;
SELECT * FROM helios_filter_budget;

Resolution:

  1. Reduce budget percentage
  2. Enable adaptive throttling
  3. Reduce concurrent builds

Maintenance Operations

Daily Checks

-- Quick health check
SELECT
'Filters' as metric,
COUNT(*) FILTER (WHERE status = 'ready') as ready,
COUNT(*) FILTER (WHERE status = 'building') as building,
COUNT(*) FILTER (WHERE status = 'stale') as stale,
COUNT(*) FILTER (WHERE status = 'pending') as pending
FROM helios_filters;
-- Budget status
SELECT budget_pct, usage_pct,
(budget_bytes - usage_bytes) / 1024 / 1024 as remaining_mb
FROM helios_filter_budget;
-- Build queue
SELECT COUNT(*) as queue_depth,
MIN(queued_at) as oldest_task
FROM helios_filter_build_queue
WHERE status = 'queued';

Weekly Maintenance

-- Review stale filters
SELECT filter_id, filter_type, target_table, target_column,
created_at, last_used, false_positive_rate
FROM helios_filters
WHERE status = 'stale' OR false_positive_rate > 0.05;
-- Review recommendations not yet implemented
SELECT * FROM helios_filter_recommendations
WHERE priority_score > 0.8 AND confidence > 0.7
ORDER BY priority_score DESC;
-- Check for inefficient filters
SELECT filter_id, filter_type, target_table,
size_bytes / 1024 / 1024 as size_mb,
use_count,
hit_rate
FROM helios_filters
WHERE size_bytes > 100 * 1024 * 1024 -- >100MB
AND hit_rate < 0.5;

Periodic Cleanup

-- Rebuild filters with high FPR
SELECT helios_rebuild_filter(filter_id)
FROM helios_filters
WHERE false_positive_rate > 0.05 AND status = 'ready';
-- Clear orphaned build tasks
SELECT helios_cleanup_build_queue();
-- Compact filter storage
SELECT helios_compact_filter_storage();