HeliosCore Operations Guide
HeliosCore Operations Guide
This guide covers day-to-day operations for the HeliosCore Native Storage Engine, including monitoring, configuration, troubleshooting, and performance tuning.
Table of Contents
- Quick Reference
- System Views
- Filter Advisor Operations
- Storage Budget Management
- Performance Monitoring
- Configuration Reference
- What to Expect
- Troubleshooting
- Maintenance Operations
Quick Reference
Essential Commands
-- View filter statusSELECT * FROM helios_filters;
-- Check storage budgetSELECT * FROM helios_filter_budget;
-- View filter recommendationsSELECT * FROM helios_filter_recommendations;
-- Check build queueSELECT * FROM helios_filter_build_queue;
-- Change filter budget (online, no restart)SET helios.filter_budget_pct = 0.15;
-- View current settingsSHOW helios.filter_budget_pct;Health Check Script
-- Quick system health checkSELECT (SELECT COUNT(*) FROM helios_filters WHERE status = 'ready') as active_filters, (SELECT COUNT(*) FROM helios_filters WHERE status = 'building') as building, (SELECT COUNT(*) FROM helios_filters WHERE status = 'stale') as stale, (SELECT usage_pct FROM helios_filter_budget) as budget_usage_pct;System Views
helios_filters
Central view of all registered filters.
Schema:
| Column | Type | Description |
|---|---|---|
filter_id | BIGINT | Unique filter identifier |
filter_type | TEXT | bloom, cuckoo, xor, ribbon, mphf, etc. |
target_type | TEXT | table, column, segment |
target_table | TEXT | Target table name |
target_column | TEXT | Target column name (if applicable) |
status | TEXT | pending, building, ready, rebuilding, stale, evicting, disabled |
visibility | BOOLEAN | Whether visible to query optimizer |
created_at | TIMESTAMP | When filter was created |
last_used | TIMESTAMP | Last query that used this filter |
use_count | BIGINT | Total times filter was used |
hit_rate | FLOAT | Successful lookups / total lookups |
false_positive_rate | FLOAT | Observed FPR |
size_bytes | BIGINT | Filter size in bytes |
build_cost_ms | BIGINT | Time to build filter |
Example Queries:
-- Find underutilized filters (candidates for eviction)SELECT filter_id, filter_type, target_table, target_column, use_count, hit_rate, size_bytesFROM helios_filtersWHERE use_count < 10 AND created_at < NOW() - INTERVAL '7 days'ORDER BY use_count ASC;
-- Find filters with high false positive ratesSELECT filter_id, filter_type, target_table, target_column, false_positive_rate, statusFROM helios_filtersWHERE false_positive_rate > 0.05ORDER BY false_positive_rate DESC;
-- Show filters by tableSELECT target_table, COUNT(*) as filter_count, SUM(size_bytes) / 1024 / 1024 as total_mb, AVG(hit_rate) as avg_hit_rateFROM helios_filtersGROUP BY target_tableORDER BY filter_count DESC;helios_filter_budget
Storage budget status for secondary filters.
Schema:
| Column | Type | Description |
|---|---|---|
budget_pct | FLOAT | Configured budget percentage (0-50%) |
budget_bytes | BIGINT | Absolute budget in bytes |
usage_bytes | BIGINT | Current usage in bytes |
usage_pct | FLOAT | Current usage as percentage of budget |
filter_count | BIGINT | Total number of filters |
total_evictions | BIGINT | Number of evictions performed |
last_eviction | TIMESTAMP | When last eviction occurred |
bytes_evicted | BIGINT | Total bytes freed by evictions |
Example:
-- Check if approaching budget limitSELECT budget_pct, usage_pct, CASE WHEN usage_pct > 90 THEN 'CRITICAL' WHEN usage_pct > 70 THEN 'WARNING' ELSE 'OK' END as status, (budget_bytes - usage_bytes) / 1024 / 1024 as remaining_mbFROM helios_filter_budget;helios_filter_recommendations
ML-based filter recommendations from the Filter Advisor.
Schema:
| Column | Type | Description |
|---|---|---|
target_table | TEXT | Recommended table |
target_column | TEXT | Recommended column |
recommended_type | TEXT | Suggested filter type |
priority_score | FLOAT | Priority (higher = more important) |
estimated_benefit | FLOAT | Expected query improvement |
estimated_cost_bytes | BIGINT | Expected storage cost |
confidence | FLOAT | Model confidence (0-1) |
reason | TEXT | Why this is recommended |
Recommendation Reasons:
| Reason | Description |
|---|---|
HighSelectivityColumn | Column frequently used in equality predicates |
FrequentJoinColumn | Column often appears in JOIN conditions |
HotAccessPattern | Very frequently accessed data |
DeleteHeavyTable | Table with frequent deletes (Cuckoo recommended) |
LargeStaticSegment | Large, rarely-modified data (XOR recommended) |
StringDictionaryCandidate | Low-cardinality strings (MPHF recommended) |
FrequencyAnalyticsColumn | Column used for frequency analysis |
ColdDataOptimization | Infrequently accessed data |
Example:
-- View top recommendationsSELECT target_table, target_column, recommended_type, priority_score, reasonFROM helios_filter_recommendationsWHERE confidence > 0.7ORDER BY priority_score DESCLIMIT 10;helios_filter_build_queue
Pending filter build tasks.
Schema:
| Column | Type | Description |
|---|---|---|
task_id | BIGINT | Build task identifier |
filter_id | BIGINT | Filter being built |
filter_type | TEXT | Type of filter |
target_table | TEXT | Target table |
target_column | TEXT | Target column |
priority | INT | Build priority (higher = sooner) |
queued_at | TIMESTAMP | When task was queued |
estimated_duration_ms | BIGINT | Estimated build time |
status | TEXT | queued, building, paused |
Filter Advisor Operations
Automatic Filter Creation
The Filter Advisor automatically:
- Analyzes query patterns
- Identifies columns that would benefit from filters
- Recommends optimal filter types
- Creates filters within budget constraints
Enable/Disable per table:
-- Enable automatic filter creationALTER TABLE orders SET (helios.auto_filter = true);
-- Disable automatic filter creationALTER TABLE orders SET (helios.auto_filter = false);
-- Set preferred filter typeALTER TABLE orders SET (helios.preferred_filter_type = 'cuckoo');
-- Set target false positive rateALTER TABLE orders SET (helios.filter_fpr = 0.01);Manual Filter Operations
-- Request filter creation (advisory)SELECT helios_request_filter('orders', 'customer_id', 'bloom');
-- Force rebuild of a filterSELECT helios_rebuild_filter(filter_id);
-- Disable a filter (keeps data, stops usage)SELECT helios_disable_filter(filter_id);
-- Enable a disabled filterSELECT helios_enable_filter(filter_id);
-- Remove a filter permanentlySELECT helios_drop_filter(filter_id);Filter Type Selection Guide
| Scenario | Recommended Filter | Reason |
|---|---|---|
| Equality predicates, read-heavy | Bloom | Simple, fast, space-efficient |
| Equality predicates with deletes | Cuckoo | Supports deletion |
| Static/archival data | XOR | Smallest footprint, fastest lookup |
| Very low FPR required (<0.1%) | Ribbon | Optimal space at low FPR |
| Dictionary encoding | MPHF | Perfect hash, zero FPR |
| Frequency estimation | Count-Min | Approximate counting |
| Range predicates | Zone Map | Min/max bounds checking |
Storage Budget Management
Online Reconfiguration
Budget can be changed without restart:
-- Increase budget to 15%SET helios.filter_budget_pct = 0.15;
-- Reduce budget (may trigger evictions)SET helios.filter_budget_pct = 0.05;
-- View current settingSHOW helios.filter_budget_pct;Eviction Behavior
When over budget, filters are evicted in this order:
| Priority | Category | Criteria |
|---|---|---|
| 1 (first) | Never Used | use_count=0, optimizer never selected |
| 2 | Stale | FPR >10%, needs rebuild |
| 3 | Redundant | Another filter provides same coverage |
| 4 | Low Impact | hit_rate <50%, minimal improvement |
| 5 | Infrequent | Not used in 24+ hours |
| 6 | Inefficient | Large size, low benefit |
| 7 | Budget Pressure | Normal filters when over budget |
| NEVER | Protected | Primary filters, user-protected |
Monitor Eviction Activity
-- Recent evictionsSELECT filter_type, target_table, target_column, eviction_reason, evicted_at, size_bytesFROM helios_eviction_logORDER BY evicted_at DESCLIMIT 20;
-- Eviction summarySELECT eviction_reason, COUNT(*) as count, SUM(size_bytes) / 1024 / 1024 as total_mbFROM helios_eviction_logWHERE evicted_at > NOW() - INTERVAL '24 hours'GROUP BY eviction_reason;Performance Monitoring
Filter Performance Statistics
-- Filter performance summarySELECT filter_type, COUNT(*) as count, AVG(hit_rate) as avg_hit_rate, AVG(false_positive_rate) as avg_fpr, SUM(use_count) as total_uses, SUM(size_bytes) / 1024 / 1024 as total_mbFROM helios_filtersWHERE status = 'ready'GROUP BY filter_type;
-- Identify best-performing filtersSELECT filter_id, filter_type, target_table, target_column, hit_rate, use_count, (hit_rate * use_count) / (size_bytes / 1024) as efficiency_scoreFROM helios_filtersWHERE use_count > 100ORDER BY efficiency_score DESCLIMIT 10;Query Optimizer Filter Usage
-- See filter usage in query plansEXPLAIN (ANALYZE, FILTERS)SELECT * FROM orders WHERE customer_id = 123;
-- Sample output shows:-- Filter: bloom_filter_17 on orders.customer_id-- Filter Stats: lookups=1, hits=1, fpr=0.001Cost Model Metrics
-- View filter cost model decisionsSELECT * FROM helios_filter_cost_analysisWHERE query_id = 'last';
-- Shows:-- filter_type, filter_id, cpu_cost, io_savings,-- net_benefit, recommended, explanationConfiguration Reference
Global Settings
| Setting | Default | Range | Description |
|---|---|---|---|
helios.filter_budget_pct | 0.10 | 0.0-0.5 | Secondary filter storage budget |
helios.auto_filter | true | bool | Enable automatic filter creation |
helios.filter_build_workers | CPU/2 | 1-CPU | Parallel filter build workers |
helios.filter_build_cpu_limit | 0.30 | 0.0-1.0 | CPU budget for builds |
helios.adaptive_throttling | true | bool | Pause builds during high load |
Per-Table Settings
| Setting | Default | Description |
|---|---|---|
helios.auto_filter | inherit | Override global auto-filter |
helios.preferred_filter_type | auto | Preferred filter type |
helios.filter_fpr | 0.01 | Target false positive rate |
helios.filter_protected | false | Protect filters from eviction |
Build Coordinator Settings
| Setting | Default | Description |
|---|---|---|
helios.max_concurrent_builds | 4 | Max parallel builds |
helios.build_batch_size | 10000 | Rows per build batch |
helios.group_commit_timeout_ms | 10 | Batching window for deltas |
What to Expect
Normal Behaviors
Filter Build During High Load:
- Build operations automatically throttle during high query load
- Expect build times to increase 2-5x during peak hours
- This is intentional to maintain query performance
Gradual Filter Population:
- New tables start with no filters
- Filter Advisor learns patterns over ~1 hour of queries
- Optimal filter coverage typically achieved within 24 hours
Budget Pressure Evictions:
- When approaching budget limit, low-value filters are evicted
- Eviction is normal and expected behavior
- High-value filters are protected
Rebuild Cycles:
- Filters periodically rebuild to maintain low FPR
- During rebuild, old version continues serving queries
- Brief pause (<100ms) during atomic swap
Performance Expectations
| Metric | Target | Acceptable | Alert |
|---|---|---|---|
| Filter hit rate | >80% | 60-80% | <60% |
| Avg false positive rate | <2% | 2-5% | >5% |
| Budget usage | <70% | 70-90% | >90% |
| Build queue depth | 0-5 | 5-20 | >20 |
| Build latency (avg) | <5s | 5-30s | >30s |
Storage Overhead
Typical filter storage overhead:
| Filter Type | Bits/Item | 10M Rows | 100M Rows |
|---|---|---|---|
| Bloom (1% FPR) | ~10 | 12 MB | 120 MB |
| Cuckoo (1% FPR) | ~12 | 15 MB | 150 MB |
| XOR (1% FPR) | ~9 | 11 MB | 110 MB |
| Ribbon (0.1% FPR) | ~8 | 10 MB | 100 MB |
| MPHF | ~3 | 4 MB | 40 MB |
Troubleshooting
High False Positive Rate
Symptoms: Filter FPR >5%, queries slower than expected
Diagnosis:
SELECT filter_id, filter_type, target_table, false_positive_rate, status, last_usedFROM helios_filtersWHERE false_positive_rate > 0.05;Resolution:
- Check if filter is stale (needs rebuild)
- Verify data hasn’t changed significantly
- Request rebuild:
SELECT helios_rebuild_filter(filter_id);
Build Queue Backlog
Symptoms: Large build queue, filters stuck in “pending”
Diagnosis:
SELECT COUNT(*), statusFROM helios_filter_build_queueGROUP BY status;Resolution:
- Check CPU usage - builds throttle during high load
- Increase build workers:
SET helios.filter_build_workers = 4; - Temporarily increase CPU budget:
SET helios.filter_build_cpu_limit = 0.5;
Unexpected Evictions
Symptoms: Useful filters being evicted
Diagnosis:
SELECT * FROM helios_eviction_logWHERE evicted_at > NOW() - INTERVAL '1 hour'ORDER BY evicted_at DESC;Resolution:
- Increase budget:
SET helios.filter_budget_pct = 0.20; - Protect important filters:
ALTER TABLE important_table SET (helios.filter_protected = true);
- Review eviction reasons and adjust workload
Filters Not Being Used
Symptoms: Filters exist but optimizer doesn’t select them
Diagnosis:
-- Check filter visibilitySELECT filter_id, visibility, statusFROM helios_filtersWHERE target_table = 'my_table';
-- Check cost model decisionEXPLAIN (ANALYZE, FILTERS, COSTS)SELECT * FROM my_table WHERE column = value;Resolution:
- Ensure filter status is ‘ready’
- Ensure visibility is true
- Check selectivity - filters only help when <30% of rows match
- Verify predicate type is supported by filter
Memory Pressure
Symptoms: High memory usage from filter operations
Diagnosis:
SELECT SUM(size_bytes) / 1024 / 1024 as total_filter_mbFROM helios_filters;
SELECT * FROM helios_filter_budget;Resolution:
- Reduce budget percentage
- Enable adaptive throttling
- Reduce concurrent builds
Maintenance Operations
Daily Checks
-- Quick health checkSELECT 'Filters' as metric, COUNT(*) FILTER (WHERE status = 'ready') as ready, COUNT(*) FILTER (WHERE status = 'building') as building, COUNT(*) FILTER (WHERE status = 'stale') as stale, COUNT(*) FILTER (WHERE status = 'pending') as pendingFROM helios_filters;
-- Budget statusSELECT budget_pct, usage_pct, (budget_bytes - usage_bytes) / 1024 / 1024 as remaining_mbFROM helios_filter_budget;
-- Build queueSELECT COUNT(*) as queue_depth, MIN(queued_at) as oldest_taskFROM helios_filter_build_queueWHERE status = 'queued';Weekly Maintenance
-- Review stale filtersSELECT filter_id, filter_type, target_table, target_column, created_at, last_used, false_positive_rateFROM helios_filtersWHERE status = 'stale' OR false_positive_rate > 0.05;
-- Review recommendations not yet implementedSELECT * FROM helios_filter_recommendationsWHERE priority_score > 0.8 AND confidence > 0.7ORDER BY priority_score DESC;
-- Check for inefficient filtersSELECT filter_id, filter_type, target_table, size_bytes / 1024 / 1024 as size_mb, use_count, hit_rateFROM helios_filtersWHERE size_bytes > 100 * 1024 * 1024 -- >100MB AND hit_rate < 0.5;Periodic Cleanup
-- Rebuild filters with high FPRSELECT helios_rebuild_filter(filter_id)FROM helios_filtersWHERE false_positive_rate > 0.05 AND status = 'ready';
-- Clear orphaned build tasksSELECT helios_cleanup_build_queue();
-- Compact filter storageSELECT helios_compact_filter_storage();