Performance Benchmark Comparison — Baseline vs Optimized
Performance Benchmark Comparison — Baseline vs Optimized
Test Configuration
- Debug build (unoptimized + debuginfo)
- Schema: 5 tables (customers 100, products 50, orders 200, order_items 500, categories 20)
- 35 SQL categories, 20 iterations each
- Deployment: Embedded/In-Memory
Summary of Improvements Implemented
| # | Improvement | Files Changed | Impact |
|---|---|---|---|
| 1 | EXISTS semi-join hash optimization | executor/mod.rs | 33.7x faster EXISTS subqueries |
| 2 | LIKE prefix/suffix/contains fast-path | evaluator.rs | 47x faster LIKE patterns |
| 3 | Fix EXPLAIN ANALYZE 0ms timeout | executor/explain.rs | EXPLAIN ANALYZE now works |
| 4 | Fix UPSERT column resolution | lib.rs | UPSERT now works |
| 5 | Fix UPDATE with subquery in SET clause | lib.rs, executor/mod.rs | UPDATE w/subquery now works |
| 6 | Implement INSERT…SELECT | lib.rs, planner.rs, logical_plan.rs | New SQL feature |
| 7 | Implement RESET statement in query() | lib.rs | RESET now works via query() API |
| 8 | Optimize VIEW DDL (plan.schema()) | executor/mod.rs | 831x faster CREATE VIEW |
| 9 | Optimize REFRESH MV (bulk_load_mode) | executor/phase3.rs | Minimal effect (I/O-bound) |
| 10 | Wire recursive optimizer rules | lib.rs | Pushdown/pruning rules fire |
| 11 | Cost-based optimizer skip for simple plans | optimizer/mod.rs | Eliminates JOIN regressions |
| 12 | LIKE underscore wildcard fast-path | evaluator.rs | Handles _ without regex |
Comparison Table (Embedded/In-Memory, avg per query)
| # | Category | Baseline | After All | Change | Key Improvement |
|---|---|---|---|---|---|
| 1 | CREATE TABLE | 156us | 160us | +3% | — |
| 2 | CREATE INDEX | 47us | 49us | +4% | — |
| 3 | ALTER TABLE | 80us | 81us | +1% | — |
| 4 | DROP TABLE | 153us | 160us | +5% | — |
| 5 | CREATE/DROP VIEW | 123.81ms | 149us | -99.9% | #8 plan.schema() (831x) |
| 6 | REFRESH MV | 284.07ms | 289ms | +2% | #9 bulk_load (no effect) |
| 7 | TRUNCATE | 214us | 229us | +7% | — |
| 8 | INSERT single | 216us | 243us | +13% | — |
| 9 | INSERT multi-row | 780us | 803us | +3% | — |
| 10 | INSERT…SELECT | FAILED | 404ms | FIXED | #6 new feature |
| 11 | UPDATE point | 1.28ms | 1.27ms | -1% | — |
| 12 | DELETE point | 338us | 333us | -1% | — |
| 13 | UPSERT | FAILED | 123ms | FIXED | #4 schema fix |
| 14 | UPDATE w/subquery | FAILED | 2.03s | FIXED | #5 subquery materialization |
| 15 | Point lookup | 1.43ms | 421us | -71% | #10 recursive optimizer (3.4x) |
| 16 | Full scan + filter | 1.54ms | 1.13ms | -27% | #10 predicate pushdown |
| 17 | Aggregation | 1.65ms | 1.63ms | -1% | cost-based skip |
| 18 | INNER JOIN | 4.64ms | 4.62ms | 0% | cost-based skip (was +32%) |
| 19 | LEFT JOIN | 4.52ms | 4.64ms | +3% | cost-based skip (was +42%) |
| 20 | Multi-table JOIN (4) | 14.68ms | 15.13ms | +3% | cost-based skip (was +27%) |
| 21 | Scalar subquery | 42.40ms | 42.10ms | -1% | — |
| 22 | EXISTS subquery | 185.14ms | 5.50ms | -97.0% | #1 semi-join (33.7x) |
| 23 | IN subquery | 3.48ms | 3.52ms | +1% | — |
| 24 | CTE | 6.79ms | 6.85ms | +1% | cost-based skip (was +15%) |
| 25 | Recursive CTE | 1.97ms | 2.01ms | +2% | cost-based skip (was +15%) |
| 26 | Window functions | 1.55ms | 1.22ms | -21% | — |
| 27 | UNION | 2.68ms | 2.73ms | +2% | — |
| 28 | DISTINCT | 1.50ms | 1.54ms | +3% | — |
| 29 | ORDER BY + LIMIT | 2.14ms | 2.18ms | +2% | — |
| 30 | CASE expressions | 1.79ms | 1.49ms | -17% | — |
| 31 | JSON-like filter | 1.60ms | 1.26ms | -21% | — |
| 32 | LIKE/BETWEEN/IN | 90.16ms | 1.90ms | -97.9% | #2 LIKE fast-path (47x) |
| 33 | Transaction control | 220us | 231us | +5% | — |
| 34 | Prepared statements | 175us | 182us | +4% | — |
| 35 | SET/SHOW/RESET | FAILED | ~125us | FIXED | #7 RESET in query() |
Key Results
Big Wins (targeted improvements)
- EXISTS subquery: 185ms -> 5.5ms (33.7x) — Semi-join hash optimization replaces O(NxM) nested loop with O(N+M) hash build + O(1) probes
- LIKE/BETWEEN/IN: 90ms -> 1.9ms (47x) — Fast-path string ops (starts_with/ends_with/contains) avoid regex compilation; underscore wildcards handled with char-by-char matching
- CREATE/DROP VIEW: 124ms -> 149us (831x) — Use plan.schema() to derive column types instead of executing the full view query
- Point lookup: 1.43ms -> 421us (3.4x) — Recursive optimizer fires predicate pushdown and projection pruning
Bug Fixes (previously FAILED categories now working)
- INSERT…SELECT: New SQL feature, 404ms per query
- UPSERT (ON CONFLICT DO UPDATE): Fixed schema columns missing source_table_name
- UPDATE with subquery in SET: Subqueries in SET clause now materialized per-row
- SET/SHOW/RESET: RESET now handled before sqlparser in query() path
Cost-Based Optimizer Skip (eliminates regressions)
- Simple plans (no joins) use single root-level optimize() instead of walking entire tree
- Complex plans with joins still get full recursive bottom-up optimization
- Result: JOINs back to baseline (+0-3%), aggregation/CTE regressions eliminated
- Previously: JOINs were 27-42% slower, aggregation +24%, CTE +15%
Unchanged
- DDL operations (CREATE/ALTER/DROP TABLE) unchanged
- INSERT/DELETE point operations unchanged
- REFRESH MV: bulk_load_mode had minimal effect (I/O-dominated by storage operations)