MongoDB redefined how developers think about databases -- schema flexibility, horizontal scaling, and a developer-friendly document model made it the default NoSQL choice for over a decade. But modern applications demand more than documents. They need vector search for AI, full-text search for discovery, relational JOINs for analytics, time-travel for auditing, and ACID transactions that actually work across collections. HeliosDB delivers all of this in a single binary with a familiar PostgreSQL wire protocol, eliminating the "MongoDB for documents, Postgres for relations, Pinecone for vectors" architecture that has become the norm.


Quick Comparison

Feature MongoDB HeliosDB
Data modelBSON documents in collectionsTables + JSONB + vectors + relational
SchemaSchema-optional (validation rules)Schema-enforced + JSONB for flexibility
Query languageMongoDB Query Language (MQL)Standard SQL (PostgreSQL-compatible)
Wire protocolMongoDB wire protocolPostgreSQL wire protocol + REST + gRPC
JOINs$lookup (limited, no nested)Full SQL JOINs (INNER, LEFT, RIGHT, FULL, CROSS, LATERAL)
Vector searchAtlas Vector Search (separate add-on)Native HNSW + Product Quantization + SIMD
Full-text searchAtlas Search (Lucene-based add-on)Native @@ operator, integrated with optimizer
ACID transactionsMulti-document (since 4.0, with caveats)Full MVCC, snapshot isolation, savepoints
Time-travel queriesNot supportedSELECT * FROM t AT TIME '2026-01-01'
Data branchingNot supportedGit-like branches with merge
EncryptionAt-rest (WiredTiger) + in-transit (TLS)TDE (AES-256-GCM) + Zero-Knowledge Encryption
Embedded deploymentNot available (server-only)60MB binary, in-process or standalone
ReplicationReplica sets (primary-secondary)3-tier: WAL streaming, multi-primary, sharding
ShardingHash or range-based, mongos routerHash, range, or list partitioning + proxy
Change streamsNative (oplog-based)WAL-based change feeds (HeliosDB-Full)
AggregationAggregation pipeline (stages)SQL: GROUP BY, window functions, CTEs, subqueries
IndexesB-tree, hashed, text, 2dsphere, wildcardART (Adaptive Radix Tree), bloom filters, zone maps, HNSW

The Architecture Problem

With MongoDB: The Multi-Database Stack

When your application outgrows documents, you bolt on additional databases:

+--------------+     +--------------+     +--------------+
|   MongoDB    |     |  PostgreSQL  |     |   Pinecone   |
| (documents)  |     | (relational) |     |  (vectors)   |
+------+-------+     +------+-------+     +------+-------+
       |                    |                    |
       +--------+-----------+------------+-------+
                |                        |
         +------+------+          +------+------+
         | Application |          |  Sync Layer |
         |   (joins    |          | (keep DBs   |
         |  in code)   |          |  in sync)   |
         +-------------+          +-------------+

 3+ databases to deploy and keep in sync
 JOINs happen in application code
 No cross-database transactions
 Vector search is a separate billing dimension

With HeliosDB: Documents + Relations + Vectors in One Engine

+-------------------------------------------------+
|                   HeliosDB                       |
|                                                 |
|  +----------+  +----------+  +--------------+   |
|  |  JSONB   |  |Relational|  | Vector Search|   |
|  |20+ funcs |  |  Tables  |  |  HNSW + PQ   |   |
|  +----+-----+  +----+-----+  +------+-------+   |
|       |             |               |            |
|  +----+-------------+---------------+----------+ |
|  |    MVCC Storage Engine + SQL Optimizer       | |
|  |  (ACID + TDE + WAL + Branching + FTS)       | |
|  +---------------------------------------------+ |
|                                                 |
|  PostgreSQL wire protocol (psql, any PG driver) |
+-------------------------------------------------+

 1 binary, 60MB, embeddable or clustered
 Documents + relations + vectors in one ACID transaction
 Standard SQL -- every developer already knows it

Schema Flexibility: BSON vs JSONB + Typed Columns

MongoDB: Schema-Optional Documents

// MongoDB: flexible schema, but no guarantees without validation
db.products.insertMany([
    {
        name: "Widget",
        price: 29.99,
        tags: ["hardware", "bestseller"],
        specs: { weight: 0.5, color: "blue" }
    },
    {
        // Oops: "price" is a string, "tags" is missing
        name: "Gadget",
        price: "forty-two",
        specifications: { mass: 1.2 }  // different field name
    }
]);

// Schema validation (added later, optional)
db.createCollection("products", {
    validator: {
        $jsonSchema: {
            bsonType: "object",
            required: ["name", "price"],
            properties: {
                name: { bsonType: "string" },
                price: { bsonType: "number" }
            }
        }
    }
});

HeliosDB: Typed Columns + JSONB Flexibility

-- Structured columns enforce data quality
-- JSONB column provides document-style flexibility
CREATE TABLE products (
    id          SERIAL PRIMARY KEY,
    name        TEXT NOT NULL,
    price       NUMERIC(10,2) NOT NULL CHECK (price > 0),
    tags        TEXT[] DEFAULT '{}',
    specs       JSONB DEFAULT '{}',
    embedding   VECTOR(384),
    created_at  TIMESTAMPTZ DEFAULT NOW()
);

-- Type-safe inserts: "forty-two" would be rejected immediately
INSERT INTO products (name, price, tags, specs, embedding)
VALUES (
    'Widget',
    29.99,
    ARRAY['hardware', 'bestseller'],
    '{"weight": 0.5, "color": "blue", "dimensions": {"l": 10, "w": 5, "h": 3}}',
    '[0.12, -0.45, 0.89, ...]'::vector(384)
);

-- Best of both worlds:
--   Structured data (name, price) has schema enforcement
--   Flexible data (specs) uses JSONB with 20+ query functions
--   Vector embeddings stored alongside for AI search

Query Language: MQL vs SQL

MongoDB: MongoDB Query Language

// Find products with nested field filter
db.products.find({
    "specs.weight": { $lt: 1.0 },
    tags: { $in: ["hardware"] },
    price: { $gte: 10, $lte: 50 }
}).sort({ price: -1 }).limit(10);

// Aggregation pipeline: multi-stage data transformation
db.orders.aggregate([
    { $match: { status: "completed", date: { $gte: ISODate("2026-01-01") } } },
    { $unwind: "$items" },
    { $group: {
        _id: "$items.product_id",
        total_revenue: { $sum: { $multiply: ["$items.price", "$items.qty"] } },
        order_count: { $sum: 1 }
    }},
    { $sort: { total_revenue: -1 } },
    { $limit: 20 },
    // $lookup is MongoDB's JOIN -- limited to one collection, no nesting
    { $lookup: {
        from: "products",
        localField: "_id",
        foreignField: "_id",
        as: "product"
    }},
    { $unwind: "$product" },
    { $project: {
        product_name: "$product.name",
        total_revenue: 1,
        order_count: 1
    }}
]);

HeliosDB: Standard SQL (Same Query, Clearer Syntax)

-- Same filter query: readable, standard SQL
SELECT name, price, specs->>'weight' AS weight
FROM products
WHERE (specs->>'weight')::numeric < 1.0
  AND 'hardware' = ANY(tags)
  AND price BETWEEN 10 AND 50
ORDER BY price DESC
LIMIT 10;

-- Same aggregation: SQL with JOINs (no pipeline gymnastics)
SELECT
    p.name AS product_name,
    SUM(oi.price * oi.qty) AS total_revenue,
    COUNT(DISTINCT o.id) AS order_count
FROM orders o
JOIN order_items oi ON oi.order_id = o.id
JOIN products p ON p.id = oi.product_id
WHERE o.status = 'completed'
  AND o.order_date >= '2026-01-01'
GROUP BY p.name
ORDER BY total_revenue DESC
LIMIT 20;

-- HeliosDB advantage: CTEs and window functions
WITH monthly_revenue AS (
    SELECT
        p.name,
        date_trunc('month', o.order_date) AS month,
        SUM(oi.price * oi.qty) AS revenue
    FROM orders o
    JOIN order_items oi ON oi.order_id = o.id
    JOIN products p ON p.id = oi.product_id
    WHERE o.order_date >= '2025-01-01'
    GROUP BY p.name, date_trunc('month', o.order_date)
)
SELECT
    name,
    month,
    revenue,
    LAG(revenue) OVER (PARTITION BY name ORDER BY month) AS prev_month,
    ROUND(
        (revenue - LAG(revenue) OVER (PARTITION BY name ORDER BY month))
        / NULLIF(LAG(revenue) OVER (PARTITION BY name ORDER BY month), 0) * 100,
        2
    ) AS growth_pct
FROM monthly_revenue
ORDER BY name, month;

JSONB Deep Dive: 20+ Functions

HeliosDB's JSONB support is not a thin wrapper -- it provides full PostgreSQL-compatible JSONB with over 20 functions and operators:

-- Navigation and extraction
SELECT
    data->'address'->>'city' AS city,             -- nested extraction
    data#>>'{contacts,0,email}' AS first_email,   -- path extraction
    jsonb_extract_path(data, 'address', 'zip') AS zip,
    jsonb_typeof(data->'age') AS age_type          -- "number"
FROM customers;

-- Containment and existence
SELECT * FROM customers
WHERE data @> '{"tier": "enterprise"}'             -- contains
  AND data ? 'phone'                                -- key exists
  AND data ?| ARRAY['email', 'phone'];             -- any key exists

-- Modification (immutable: returns new document)
UPDATE customers
SET data = jsonb_set(
    data,
    '{address,verified}',
    'true'::jsonb
)
WHERE id = 42;

-- Aggregation: build JSON from query results
SELECT jsonb_agg(
    jsonb_build_object(
        'name', name,
        'total', order_total
    )
) AS summary
FROM (
    SELECT c.name, SUM(o.total) AS order_total
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY order_total DESC
    LIMIT 5
) top_customers;

-- Expand JSONB arrays and objects
SELECT
    p.name,
    tag.value AS tag
FROM products p,
     jsonb_array_elements_text(p.data->'tags') AS tag
WHERE tag.value LIKE '%sale%';

-- JSONB in JOINs with relational data (impossible in MongoDB without $lookup)
SELECT
    c.name,
    c.data->>'tier' AS tier,
    COUNT(o.id) AS orders,
    SUM(o.total) AS revenue,
    AVG(o.total) AS avg_order
FROM customers c
JOIN orders o ON o.customer_id = c.id
LEFT JOIN shipments s ON s.order_id = o.id
WHERE c.data @> '{"region": "EU"}'
  AND o.order_date >= '2026-01-01'
GROUP BY c.name, c.data->>'tier'
HAVING SUM(o.total) > 10000
ORDER BY revenue DESC;

Vector Search: Add-On vs Built-In

MongoDB: Atlas Vector Search (Separate Service)

// MongoDB Atlas Vector Search requires:
// 1. Atlas cluster (cloud-only, not self-hosted)
// 2. Separate search index definition
// 3. $vectorSearch aggregation stage (Atlas-only)

// Create search index (Atlas UI or API, not MQL)
// { "type": "vectorSearch", "fields": [{ "path": "embedding", ... }] }

// Query using aggregation pipeline
db.documents.aggregate([
    {
        $vectorSearch: {
            index: "vector_index",
            path: "embedding",
            queryVector: [0.12, -0.45, 0.89 /* ... */],
            numCandidates: 100,
            limit: 10
        }
    },
    {
        $project: {
            title: 1,
            content: 1,
            score: { $meta: "vectorSearchScore" }
        }
    }
]);

// Limitations:
//  - Atlas cloud only (not available in self-hosted MongoDB)
//  - Separate index type, separate billing
//  - Cannot combine with $match in same pipeline stage
//  - No hybrid vector + full-text in single operation
//  - Limited to Atlas M10+ clusters

HeliosDB: Native Vector Search with SQL

-- Create HNSW vector index (works everywhere: embedded, server, cluster)
CREATE INDEX idx_docs_embedding ON documents USING HNSW (embedding);

-- Vector similarity search: standard SQL
SELECT title, content,
       embedding <-> '[0.12, -0.45, 0.89, ...]'::vector(384) AS distance
FROM documents
ORDER BY embedding <-> '[0.12, -0.45, 0.89, ...]'::vector(384)
LIMIT 10;

-- Hybrid search: vector + full-text + SQL filters in ONE query
SELECT title, content,
       embedding <-> $1::vector(384) AS semantic_distance
FROM documents
WHERE content @@ 'database AND transactions'
  AND metadata @> '{"category": "technical"}'
  AND created_at >= '2026-01-01'
ORDER BY embedding <-> $1::vector(384)
LIMIT 10;

-- Vector search across JOINed data (impossible in MongoDB)
SELECT
    d.title,
    d.content,
    a.name AS author,
    c.name AS category,
    d.embedding <-> $1::vector(384) AS distance
FROM documents d
JOIN authors a ON a.id = d.author_id
JOIN categories c ON c.id = d.category_id
WHERE c.name IN ('engineering', 'research')
ORDER BY d.embedding <-> $1::vector(384)
LIMIT 10;
Aspect MongoDB Atlas Vector Search HeliosDB Vector Search
AvailabilityAtlas cloud only (M10+)Everywhere: embedded, server, cluster
Index typeSeparate vector indexHNSW (native, integrated with optimizer)
QuantizationNot availableProduct Quantization for memory efficiency
SIMD accelerationNot documentedNative SIMD for similarity computation
Hybrid searchSeparate stagesSingle query: vector + FTS + SQL
JOIN supportNo (aggregation only)Full SQL JOINs with vector ORDER BY
Self-hostedNot availableYes (60MB binary)
ACIDNot guaranteed in searchFull MVCC transaction isolation

Time-Travel and Data Branching

MongoDB has no equivalent for either feature.

MongoDB: No Historical Queries

// MongoDB has no time-travel capability
// To query historical data, you must:
// 1. Implement change data capture (CDC) manually
// 2. Store snapshots in a separate collection
// 3. Use change streams + application logic

// Change stream (captures changes, but doesn't query history)
const changeStream = db.collection('orders').watch();
changeStream.on('change', (change) => {
    // Save to history collection manually
    db.collection('orders_history').insertOne({
        ...change.fullDocument,
        _change_type: change.operationType,
        _changed_at: new Date()
    });
});

// Querying "what was the state at time X?" requires
// replaying all changes up to that timestamp -- expensive and complex

HeliosDB: Built-In Time-Travel + Branching

-- Time-travel: query data as it existed at any timestamp
SELECT * FROM orders AT TIME '2026-01-15 09:30:00';

-- Audit trail: compare before and after
SELECT
    a.order_id,
    a.status AS status_before,
    b.status AS status_after,
    a.total AS total_before,
    b.total AS total_after
FROM orders AT TIME '2026-02-01' a
JOIN orders b ON a.order_id = b.order_id
WHERE a.status != b.status;

-- Data branching: git-like workflows for data
CREATE BRANCH pricing_test FROM main;
CHECKOUT pricing_test;

-- Experiment with pricing changes
UPDATE products SET price = price * 0.9 WHERE category = 'seasonal';

-- Run analytics on the experimental data
SELECT category, AVG(price) AS avg_price, COUNT(*) AS products
FROM products
GROUP BY category;

-- Safe: production data on 'main' is completely untouched
CHECKOUT main;
SELECT category, AVG(price) FROM products GROUP BY category;
-- (original prices, unchanged)

-- Merge if satisfied, or drop the branch
MERGE pricing_test INTO main;
-- or: DROP BRANCH pricing_test;

Indexing: B-Tree vs ART + Bloom Filters + Zone Maps

MongoDB Indexes

// B-tree (default)
db.orders.createIndex({ customer_id: 1, order_date: -1 });

// Compound index
db.orders.createIndex({ status: 1, total: -1 });

// Text index (one per collection limitation)
db.products.createIndex({ name: "text", description: "text" });

// 2dsphere (geospatial)
db.locations.createIndex({ coordinates: "2dsphere" });

// Wildcard (all fields in document)
db.logs.createIndex({ "$**": 1 });

HeliosDB Indexes

-- ART (Adaptive Radix Tree): faster than B-tree for skewed distributions
CREATE INDEX idx_orders_customer ON orders (customer_id);

-- Composite index
CREATE INDEX idx_orders_status ON orders (status, total DESC);

-- HNSW vector index (no equivalent in MongoDB without Atlas)
CREATE INDEX idx_products_embedding ON products USING HNSW (embedding);

-- HeliosDB also uses automatically:
--   Bloom filters: fast negative lookups (key definitely not in block)
--   Zone maps: min/max per data block for range query pruning
--   These are transparent -- no CREATE INDEX needed
Index Type MongoDB HeliosDB
Primary keyB-tree on _idART on PRIMARY KEY
SecondaryB-treeART
Text search1 text index per collectionFTS on any text column (unlimited)
VectorAtlas Vector Search (cloud only)HNSW (native, everywhere)
Geospatial2dsphere, 2dNot built-in
Bloom filterNot availableAutomatic per data block
Zone mapsNot availableAutomatic min/max per block
Wildcard$** (all fields)Not applicable (schema-defined)

Transactions: A Deeper Look

MongoDB: Multi-Document Transactions (Since 4.0)

const session = client.startSession();
try {
    session.startTransaction({
        readConcern: { level: "snapshot" },
        writeConcern: { w: "majority" }
    });

    await orders.insertOne({ customer_id: 42, total: 100 }, { session });
    await inventory.updateOne(
        { product_id: 7, stock: { $gte: 1 } },
        { $inc: { stock: -1 } },
        { session }
    );
    await accounts.updateOne(
        { id: 42 },
        { $inc: { balance: -100 } },
        { session }
    );

    await session.commitTransaction();
} catch (error) {
    await session.abortTransaction();
} finally {
    session.endSession();
}

// Caveats:
//  - 60-second default timeout (TransactionLifetimeLimitSeconds)
//  - Performance overhead (~2x latency vs non-transactional)
//  - No savepoints (all-or-nothing only)
//  - Oplog size limits for large transactions
//  - Retryable writes required for production

HeliosDB: MVCC Transactions with Savepoints

BEGIN;

-- Insert order
INSERT INTO orders (customer_id, total) VALUES (42, 100.00);

-- Create savepoint before inventory update
SAVEPOINT before_inventory;

-- Update inventory
UPDATE inventory SET stock = stock - 1 WHERE product_id = 7 AND stock >= 1;

-- If inventory was insufficient, roll back just that part
-- ROLLBACK TO SAVEPOINT before_inventory;

-- Update account balance
UPDATE accounts SET balance = balance - 100.00 WHERE id = 42;

-- Nested savepoint for audit logging
SAVEPOINT before_audit;
INSERT INTO audit_log (action, details)
VALUES ('purchase', '{"customer": 42, "total": 100.00}');

COMMIT;

-- Transactions in HeliosDB:
--  - No arbitrary timeout limits
--  - MVCC: readers never block writers
--  - Savepoints for partial rollback
--  - Snapshot isolation by default
--  - WAL-backed durability with fsync
Aspect MongoDB HeliosDB
Multi-documentYes (since 4.0)Yes (native MVCC)
SavepointsNoYes (SAVEPOINT / ROLLBACK TO)
Isolation levelSnapshot (read concern)Snapshot isolation (default)
Timeout60s defaultConfigurable, no arbitrary limit
Performance overhead~2x latencyMinimal (MVCC is the default path)
Cross-shardYes (with overhead)Yes (via sharding tier)
Nested transactionsNoYes (via savepoints)

Encryption Comparison

MongoDB: Encryption at Rest + Client-Side Field-Level Encryption

// MongoDB encryption options:
// 1. At-rest: WiredTiger encrypts storage files (Enterprise only)
// 2. In-transit: TLS for client connections
// 3. CSFLE: Client-Side Field-Level Encryption (Enterprise or Atlas)

// CSFLE example (complex setup required)
const client = new MongoClient(uri, {
    autoEncryption: {
        keyVaultNamespace: "encryption.__keyVault",
        kmsProviders: {
            aws: { accessKeyId: "...", secretAccessKey: "..." }
        },
        schemaMap: {
            "mydb.patients": {
                bsonType: "object",
                properties: {
                    ssn: {
                        encrypt: {
                            bsonType: "string",
                            algorithm: "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
                        }
                    }
                }
            }
        }
    }
});

// Limitations:
//  - At-rest encryption: Enterprise edition only
//  - CSFLE: complex schema map configuration
//  - No encryption of indexes (query patterns leak information)
//  - No zero-knowledge option (server can read non-CSFLE fields)

HeliosDB: TDE + Zero-Knowledge Encryption

-- TDE: transparent, everything encrypted at rest
-- No application code changes, no schema annotations
-- Data pages, WAL entries, indexes -- all encrypted with AES-256-GCM

INSERT INTO patients (name, ssn, diagnosis)
VALUES ('Jane Doe', '123-45-6789', 'confidential');

-- Queries work transparently -- decryption happens automatically
SELECT * FROM patients WHERE name = 'Jane Doe';

-- Zero-Knowledge Encryption: even the database admin cannot read data
-- Application holds the encryption key
-- HeliosDB processes encrypted data without seeing plaintext
-- Ring + AWS-LC FIPS-validated providers
Aspect MongoDB HeliosDB
At-rest encryptionEnterprise only (WiredTiger)All editions (TDE, AES-256-GCM)
ScopeStorage filesData pages + WAL + indexes
Field-level encryptionCSFLE (complex setup)TDE encrypts everything transparently
Zero-knowledgeNo (server sees non-CSFLE data)Yes (application-held keys)
Index encryptionNoYes
FIPS complianceDepends on providerRing + AWS-LC FIPS providers
ConfigurationSchema maps, KMS setupSingle config option

Embedded Deployment

MongoDB is a server-only database. There is no way to embed it inside your application process.

MongoDB: Server Required

Application Process          MongoDB Server (separate process)
+-----------------+         +----------------------+
|  Your App       |  TCP    |  mongod               |
|  (Node/Python)  | ------> |  (1GB+ RAM minimum)   |
|                 |         |  (WiredTiger engine)   |
+-----------------+         +----------------------+

- Minimum resource: ~1GB RAM for mongod
- Separate deployment, monitoring, upgrades
- Network round-trip for every operation
- Cannot run on IoT/edge devices

HeliosDB: Embedded or Server

# Embedded mode: database runs inside your application process
# No separate server, no network round-trips, no deployment overhead

import psycopg2

# Option 1: Connect to HeliosDB server (like MongoDB)
conn = psycopg2.connect("host=localhost port=5432 dbname=myapp")

# Option 2: HeliosDB as embedded library (no server needed)
# The database engine runs in-process
# 60MB binary, starts in milliseconds
# Perfect for: edge devices, IoT, desktop apps, CI/CD, testing
// TypeScript/Node.js: connect via PostgreSQL protocol
import { Client } from 'pg';

const client = new Client({
    host: 'localhost',
    port: 5432,
    database: 'myapp',
});
await client.connect();

// Every PostgreSQL client library works out of the box
// No MongoDB driver, no Mongoose, no special ODM required
const result = await client.query(
    'SELECT * FROM products WHERE data @> $1 LIMIT 10',
    [JSON.stringify({ category: 'electronics' })]
);
// Go: standard database/sql with any PostgreSQL driver
package main

import (
    "database/sql"
    _ "github.com/lib/pq"
)

func main() {
    db, _ := sql.Open("postgres", "host=localhost port=5432 dbname=myapp sslmode=disable")
    defer db.Close()

    rows, _ := db.Query(`
        SELECT name, data->>'category' AS category
        FROM products
        WHERE data @> '{"in_stock": true}'
        ORDER BY (data->>'price')::numeric DESC
        LIMIT 20
    `)
    // Standard Go database patterns -- nothing MongoDB-specific to learn
}

Replication and Sharding

MongoDB: Replica Sets + Sharded Clusters

Replica Set:
+----------+     +----------+     +----------+
| Primary  |---->|Secondary |---->|Secondary |
|  (R/W)   |     |  (R/O)   |     |  (R/O)   |
+----------+     +----------+     +----------+
  Oplog-based async replication
  Automatic failover (election)

Sharded Cluster:
+---------+  +---------+  +---------+
| mongos  |  | mongos  |  | mongos  |  (routers)
+----+----+  +----+----+  +----+----+
     +------+-----+------+-----+
  +----------+  +----------+  +----------+
  |  Shard 1 |  |  Shard 2 |  |  Shard 3 |
  | (replica |  | (replica |  | (replica |
  |   set)   |  |   set)   |  |   set)   |
  +----------+  +----------+  +----------+
  + Config servers (another replica set)

  Minimum for sharded HA: 13 processes
  (3 config, 3x3 shard replicas, 1+ mongos)

HeliosDB: 3-Tier Replication

Tier 1: WAL Streaming (simple HA)
+----------+    WAL stream    +----------+
| Primary  | ---------------> | Replica  |
|  (R/W)   |                  |  (R/O)   |
+----------+                  +----------+
  Minimum for HA: 2 nodes

Tier 2: Multi-Primary (write scaling)
+----------+ <-----------> +----------+
| Primary1 |              | Primary2 |
|  (R/W)   |              |  (R/W)   |
+----------+              +----------+
  Both nodes accept writes

Tier 3: Sharding (horizontal scaling)
+----------+  +----------+  +----------+
| Shard 1  |  | Shard 2  |  | Shard 3  |
|(hash/rng |  |(hash/rng |  |(hash/rng |
| /list)   |  | /list)   |  | /list)   |
+----------+  +----------+  +----------+
  + Proxy with 13 built-in features
  (connection pooling, query routing, failover, etc.)
Aspect MongoDB HeliosDB
Basic HA3 nodes (replica set)2 nodes (primary + replica)
Sharded HA13+ processes3+ shards + proxy
Replication methodOplog (async)WAL streaming (sync or async)
Multi-primaryNo (single primary per replica set)Yes (Tier 2)
Shard key selectionMust choose carefully, hard to changeHash, range, or list partitioning
Config serversSeparate replica set requiredNot needed (proxy handles routing)
ReshardingOnline (since 6.0, but slow)Partition management via SQL

HeliosDB-Full: MongoDB-Style DocumentStore API

For teams migrating from MongoDB, HeliosDB-Full provides a DocumentStore API with familiar concepts:

# HeliosDB-Full DocumentStore: MongoDB-style API
# Collections, queries, aggregation pipeline, validation, change streams

import psycopg2
import json

conn = psycopg2.connect("host=localhost port=5432 dbname=helios")
cur = conn.cursor()

# Create a "collection" (table with JSONB document column)
cur.execute("""
    CREATE TABLE IF NOT EXISTS products (
        id SERIAL PRIMARY KEY,
        doc JSONB NOT NULL,
        embedding VECTOR(384),
        created_at TIMESTAMPTZ DEFAULT NOW()
    )
""")

# Insert documents (like db.collection.insertOne)
cur.execute("""
    INSERT INTO products (doc, embedding)
    VALUES (%s, %s)
""", (
    json.dumps({
        "name": "Widget",
        "price": 29.99,
        "tags": ["hardware"],
        "specs": {"weight": 0.5}
    }),
    "[0.12, -0.45, 0.89]"
))

# Query with containment (like MongoDB $match)
cur.execute("""
    SELECT doc->>'name' AS name, doc->>'price' AS price
    FROM products
    WHERE doc @> '{"tags": ["hardware"]}'
      AND (doc->>'price')::numeric < 50
    ORDER BY (doc->>'price')::numeric
""")

# Aggregation (like MongoDB aggregation pipeline, but standard SQL)
cur.execute("""
    SELECT
        doc->>'category' AS category,
        COUNT(*) AS count,
        AVG((doc->>'price')::numeric) AS avg_price,
        jsonb_agg(doc->>'name') AS product_names
    FROM products
    GROUP BY doc->>'category'
    HAVING COUNT(*) > 5
    ORDER BY avg_price DESC
""")

conn.commit()

When to Choose MongoDB

MongoDB is the right choice when:

  • Existing MongoDB expertise -- Your team is proficient in MQL, Mongoose, and the MongoDB ecosystem, and migration cost is not justified
  • Atlas managed cloud -- You want a fully managed database-as-a-service with built-in monitoring, backups, and scaling
  • Atlas-specific features -- You rely on Atlas Search, Atlas Vector Search, Atlas Data Lake, or Atlas Charts as an integrated cloud suite
  • Geospatial queries -- Your application heavily uses 2dsphere indexes and geospatial operators ($geoWithin, $near)
  • Change streams -- You have extensive real-time event processing built on MongoDB change streams
  • Petabyte sharding -- You need to shard across hundreds of nodes with MongoDB's mature sharding infrastructure

When to Choose HeliosDB

HeliosDB is the right choice when:

  • SQL + documents together -- You need relational JOINs across document data without $lookup limitations
  • Built-in vector search -- You need AI/RAG with vector similarity search without a separate cloud add-on
  • Self-hosted vector search -- Atlas Vector Search is cloud-only; HeliosDB runs anywhere
  • Time-travel queries -- You need to query historical data at any timestamp for auditing or compliance
  • Data branching -- You want git-like workflows for testing schema migrations, pricing changes, or A/B experiments
  • Embedded deployment -- You need a database inside your application (edge, IoT, desktop, CI/CD)
  • PostgreSQL compatibility -- You want to use standard PostgreSQL tooling (psql, pgAdmin, DBeaver, any PG driver)
  • Full ACID with savepoints -- You need nested transaction rollback, not just all-or-nothing
  • Encryption without Enterprise -- TDE and zero-knowledge encryption in every edition, not just Enterprise
  • Simpler operations -- A single 60MB binary instead of replica sets, config servers, and mongos routers

Migration Path

From MongoDB to HeliosDB

# Step 1: Export from MongoDB
from pymongo import MongoClient
import psycopg2
import json

mongo = MongoClient('mongodb://localhost:27017')
mongo_db = mongo['myapp']

# Step 2: Create equivalent tables in HeliosDB
pg = psycopg2.connect("host=localhost port=5432 dbname=helios")
cur = pg.cursor()

# Option A: Keep document-style (minimal changes)
cur.execute("""
    CREATE TABLE products (
        id TEXT PRIMARY KEY,
        doc JSONB NOT NULL,
        embedding VECTOR(384)
    )
""")

# Option B: Extract key fields into typed columns (recommended)
cur.execute("""
    CREATE TABLE products (
        id TEXT PRIMARY KEY,
        name TEXT NOT NULL,
        price NUMERIC(10,2) NOT NULL,
        category TEXT,
        tags TEXT[],
        specs JSONB DEFAULT '{}',
        embedding VECTOR(384),
        created_at TIMESTAMPTZ DEFAULT NOW()
    )
""")

# Step 3: Migrate data
for doc in mongo_db.products.find():
    # Option A: store entire document as JSONB
    doc_id = str(doc.pop('_id'))
    cur.execute(
        "INSERT INTO products (id, doc) VALUES (%s, %s)",
        (doc_id, json.dumps(doc, default=str))
    )

pg.commit()

Query Translation Cheat Sheet

MongoDB (MQL) HeliosDB (SQL)
db.c.find({x: 1})SELECT * FROM c WHERE doc->>'x' = '1'
db.c.find({x: {$gt: 5}})SELECT * FROM c WHERE (doc->>'x')::int > 5
db.c.find({tags: {$in: ["a"]}})SELECT * FROM c WHERE doc->'tags' ? 'a'
db.c.find({"a.b": 1})SELECT * FROM c WHERE doc->'a'->>'b' = '1'
db.c.aggregate([{$group: ...}])SELECT ... GROUP BY ... HAVING ...
db.c.aggregate([{$lookup: ...}])SELECT ... JOIN ... ON ...
db.c.countDocuments({x: 1})SELECT COUNT(*) FROM c WHERE doc->>'x' = '1'
db.c.distinct("x")SELECT DISTINCT doc->>'x' FROM c
db.c.updateOne({_id: 1}, {$set: {x: 2}})UPDATE c SET doc = jsonb_set(doc, '{x}', '2') WHERE id = '1'
db.c.deleteMany({status: "old"})DELETE FROM c WHERE doc->>'status' = 'old'

Summary

Dimension MongoDB HeliosDB
Best atSchema-flexible document storage at scaleUnified documents + relations + vectors + time-travel
Query languageMQL (MongoDB-specific)Standard SQL (PostgreSQL-compatible)
Document supportNative BSON documentsJSONB with 20+ functions + typed columns
Vector searchAtlas add-on (cloud only)Native HNSW + PQ + SIMD (everywhere)
Full-text searchAtlas Search (Lucene add-on)Native @@ operator (integrated with optimizer)
JOINs$lookup (limited)Full SQL JOINs (all types including LATERAL)
TransactionsMulti-document (since 4.0, no savepoints)Full MVCC with savepoints and nested rollback
Time-travelNot supportedAT TIME queries on any table
BranchingNot supportedGit-like CREATE/MERGE/DROP BRANCH
EncryptionEnterprise-only at-rest + CSFLETDE + ZKE in all editions
Embedded modeNot available60MB binary, in-process
EcosystemMongoDB drivers, Mongoose, Atlas cloudPostgreSQL drivers, any PG tool, REST, gRPC

MongoDB is a proven, mature document database with a vast ecosystem and a strong managed cloud offering in Atlas. HeliosDB is for teams who have outgrown the "one database per concern" architecture and want documents, relations, vectors, full-text search, time-travel, and branching in a single ACID-compliant engine that speaks PostgreSQL.

Ready to try HeliosDB?

Get started with HeliosDB in minutes. Open source, free to use.

Get Started Contact Sales