Backup, Restore, and Format Conversion Tutorial

Available since: v3.6.0
Build: default — no feature flag required
CLI: heliosdb-nano dump, heliosdb-nano restore


UVP

Most embedded databases hand you a file path and call that “backup”. HeliosDB Nano ships a first-class dump/restore pipeline in the same binary: full or incremental dumps, four compression algorithms (zstd / gzip / brotli / lz4) selectable at the CLI, CRC32 checksum validation on restore, and format-aware import/export across CSV, JSON, JSONL, Parquet, Arrow, and SQL. It also bundles a SQLite converter that turns any .sqlite file into a Nano data directory in seconds. One binary covers your hot path, your nightly backup, your data lake export, and your migration from SQLite.


Prerequisites

  • HeliosDB Nano v3.6+ binary (heliosdb-nano --version)
  • ~10 minutes
  • Optional: a .sqlite file you want to migrate

1. Full Database Dump

Dump an entire database to a single compressed file. Defaults to zstd:

Terminal window
heliosdb-nano dump \
--data-dir ./mydata \
--output backup.heliodump

Pick a different compression algorithm:

Terminal window
# Zstd (default — best speed/ratio balance)
heliosdb-nano dump --data-dir ./mydata --output backup.zst.heliodump --compression zstd
# Gzip (broadest tooling compatibility)
heliosdb-nano dump --data-dir ./mydata --output backup.gz.heliodump --compression gzip
# Brotli (smallest size, slowest write)
heliosdb-nano dump --data-dir ./mydata --output backup.br.heliodump --compression brotli
# LZ4 (fastest, largest size — good for hot snapshots)
heliosdb-nano dump --data-dir ./mydata --output backup.lz4.heliodump --compression lz4
# None (uncompressed, for CRC verification testing)
heliosdb-nano dump --data-dir ./mydata --output backup.raw.heliodump --compression none

Add --verbose to see the report (table count, total rows, compression ratio, CRC32 checksum).
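If you drive dumps from a script rather than a shell, a thin subprocess wrapper keeps the exit-code check honest. A minimal Python sketch — the flags are the ones documented above, the paths are placeholders:

import subprocess
import sys

def run_dump(data_dir: str, output: str, compression: str = "zstd") -> None:
    """Run a full dump and fail loudly if the CLI reports an error."""
    result = subprocess.run(
        [
            "heliosdb-nano", "dump",
            "--data-dir", data_dir,
            "--output", output,
            "--compression", compression,
            "--verbose",  # prints table count, rows, ratio, CRC32
        ],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        sys.exit(f"dump failed: {result.stderr.strip()}")
    print(result.stdout)

run_dump("./mydata", "backup.heliodump")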


2. Incremental Dumps

Append only the rows changed since the last dump:

Terminal window
# Initial full dump
heliosdb-nano dump --data-dir ./mydata --output rolling.heliodump
# Hours later — append the delta
heliosdb-nano dump --data-dir ./mydata --output rolling.heliodump --append

The dump file’s header gets a new metadata block per append; the restore path replays them in order. Useful for hourly snapshots without paying the full-dump cost every time.
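In a scheduler, the usual rule is “full dump if the rolling file doesn’t exist yet, append otherwise”. A sketch of that decision in Python, assuming the same CLI flags shown above:

import os
import subprocess

def rolling_dump(data_dir: str, output: str = "rolling.heliodump") -> None:
    """Full dump on first run, incremental append afterwards."""
    cmd = ["heliosdb-nano", "dump", "--data-dir", data_dir, "--output", output]
    if os.path.exists(output):
        cmd.append("--append")  # delta since the last dump
    subprocess.run(cmd, check=True)

rolling_dump("./mydata")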


3. Restore

Restores create a fresh data directory; they never overwrite an existing one in place:

Terminal window
heliosdb-nano restore \
--input backup.heliodump \
--target ./restored-data \
--verify

--verify runs the CRC32 check before applying any rows. If the dump file is corrupt, restore aborts before touching the target.

Then point a fresh server at the restored directory:

Terminal window
heliosdb-nano start --data-dir ./restored-data

4. Dump-On-Shutdown (In-Memory Mode)

If you’re running with --memory (see IN_MEMORY_MODE_QUICKREF), pair it with --dump-on-shutdown to persist on graceful exit:

Terminal window
heliosdb-nano start \
--memory \
--dump-on-shutdown
# On Ctrl-C, writes ./shutdown_dump.heliodump

This is the canonical “ephemeral DB with periodic snapshots” pattern.
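To drive that pattern from a supervisor script, start the server and deliver SIGINT for a graceful exit so the shutdown dump gets written. A sketch using only the flags documented above; the sleep stands in for your actual workload window:

import signal
import subprocess
import time

# Hypothetical supervisor: run an ephemeral server, then trigger the
# graceful-exit dump by sending SIGINT (equivalent to Ctrl-C).
server = subprocess.Popen(
    ["heliosdb-nano", "start", "--memory", "--dump-on-shutdown"]
)
time.sleep(3600)  # stand-in for the real workload
server.send_signal(signal.SIGINT)
server.wait()     # on exit, ./shutdown_dump.heliodump is written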


5. Selective Per-Table / Per-Schema Export

The dump format covers the whole database; for selective slices, use the import/export layer at the SQL/REST surface. Six formats are supported (auto-detected by file extension):

Extension          Format             Use case
.csv               CSV                Excel / spreadsheet hand-off
.json              JSON array         API consumers, JS scripts
.jsonl / .ndjson   JSON Lines         Streaming pipelines
.parquet / .pq     Apache Parquet     Data lake, dbt, Spark
.arrow / .ipc      Apache Arrow IPC   Pandas, Polars
.sql               SQL DDL + INSERTs  Cross-database migration

Export a table

COPY products TO '/tmp/products.parquet';
COPY (SELECT * FROM orders WHERE created_at > now() - interval '7 days')
TO '/tmp/recent_orders.csv' WITH (FORMAT csv, HEADER true);

Import a file

COPY products FROM '/tmp/products.parquet';
COPY products FROM '/tmp/products.csv' WITH (FORMAT csv, HEADER true);

The ExportFormat::from_extension() helper picks the right codec when the extension is unambiguous; pass WITH (FORMAT …) when the extension is generic (e.g., .txt).
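If you route exports from your own code, it can help to mirror that mapping client-side. This is a hypothetical Python equivalent of the extension table above — not the Rust helper itself, and EXTENSION_TO_FORMAT is our own name:

from pathlib import Path

# Mirrors the extension table above.
EXTENSION_TO_FORMAT = {
    ".csv": "csv",
    ".json": "json",
    ".jsonl": "jsonl", ".ndjson": "jsonl",
    ".parquet": "parquet", ".pq": "parquet",
    ".arrow": "arrow", ".ipc": "arrow",
    ".sql": "sql",
}

def format_for(path: str) -> str | None:
    """Return the codec name, or None for generic extensions like .txt."""
    return EXTENSION_TO_FORMAT.get(Path(path).suffix.lower())

assert format_for("/tmp/products.parquet") == "parquet"
assert format_for("/tmp/notes.txt") is None  # pass WITH (FORMAT ...) instead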

Export a schema-only dump

COPY (SELECT 'CREATE TABLE ' || tablename || ' (...)'
      FROM pg_tables WHERE schemaname = 'public')
TO '/tmp/schema.sql';

6. SQLite → HeliosDB Migration

A bundled Python tool under tools/HELIOSDB_SQLITE_CONVERTER converts any SQLite file into a Nano data directory:

Terminal window
python3 tools/HELIOSDB_SQLITE_CONVERTER.py \
--sqlite ./legacy_app.sqlite \
--output ./heliosdb-data \
--verbose

Or transparently from a Python application — the converter intercepts the connect call and converts on first use:

from HELIOSDB_SQLITE_CONVERTER import TransparentConverter

success, conn, messages = TransparentConverter.connect_with_auto_conversion(
    sqlite_path="./legacy_app.sqlite"
)
# conn is a psycopg2-style HeliosDB connection

Existing SQLAlchemy / psycopg2 / Django code keeps working; only the connection setup changes. A verification step (SQLiteDetector.validate_sqlite_database()) catches corrupt source files before conversion.
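A defensive wrapper might run that validation before converting. A sketch only — the path argument and boolean return of validate_sqlite_database() are assumptions here; check tools/HELIOSDB_SQLITE_CONVERTER.py for the exact signature:

# Assumption: validate_sqlite_database(path) returns a truthy value for a
# healthy file; verify against the converter's source before relying on it.
from HELIOSDB_SQLITE_CONVERTER import SQLiteDetector, TransparentConverter

def safe_connect(sqlite_path: str):
    if not SQLiteDetector.validate_sqlite_database(sqlite_path):
        raise ValueError(f"{sqlite_path} looks corrupt; aborting conversion")
    success, conn, messages = TransparentConverter.connect_with_auto_conversion(
        sqlite_path=sqlite_path
    )
    return conn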


7. Dump File Format

For reference / compliance audits — the dump file layout is documented in src/storage/dump/format.rs:

┌─────────────────────────────────────┐
│ Magic number (4B): 0xHELIOSDB       │
│ Version (4B): DUMP_VERSION          │
│ Metadata header: DumpMetadata       │ ← created_at, dump_type, table_count,
│                                     │   total_rows, compressed_size, CRC32
├─────────────────────────────────────┤
│ Compressed payload (zstd/gzip/...)  │
│   Schema records                    │
│   Index metadata                    │
│   Tuple stream                      │
└─────────────────────────────────────┘

CRC32 covers the uncompressed payload; compression is applied to the payload only, not the header.
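As a concept sketch only: with --compression none (documented above as being for CRC verification testing) you can recompute the payload checksum yourself. The 8-byte magic+version prefix follows the layout above, but HEADER_LEN — the serialized DumpMetadata size — is a placeholder; the real offsets live in src/storage/dump/format.rs:

import zlib

HEADER_LEN = 64  # placeholder; the real DumpMetadata size is in format.rs

def payload_crc32(dump_path: str) -> int:
    """CRC32 of the payload of a --compression none dump."""
    with open(dump_path, "rb") as f:
        f.seek(8 + HEADER_LEN)  # skip magic (4B) + version (4B) + metadata
        return zlib.crc32(f.read()) & 0xFFFFFFFF

print(hex(payload_crc32("backup.raw.heliodump")))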


8. Operational Patterns

Nightly snapshot + restore drill

Terminal window
# 02:00 — snapshot
heliosdb-nano dump \
--data-dir /var/lib/heliosdb/prod \
--output /backups/$(date +%F).heliodump \
--verbose
# 02:30 — verify by restoring to a scratch dir
heliosdb-nano restore \
--input /backups/$(date +%F).heliodump \
--target /tmp/restore-drill \
--verify
rm -rf /tmp/restore-drill

Continuous incremental + weekly full

Terminal window
# Hourly cron
heliosdb-nano dump --data-dir /var/lib/heliosdb/prod \
--output /backups/rolling.heliodump --append
# Sunday 03:00 — fresh full
heliosdb-nano dump --data-dir /var/lib/heliosdb/prod \
--output /backups/full-$(date +%Y-W%V).heliodump
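A small retention helper pairs naturally with the weekly fulls. This sketch is our own addition, not part of the CLI; it assumes the full-*.heliodump naming from the cron job above:

import time
from pathlib import Path

RETAIN_DAYS = 28  # keep roughly four weekly fulls; adjust to your policy

def prune_old_fulls(backup_dir: str = "/backups") -> None:
    """Delete weekly full dumps older than the retention horizon."""
    cutoff = time.time() - RETAIN_DAYS * 86400
    for dump in Path(backup_dir).glob("full-*.heliodump"):
        if dump.stat().st_mtime < cutoff:
            dump.unlink()
            print(f"pruned {dump}")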

Troubleshooting

Symptom                                                      Cause                                       Fix
Cannot dump from in-memory database without data directory   Tried dump --memory                         Run start --memory --dump-on-shutdown instead
Restore fails with CRC32 mismatch                            Corrupted dump file                         Always pass --verify; re-run dump
Either --data-dir or --connection must be specified          Forgot to pass --data-dir                   Add it (server-mode --connection is reserved for future use)
Parquet import shows string for INT columns                  Source schema mismatch                      Re-export with explicit casts, or pre-create the target table
SQLite converter reports database is locked                  Source .sqlite is open in another process   Close the other process or copy the file first

Where Next