Backup, Restore, and Format Conversion Tutorial

Available since: v3.6.0
Build: default — no feature flag required
CLI: heliosdb-nano dump, heliosdb-nano restore


UVP

Most embedded databases hand you a file path and call that “backup”. HeliosDB Nano ships a first-class dump/restore pipeline in the same binary: full or incremental dumps, four compression algorithms (zstd / gzip / brotli / lz4) selectable at the CLI, CRC32 checksum validation on restore, and format-aware import/export across CSV, JSON, JSONL, Parquet, Arrow, and SQL. It also bundles a SQLite converter that turns any .sqlite file into a Nano data directory in seconds. One binary covers your hot path, your nightly backup, your data lake export, and your migration from SQLite.


Prerequisites

  • HeliosDB Nano v3.6+ binary (heliosdb-nano --version)
  • ~10 minutes
  • Optional: a .sqlite file you want to migrate

1. Full Database Dump

Dump an entire database to a single compressed file. Defaults to zstd:

Terminal window
heliosdb-nano dump \
--data-dir ./mydata \
--output backup.heliodump

Pick a different compression algorithm:

Terminal window
# Zstd (default — best speed/ratio balance)
heliosdb-nano dump --data-dir ./mydata --output backup.zst.heliodump --compression zstd
# Gzip (broadest tooling compatibility)
heliosdb-nano dump --data-dir ./mydata --output backup.gz.heliodump --compression gzip
# Brotli (smallest size, slowest write)
heliosdb-nano dump --data-dir ./mydata --output backup.br.heliodump --compression brotli
# LZ4 (fastest, largest size — good for hot snapshots)
heliosdb-nano dump --data-dir ./mydata --output backup.lz4.heliodump --compression lz4
# None (uncompressed, for CRC verification testing)
heliosdb-nano dump --data-dir ./mydata --output backup.raw.heliodump --compression none

Add --verbose to see the report (table count, total rows, compression ratio, CRC32 checksum).
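If you drive dumps from a script rather than a shell, a thin subprocess wrapper keeps the exit-code check honest. A minimal Python sketch — the flags are the ones documented above, the paths are placeholders:

import subprocess
import sys

def run_dump(data_dir: str, output: str, compression: str = "zstd") -> None:
    """Run a full dump and fail loudly if the CLI reports an error."""
    result = subprocess.run(
        [
            "heliosdb-nano", "dump",
            "--data-dir", data_dir,
            "--output", output,
            "--compression", compression,
            "--verbose",  # prints table count, rows, ratio, CRC32
        ],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        sys.exit(f"dump failed: {result.stderr.strip()}")
    print(result.stdout)

run_dump("./mydata", "backup.heliodump")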


2. Incremental Dumps

Append only the rows changed since the last dump:

Terminal window
# Initial full dump
heliosdb-nano dump --data-dir ./mydata --output rolling.heliodump
# Hours later — append the delta
heliosdb-nano dump --data-dir ./mydata --output rolling.heliodump --append

The dump file’s header gets a new metadata block per append; the restore path replays them in order. Useful for hourly snapshots without paying the full-dump cost every time.
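In a scheduler, the usual rule is “full dump if the rolling file doesn’t exist yet, append otherwise”. A sketch of that decision in Python, assuming the same CLI flags shown above:

import os
import subprocess

def rolling_dump(data_dir: str, output: str = "rolling.heliodump") -> None:
    """Full dump on first run, incremental append afterwards."""
    cmd = ["heliosdb-nano", "dump", "--data-dir", data_dir, "--output", output]
    if os.path.exists(output):
        cmd.append("--append")  # delta since the last dump
    subprocess.run(cmd, check=True)

rolling_dump("./mydata")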


3. Restore

Restores create a fresh data directory; they never overwrite an existing one in place:

Terminal window
heliosdb-nano restore \
--input backup.heliodump \
--target ./restored-data \
--verify

--verify runs the CRC32 check before applying any rows. If the dump file is corrupt, restore aborts before touching the target.

Then point a fresh server at the restored directory:

Terminal window
heliosdb-nano start --data-dir ./restored-data

4. Dump-On-Shutdown (In-Memory Mode)

If you’re running with --memory (see IN_MEMORY_MODE_QUICKREF), pair it with --dump-on-shutdown to persist on graceful exit:

Terminal window
heliosdb-nano start \
--memory \
--dump-on-shutdown
# On Ctrl-C, writes ./shutdown_dump.heliodump

This is the canonical “ephemeral DB with periodic snapshots” pattern.
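To drive that pattern from a supervisor script, start the server and deliver SIGINT for a graceful exit so the shutdown dump gets written. A sketch using only the flags documented above; the sleep stands in for your actual workload window:

import signal
import subprocess
import time

# Hypothetical supervisor: run an ephemeral server, then trigger the
# graceful-exit dump by sending SIGINT (equivalent to Ctrl-C).
server = subprocess.Popen(
    ["heliosdb-nano", "start", "--memory", "--dump-on-shutdown"]
)
time.sleep(3600)  # stand-in for the real workload
server.send_signal(signal.SIGINT)
server.wait()     # on exit, ./shutdown_dump.heliodump is written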


5. Selective Per-Table / Per-Schema Export

The dump format covers the whole database; for selective slices, use the import/export layer at the SQL/REST surface. Six formats are supported (auto-detected by file extension):

Extension          Format             Use case
.csv               CSV                Excel / spreadsheet hand-off
.json              JSON array         API consumers, JS scripts
.jsonl / .ndjson   JSON Lines         Streaming pipelines
.parquet / .pq     Apache Parquet     Data lake, dbt, Spark
.arrow / .ipc      Apache Arrow IPC   Pandas, Polars
.sql               SQL DDL + INSERTs  Cross-database migration

Export a table

COPY products TO '/tmp/products.parquet';
COPY (SELECT * FROM orders WHERE created_at > now() - interval '7 days')
TO '/tmp/recent_orders.csv' WITH (FORMAT csv, HEADER true);

Import a file

COPY products FROM '/tmp/products.parquet';
COPY products FROM '/tmp/products.csv' WITH (FORMAT csv, HEADER true);

The ExportFormat::from_extension() helper picks the right codec when the extension is unambiguous; pass WITH (FORMAT …) when the extension is generic (e.g., .txt).
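If you route exports from your own code, it can help to mirror that mapping client-side. This is a hypothetical Python equivalent of the extension table above — not the Rust helper itself, and EXTENSION_TO_FORMAT is our own name:

from pathlib import Path

# Mirrors the extension table above.
EXTENSION_TO_FORMAT = {
    ".csv": "csv",
    ".json": "json",
    ".jsonl": "jsonl", ".ndjson": "jsonl",
    ".parquet": "parquet", ".pq": "parquet",
    ".arrow": "arrow", ".ipc": "arrow",
    ".sql": "sql",
}

def format_for(path: str) -> str | None:
    """Return the codec name, or None for generic extensions like .txt."""
    return EXTENSION_TO_FORMAT.get(Path(path).suffix.lower())

assert format_for("/tmp/products.parquet") == "parquet"
assert format_for("/tmp/notes.txt") is None  # pass WITH (FORMAT ...) instead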

Export a schema-only dump

COPY (SELECT 'CREATE TABLE ' || tablename || ' (...)'
      FROM pg_tables WHERE schemaname = 'public')
TO '/tmp/schema.sql';

6. SQLite → HeliosDB Migration

A bundled Python tool under tools/HELIOSDB_SQLITE_CONVERTER converts any SQLite file into a Nano data directory:

Terminal window
python3 tools/HELIOSDB_SQLITE_CONVERTER.py \
--sqlite ./legacy_app.sqlite \
--output ./heliosdb-data \
--verbose

Or transparently from a Python application — the converter intercepts the connect call and converts on first use:

from HELIOSDB_SQLITE_CONVERTER import TransparentConverter

success, conn, messages = TransparentConverter.connect_with_auto_conversion(
    sqlite_path="./legacy_app.sqlite"
)
# conn is a psycopg2-style HeliosDB connection

Existing SQLAlchemy / psycopg2 / Django code keeps working; only the connection setup changes. A verification step (SQLiteDetector.validate_sqlite_database()) catches corrupt source files before conversion.
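A defensive wrapper might run that validation before converting. A sketch only — the path argument and boolean return of validate_sqlite_database() are assumptions here; check tools/HELIOSDB_SQLITE_CONVERTER.py for the exact signature:

# Assumption: validate_sqlite_database(path) returns a truthy value for a
# healthy file; verify against the converter's source before relying on it.
from HELIOSDB_SQLITE_CONVERTER import SQLiteDetector, TransparentConverter

def safe_connect(sqlite_path: str):
    if not SQLiteDetector.validate_sqlite_database(sqlite_path):
        raise ValueError(f"{sqlite_path} looks corrupt; aborting conversion")
    success, conn, messages = TransparentConverter.connect_with_auto_conversion(
        sqlite_path=sqlite_path
    )
    return conn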


7. Dump File Format

For reference / compliance audits — the dump file layout is documented in src/storage/dump/format.rs:

┌─────────────────────────────────────┐
│ Magic number (4B): 0xHELIOSDB       │
│ Version (4B): DUMP_VERSION          │
│ Metadata header: DumpMetadata       │ ← created_at, dump_type, table_count,
│                                     │   total_rows, compressed_size, CRC32
├─────────────────────────────────────┤
│ Compressed payload (zstd/gzip/...)  │
│   Schema records                    │
│   Index metadata                    │
│   Tuple stream                      │
└─────────────────────────────────────┘

CRC32 covers the uncompressed payload; compression is applied to the payload only, not the header.
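As a concept sketch only: with --compression none (documented above as being for CRC verification testing) you can recompute the payload checksum yourself. The 8-byte magic+version prefix follows the layout above, but HEADER_LEN — the serialized DumpMetadata size — is a placeholder; the real offsets live in src/storage/dump/format.rs:

import zlib

HEADER_LEN = 64  # placeholder; the real DumpMetadata size is in format.rs

def payload_crc32(dump_path: str) -> int:
    """CRC32 of the payload of a --compression none dump."""
    with open(dump_path, "rb") as f:
        f.seek(8 + HEADER_LEN)  # skip magic (4B) + version (4B) + metadata
        return zlib.crc32(f.read()) & 0xFFFFFFFF

print(hex(payload_crc32("backup.raw.heliodump")))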


8. Operational Patterns

Nightly snapshot + restore drill

Terminal window
# 02:00 — snapshot
heliosdb-nano dump \
--data-dir /var/lib/heliosdb/prod \
--output /backups/$(date +%F).heliodump \
--verbose
# 02:30 — verify by restoring to a scratch dir
heliosdb-nano restore \
--input /backups/$(date +%F).heliodump \
--target /tmp/restore-drill \
--verify
rm -rf /tmp/restore-drill

Continuous incremental + weekly full

Terminal window
# Hourly cron
heliosdb-nano dump --data-dir /var/lib/heliosdb/prod \
--output /backups/rolling.heliodump --append
# Sunday 03:00 — fresh full
heliosdb-nano dump --data-dir /var/lib/heliosdb/prod \
--output /backups/full-$(date +%Y-W%V).heliodump
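A small retention helper pairs naturally with the weekly fulls. This sketch is our own addition, not part of the CLI; it assumes the full-*.heliodump naming from the cron job above:

import time
from pathlib import Path

RETAIN_DAYS = 28  # keep roughly four weekly fulls; adjust to your policy

def prune_old_fulls(backup_dir: str = "/backups") -> None:
    """Delete weekly full dumps older than the retention horizon."""
    cutoff = time.time() - RETAIN_DAYS * 86400
    for dump in Path(backup_dir).glob("full-*.heliodump"):
        if dump.stat().st_mtime < cutoff:
            dump.unlink()
            print(f"pruned {dump}")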

Troubleshooting

Symptom                                                      Cause                                       Fix
Cannot dump from in-memory database without data directory   Tried dump --memory                         Run start --memory --dump-on-shutdown instead
Restore fails with CRC32 mismatch                            Corrupted dump file                         Always pass --verify; re-run dump
Either --data-dir or --connection must be specified          Forgot to pass --data-dir                   Add it (server-mode --connection is reserved for future use)
Parquet import shows string for INT columns                  Source schema mismatch                      Re-export with explicit casts, or pre-create the target table
SQLite converter reports database is locked                  Source .sqlite is open in another process   Close the other process or copy the file first

Where Next