HeliosDB Nano - Repository Splitting Strategy

This document outlines the strategy for maintaining heliosdb-nano as both a standalone open-source project and as part of the HeliosDB monorepo.

Objectives

  1. Standalone Distribution: heliosdb-nano should be easily accessible as a separate project
  2. Maintain Together: Development should happen in the main HeliosDB monorepo
  3. Easy Sync: Changes should flow easily between standalone and monorepo
  4. Zero Friction: Contributors should not need special knowledge of the setup

Current Structure

HeliosDB/ (Main Monorepo)
├── heliosdb-nano/ (Lightweight version)
│ ├── src/
│ ├── tests/
│ ├── examples/
│ ├── Cargo.toml (Workspace member)
│ ├── README.md
│ └── docs/
├── heliosdb-storage/ (Enterprise crate)
├── heliosdb-compute/ (Enterprise crate)
├── Cargo.toml (Workspace root)
└── ...

Why Git Subtree?

Advantages:

  • Maintains full commit history
  • No special commands needed for normal development
  • Easy to push changes to standalone repo
  • Can pull changes from standalone repo back to monorepo
  • Contributors work in monorepo normally

⚠️ Disadvantages:

  • Requires manual sync (push/pull operations)
  • Slightly more complex maintenance

Alternative Considered: Git Submodule

Why Not Submodule:

  • Requires all contributors to understand submodules
  • More complex workflow (init, update, etc.)
  • Easy to make mistakes
  • Harder to make atomic changes across repos

Implementation Plan

Phase 1: Create Standalone Repository

Step 1: Extract heliosdb-nano to Standalone Repo

# From HeliosDB root directory
# Create a new branch for extraction
git checkout -b heliosdb-nano-standalone
# Extract heliosdb-nano directory with full history
git subtree split --prefix=heliosdb-nano --branch heliosdb-nano-only
# Create the standalone repository with main as its initial branch (git 2.28+),
# so that the final push to main has a branch to push
mkdir ../heliosdb-nano-standalone
cd ../heliosdb-nano-standalone
git init -b main
git pull ../HeliosDB heliosdb-nano-only
# Push to GitHub
git remote add origin git@github.com:heliosdb/heliosdb-nano.git
git push -u origin main
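
Before publishing the extracted branch, it can help to confirm that the split captured exactly what is in the directory today. The sketch below compares the tree of the split branch with the tree of the prefix at HEAD. The function name is illustrative, and it assumes you run it from the monorepo root right after the split.

```shell
# Hypothetical helper: returns 0 when the root tree of <split-branch> is
# identical to the <prefix> directory at HEAD, 1 when they differ, and
# 2 when either object cannot be resolved.
split_matches_prefix() {
    local prefix="$1" split_branch="$2"
    local want got
    # Tree object for the subdirectory as it exists at HEAD
    want=$(git rev-parse --verify --quiet "HEAD:${prefix}") || return 2
    # Root tree of the tip commit on the split branch
    got=$(git rev-parse --verify --quiet "${split_branch}^{tree}") || return 2
    [ "$want" = "$got" ]
}

# Example (branch name taken from the step above):
#   split_matches_prefix heliosdb-nano heliosdb-nano-only && echo "split looks good"
```

If the two hashes differ, the split branch is probably stale (new commits landed after the split) and should be regenerated before pushing.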

Step 2: Make heliosdb-nano Standalone-Compatible

Update heliosdb-nano/Cargo.toml so that it builds both as a workspace member and on its own in the standalone repo:

[package]
name = "heliosdb-nano"
version = "0.1.0"
edition = "2021"
authors = ["HeliosDB Team"]
license = "Apache-2.0"
description = "PostgreSQL-compatible embedded database"
repository = "https://github.com/dimensigon/HDB-HeliosDB-Nano"
keywords = ["database", "postgresql", "embedded", "sql"]
[lib]
name = "heliosdb_nano"
path = "src/lib.rs"
[[bin]]
name = "heliosdb-nano"
path = "src/main.rs"
[dependencies]
# All dependencies are explicit (no workspace dependencies)
rocksdb = { version = "0.22", default-features = false, features = ["snappy"] }
sqlparser = { version = "0.53", features = ["visitor"] }
arrow = { version = "53", features = ["prettyprint"] }
# ... (all dependencies listed explicitly)
[dev-dependencies]
criterion = "0.5"
proptest = "1.5"
[features]
default = ["encryption", "vector-search"]
encryption = []
vector-search = []
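# Cargo honors profiles only in the workspace root manifest, so this section
# takes effect in the standalone repo and is ignored (with a warning) in the monorepo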
[profile.release]
lto = true
codegen-units = 1
opt-level = 3
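
Because the standalone repository has no workspace root, any field that inherits from the workspace (for example `serde = { workspace = true }` or `version.workspace = true`) would fail to resolve there. A minimal check for such lines is sketched below. The function name is illustrative and the manifest path in the example is an assumption.

```shell
# Hypothetical helper: print manifest lines that inherit from a workspace.
# Returns 0 if any such line is found, 1 if the manifest is self-contained.
find_workspace_inheritance() {
    local manifest="$1"
    # Match "workspace = true" as a whole key, e.g. after "{ " or after "."
    grep -nE '(^|[^[:alnum:]_-])workspace[[:space:]]*=[[:space:]]*true' "$manifest"
}

# Example, run from the monorepo root:
#   if find_workspace_inheritance heliosdb-nano/Cargo.toml; then
#       echo "manifest still depends on the workspace root" >&2
#   fi
```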

Step 3: Update Main HeliosDB Cargo.toml

[workspace]
members = [
"heliosdb-nano",
"heliosdb-storage",
"heliosdb-compute",
# ...
]
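# Optional: a workspace-only (virtual) manifest cannot have a [dependencies] table.
# Declaring the local path here lets other members use: heliosdb-nano = { workspace = true }
[workspace.dependencies]
heliosdb-nano = { path = "heliosdb-nano" }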

Phase 2: Ongoing Maintenance

Pushing Changes from Monorepo to Standalone

When you make changes in heliosdb-nano/ within the monorepo:

# From HeliosDB root directory
# Commit changes normally in monorepo
git add heliosdb-nano/
git commit -m "feat: add new feature to heliosdb-nano"
# Push to main monorepo
git push origin main
# Push the subtree to standalone repo
git subtree push --prefix=heliosdb-nano \
git@github.com:heliosdb/heliosdb-nano.git main
# Or set up a remote once:
git remote add heliosdb-nano-standalone \
git@github.com:heliosdb/heliosdb-nano.git
# Then subsequent pushes are easier:
git subtree push --prefix=heliosdb-nano heliosdb-nano-standalone main
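
A subtree push is only worth running when something under heliosdb-nano/ actually changed. The sketch below checks that for a pair of commits. The function name and the refs in the example are assumptions chosen for illustration.

```shell
# Hypothetical helper: returns 0 if anything under <prefix> differs between
# <base> and <head>, 1 if nothing changed, 2 if git itself failed.
prefix_changed() {
    local prefix="$1" base="$2" head="$3" rc=0
    # --quiet makes git diff report via exit status: 0 = same, 1 = different
    git diff --quiet "$base" "$head" -- "$prefix" || rc=$?
    case "$rc" in
        0) return 1 ;;   # no differences under the prefix
        1) return 0 ;;   # differences found
        *) return 2 ;;   # bad revision, not a repository, ...
    esac
}

# Example: only sync when the last commit touched the subtree
#   if prefix_changed heliosdb-nano HEAD~1 HEAD; then
#       git subtree push --prefix=heliosdb-nano heliosdb-nano-standalone main
#   fi
```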

Pulling Changes from Standalone to Monorepo

If someone contributes directly to the standalone repo:

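# From HeliosDB root directory
# Pull changes from standalone repo (git subtree creates the merge commit itself)
# Note: --squash relies on subtree metadata already present in the monorepo history,
# e.g. from `git subtree add` or `git subtree split --rejoin`. If git reports that
# the prefix was never added, record that metadata first.
git subtree pull --prefix=heliosdb-nano \
git@github.com:heliosdb/heliosdb-nano.git main \
--squash
# Only if the pull stops on conflicts: resolve them, then
git add .
git commit -m "sync: pull changes from heliosdb-nano standalone"
# Push to main monorepo
git push origin main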
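
`git subtree pull` expects a clean working tree, so a small guard before the pull avoids a confusing abort halfway through a sync. The following is a sketch and the function name is illustrative.

```shell
# Hypothetical helper: returns 0 when tracked files have no unstaged
# and no staged changes, non-zero otherwise.
worktree_is_clean() {
    # First check: working tree vs index. Second check: index vs HEAD.
    git diff --quiet && git diff --cached --quiet
}

# Example:
#   worktree_is_clean || { echo "commit or stash your changes first" >&2; exit 1; }
```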

Phase 3: CI/CD Setup

Standalone Repository CI (.github/workflows/ci.yml)

name: CI
on:
  push:
    branches: [ main ]
    # Tag pushes must trigger the workflow, otherwise the publish job never runs
    tags: [ 'v*' ]
  pull_request:
    branches: [ main ]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Rust
        uses: dtolnay/rust-toolchain@stable
      - name: Build
        run: cargo build --release
      - name: Test
        run: cargo test --all
      - name: Run examples
        run: |
          cargo run --example quickstart
          cargo run --example encryption
  publish:
    runs-on: ubuntu-latest
    if: startsWith(github.ref, 'refs/tags/v')
    needs: test
    steps:
      - uses: actions/checkout@v4
      - name: Publish to crates.io
        # cargo reads the token from the environment, keeping it off the command line
        env:
          CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_TOKEN }}
        run: cargo publish

Monorepo CI (Test heliosdb-nano independently)

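name: CI
on: [push, pull_request]
jobs:
  test-heliosdb-nano:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Test heliosdb-nano
        working-directory: ./heliosdb-nano
        # Inside a workspace, --all would test every member, so run plain
        # cargo test to cover only the package in this directory
        run: |
          cargo build --release
          cargo test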

Development Workflow

For Core Team (Monorepo)

Normal workflow - no changes needed:

# Clone monorepo
git clone https://github.com/heliosdb/heliosdb.git
cd heliosdb
# Make changes to heliosdb-nano
# ... edit files under heliosdb-nano/ ...
# Commit and push normally
git add heliosdb-nano/
git commit -m "feat: add new feature"
git push origin main
# Sync to standalone (maintainer only; git subtree must run from the repository root)
git subtree push --prefix=heliosdb-nano heliosdb-nano-standalone main

For Open-Source Contributors (Standalone)

Simple workflow:

# Clone standalone repo
git clone https://github.com/dimensigon/HDB-HeliosDB-Nano.git
cd HDB-HeliosDB-Nano
# Create a branch for your change
git checkout -b feature-branch
# Make changes
# ... edit files ...
# Test
cargo test
# Commit and push
git add .
git commit -m "fix: bug fix"
git push origin feature-branch
# Create PR on standalone repo

For Maintainers: Syncing Standalone PRs

When a PR is merged in standalone:

# In monorepo
git subtree pull --prefix=heliosdb-nano \
git@github.com:heliosdb/heliosdb-nano.git main \
--squash
git push origin main
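
To decide whether a sync is due, a maintainer can compare the standalone branch head with the commit recorded by the last squashed pull. The sketch below assumes the `git-subtree-dir:` and `git-subtree-split:` lines that git subtree writes into its squash commit messages. Both function names are illustrative.

```shell
# Hypothetical helper: print the standalone commit recorded by the most
# recent subtree squash for <prefix>, or nothing if none is found.
last_synced_commit() {
    local prefix="$1"
    git log -1 --format=%b --grep="^git-subtree-dir: ${prefix}\$" \
        | sed -n 's/^git-subtree-split: //p'
}

# Hypothetical helper: print the commit that <branch> points to in <repo>.
standalone_head() {
    local repo="$1" branch="$2"
    git ls-remote "$repo" "refs/heads/${branch}" | cut -f1
}

# Example (URL as used elsewhere in this document):
#   remote=$(standalone_head git@github.com:heliosdb/heliosdb-nano.git main)
#   [ "$remote" = "$(last_synced_commit heliosdb-nano)" ] || echo "sync needed"
```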

Release Process

Version Management

Both repos should have the same version number:

# heliosdb-nano/Cargo.toml (same in both repos)
[package]
version = "0.1.0"
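
One way to catch version drift is to compare the two manifests directly. The sketch below assumes a plain `version = "x.y.z"` line inside `[package]` (no workspace inheritance). The function names and the paths in the example are illustrative.

```shell
# Hypothetical helper: print the version from the [package] table of a
# Cargo manifest, ignoring version keys in any other table.
package_version() {
    awk '
        /^\[/ { in_package = ($0 == "[package]") }
        in_package && /^version[[:space:]]*=/ {
            sub(/^[^"]*"/, "")   # drop everything up to the opening quote
            sub(/".*$/, "")      # drop the closing quote and anything after it
            print
            exit
        }
    ' "$1"
}

# Returns 0 only when both manifests declare the same, non-empty version.
versions_match() {
    local a b
    a=$(package_version "$1")
    b=$(package_version "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example, with both checkouts side by side:
#   versions_match HeliosDB/heliosdb-nano/Cargo.toml heliosdb-nano-standalone/Cargo.toml \
#       || echo "versions differ" >&2
```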

Publishing to crates.io

Option 1: Publish from Standalone

cd heliosdb-nano-standalone
git tag v0.1.0
git push origin v0.1.0
# CI will publish to crates.io

Option 2: Manual Publish

cd heliosdb-nano-standalone
cargo publish

Then sync the tag back to monorepo:

cd heliosdb
git subtree pull --prefix=heliosdb-nano \
git@github.com:heliosdb/heliosdb-nano.git v0.1.0
git tag heliosdb-nano-v0.1.0
git push origin heliosdb-nano-v0.1.0

Documentation Strategy

Standalone Repo (heliosdb-nano)

Should have:

  • ✅ Comprehensive README
  • ✅ Full documentation in docs/
  • ✅ Working examples in examples/
  • ✅ MIGRATION.md
  • ✅ CONTRIBUTING.md
  • ✅ LICENSE

Monorepo

Should have:

  • Reference to heliosdb-nano as a project
  • Link to standalone repo
  • High-level overview

Migration Checklist

  • Create standalone repository on GitHub
  • Extract heliosdb-nano with git subtree split
  • Update Cargo.toml for standalone compatibility
  • Setup CI/CD for standalone repo
  • Update README with standalone instructions
  • Setup crates.io publishing
  • Create CONTRIBUTING.md for standalone
  • Add subtree remote to monorepo
  • Document sync process for maintainers
  • Test full workflow (PR → merge → sync)

Testing the Strategy

Before going live, test the workflow:

  1. Create test standalone repo
  2. Practice subtree push/pull
  3. Make test PR to standalone
  4. Sync back to monorepo
  5. Verify builds work in both places
  6. Verify examples work in both places
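
The first two steps can be rehearsed entirely on a local machine before any GitHub repository exists. The sketch below builds a throwaway monorepo and a bare "standalone" repository in a temporary directory, runs the same subtree push used for the real sync, and lists what arrived. It assumes the `git subtree` command is installed. The function name, directory names, and file contents are made up for the rehearsal.

```shell
# Hypothetical rehearsal: push a subtree to a local bare repository and
# print the top-level entries of the resulting standalone branch.
rehearse_subtree_push() {
    local work mono standalone
    work=$(mktemp -d) || return 1
    mono="$work/monorepo"
    standalone="$work/standalone.git"

    git init -q --bare "$standalone" || return 1
    git init -q "$mono" || return 1
    (
        cd "$mono" || exit 1
        git config user.name "Rehearsal"
        git config user.email "rehearsal@example.invalid"
        mkdir heliosdb-nano heliosdb-storage
        echo "nano" > heliosdb-nano/README.md
        echo "storage" > heliosdb-storage/README.md
        git add . || exit 1
        git commit -q -m "initial layout" || exit 1
        # Same command as the real sync, pointed at the local bare repo
        git subtree push --prefix=heliosdb-nano "$standalone" main >/dev/null
    ) || return 1

    # The standalone branch should contain only the subtree's files
    git --git-dir="$standalone" ls-tree --name-only main
}

# Example:
#   rehearse_subtree_push
```

If the listing shows the contents of heliosdb-nano/ at the top level and nothing from the other crates, the push direction behaves as intended. The pull direction can be rehearsed the same way by committing to a clone of the bare repository.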

FAQ

Q: Do I need to do anything special when working in the monorepo? A: No, work normally. Subtree sync is done by maintainers.

Q: Can I contribute directly to the standalone repo? A: Yes! PRs are welcome on the standalone repo.

Q: How often should we sync? A: After every significant feature or bug fix.

Q: What if there are conflicts during sync? A: Resolve them like normal git conflicts, favoring the newer changes.

Q: Can we have different versions in monorepo vs standalone? A: Not recommended. Keep versions synchronized.

Automation Ideas

Consider creating scripts:

scripts/sync-heliosdb-nano.sh:

#!/bin/bash
# Push heliosdb-nano changes to standalone repo
set -e
# git subtree must run from the top level of the working tree
cd "$(git rev-parse --show-toplevel)"
echo "Syncing heliosdb-nano to standalone repository..."
git subtree push --prefix=heliosdb-nano \
git@github.com:heliosdb/heliosdb-nano.git main
echo "Sync complete!"

scripts/pull-heliosdb-nano.sh:

#!/bin/bash
# Pull heliosdb-nano changes from standalone repo
set -e
# git subtree must run from the top level of the working tree
cd "$(git rev-parse --show-toplevel)"
echo "Pulling heliosdb-nano from standalone repository..."
git subtree pull --prefix=heliosdb-nano \
git@github.com:heliosdb/heliosdb-nano.git main \
--squash
echo "Pull complete!"

Conclusion

The git subtree approach provides:

  • ✅ Easy development in monorepo
  • ✅ Clean standalone repository
  • ✅ Full commit history
  • ✅ Straightforward sync process
  • ✅ Low friction for contributors

This strategy allows heliosdb-nano to be both:

  1. A well-integrated part of the HeliosDB ecosystem
  2. A standalone open-source project accessible to the community

Next Steps:

  1. Create standalone repository
  2. Setup CI/CD
  3. Test sync workflow
  4. Document for team
  5. Announce to community