HeliosDB Nano - Repository Splitting Strategy
This document outlines the strategy for maintaining heliosdb-nano as both a standalone open-source project and as part of the HeliosDB monorepo.
Objectives
- Standalone Distribution: heliosdb-nano should be easily accessible as a separate project
- Maintain Together: Development should happen in the main HeliosDB monorepo
- Easy Sync: Changes should flow easily between standalone and monorepo
- Zero Friction: Contributors should not need special knowledge of the setup
Current Structure
    HeliosDB/ (Main Monorepo)
    ├── heliosdb-nano/ (Lightweight version)
    │   ├── src/
    │   ├── tests/
    │   ├── examples/
    │   ├── Cargo.toml (Workspace member)
    │   ├── README.md
    │   └── docs/
    ├── heliosdb-storage/ (Enterprise crate)
    ├── heliosdb-compute/ (Enterprise crate)
    ├── Cargo.toml (Workspace root)
    └── ...

Recommended Approach: Git Subtree
Why Git Subtree?
✅ Advantages:
- Maintains full commit history
- No special commands needed for normal development
- Easy to push changes to standalone repo
- Can pull changes from standalone repo back to monorepo
- Contributors work in monorepo normally
⚠️ Disadvantages:
- Requires manual sync (push/pull operations)
- Slightly more complex maintenance
Alternative Considered: Git Submodule
❌ Why Not Submodule:
- Requires all contributors to understand submodules
- More complex workflow (init, update, etc.)
- Easy to make mistakes
- Harder to make atomic changes across repos
Implementation Plan
Phase 1: Create Standalone Repository
Step 1: Extract heliosdb-nano to Standalone Repo
    # From the HeliosDB root directory

    # Create a new branch for the extraction
    git checkout -b heliosdb-nano-standalone

    # Extract the heliosdb-nano directory with its full history
    git subtree split --prefix=heliosdb-nano --branch heliosdb-nano-only

    # Create the standalone repository with "main" as its initial branch
    # (git init -b requires Git 2.28 or newer)
    mkdir ../heliosdb-nano-standalone
    cd ../heliosdb-nano-standalone
    git init -b main
    git pull ../HeliosDB heliosdb-nano-only

    # Push to GitHub
    git remote add origin git@github.com:heliosdb/heliosdb-nano.git
    git push -u origin main

Step 2: Make heliosdb-nano Standalone-Compatible
Update heliosdb-nano/Cargo.toml to work both in workspace and standalone:
    [package]
    name = "heliosdb-nano"
    version = "0.1.0"
    edition = "2021"
    authors = ["HeliosDB Team"]
    license = "Apache-2.0"
    description = "PostgreSQL-compatible embedded database"
    repository = "https://github.com/dimensigon/HDB-HeliosDB-Nano"
    keywords = ["database", "postgresql", "embedded", "sql"]

    [lib]
    name = "heliosdb_nano"
    path = "src/lib.rs"

    [[bin]]
    name = "heliosdb-nano"
    path = "src/main.rs"

    [dependencies]
    # All dependencies are explicit (no workspace dependencies)
    rocksdb = { version = "0.22", default-features = false, features = ["snappy"] }
    sqlparser = { version = "0.53", features = ["visitor"] }
    arrow = { version = "53", features = ["prettyprint"] }
    # ... (all dependencies listed explicitly)

    [dev-dependencies]
    criterion = "0.5"
    proptest = "1.5"

    [features]
    default = ["encryption", "vector-search"]
    encryption = []
    vector-search = []

    # This section is only used when NOT in a workspace
    [profile.release]
    lto = true
    codegen-units = 1
    opt-level = 3

Step 3: Update Main HeliosDB Cargo.toml
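    [workspace]
    members = [
        "heliosdb-nano",
        "heliosdb-storage",
        "heliosdb-compute",
        # ...
    ]

    # Optional: let other workspace members reference heliosdb-nano by local path.
    # A workspace root without a [package] table cannot carry a plain
    # [dependencies] table, so the entry goes under [workspace.dependencies].
    [workspace.dependencies]
    heliosdb-nano = { path = "heliosdb-nano" }

Phase 2: Ongoing Maintenance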
Pushing Changes from Monorepo to Standalone
When you make changes in heliosdb-nano/ within the monorepo:
    # From the HeliosDB root directory

    # Commit changes normally in the monorepo
    git add heliosdb-nano/
    git commit -m "feat: add new feature to heliosdb-nano"

    # Push to the main monorepo
    git push origin main

    # Push the subtree to the standalone repo
    git subtree push --prefix=heliosdb-nano \
      git@github.com:heliosdb/heliosdb-nano.git main

    # Or set up a remote once:
    git remote add heliosdb-nano-standalone \
      git@github.com:heliosdb/heliosdb-nano.git

    # Then subsequent pushes are easier:
    git subtree push --prefix=heliosdb-nano heliosdb-nano-standalone main

Pulling Changes from Standalone to Monorepo
If someone contributes directly to the standalone repo:
    # From the HeliosDB root directory

    # Pull changes from the standalone repo
    git subtree pull --prefix=heliosdb-nano \
      git@github.com:heliosdb/heliosdb-nano.git main \
      --squash

    # Resolve any conflicts
    git add .
    git commit -m "sync: pull changes from heliosdb-nano standalone"

    # Push to the main monorepo
    git push origin main

Phase 3: CI/CD Setup
Standalone Repository CI (.github/workflows/ci.yml)
    name: CI

    on:
      push:
        branches: [ main ]
        tags: [ 'v*' ]   # without this, tag pushes never trigger the publish job
      pull_request:
        branches: [ main ]

    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3

          - name: Install Rust
            uses: actions-rs/toolchain@v1
            with:
              toolchain: stable

          - name: Build
            run: cargo build --release

          - name: Test
            run: cargo test --all

          - name: Run examples
            run: |
              cargo run --example quickstart
              cargo run --example encryption

      publish:
        runs-on: ubuntu-latest
        if: startsWith(github.ref, 'refs/tags/v')
        needs: test
        steps:
          - uses: actions/checkout@v3

          - name: Publish to crates.io
            run: |
              cargo login ${{ secrets.CARGO_TOKEN }}
              cargo publish

Monorepo CI (Test heliosdb-nano independently)
    name: CI

    on: [push, pull_request]

    jobs:
      test-heliosdb-nano:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3

          - name: Test heliosdb-nano
            working-directory: ./heliosdb-nano
            run: |
              cargo build --release
              cargo test --all

Development Workflow
For Core Team (Monorepo)
Normal workflow - no changes needed:
    # Clone the monorepo
    git clone https://github.com/heliosdb/heliosdb.git
    cd heliosdb

    # Make changes to heliosdb-nano
    cd heliosdb-nano
    # ... edit files ...

    # Commit and push normally
    git add .
    git commit -m "feat: add new feature"
    git push origin main

    # Sync to standalone (maintainer only).
    # git subtree has to be run from the top level of the working tree.
    cd ..
    git subtree push --prefix=heliosdb-nano heliosdb-nano-standalone main

For Open-Source Contributors (Standalone)
Simple workflow:
    # Clone the standalone repo (git names the directory after the repository)
    git clone https://github.com/dimensigon/HDB-HeliosDB-Nano.git
    cd HDB-HeliosDB-Nano

    # Work on a feature branch
    git checkout -b feature-branch

    # Make changes
    # ... edit files ...

    # Test
    cargo test

    # Commit and push
    git add .
    git commit -m "fix: bug fix"
    git push origin feature-branch

    # Create a PR on the standalone repo

For Maintainers: Syncing Standalone PRs
When a PR is merged in standalone:
    # In the monorepo
    git subtree pull --prefix=heliosdb-nano \
      git@github.com:heliosdb/heliosdb-nano.git main \
      --squash
    git push origin main

Release Process
Version Management
Both repos should have the same version number:
    # heliosdb-nano/Cargo.toml (same in both repos)
    [package]
    version = "0.1.0"

Publishing to crates.io
Option 1: Publish from Standalone

    cd heliosdb-nano-standalone
    git tag v0.1.0
    git push origin v0.1.0
    # CI will publish to crates.io

Option 2: Manual Publish

    cd heliosdb-nano-standalone
    cargo publish

Then sync the tag back to the monorepo:

    cd heliosdb
    git subtree pull --prefix=heliosdb-nano \
      git@github.com:heliosdb/heliosdb-nano.git v0.1.0
    git tag heliosdb-nano-v0.1.0
    git push origin heliosdb-nano-v0.1.0

Documentation Strategy
Standalone Repo (heliosdb-nano)
Should have:
- ✅ Comprehensive README
- ✅ Full documentation in docs/
- ✅ Working examples in examples/
- ✅ MIGRATION.md
- ✅ CONTRIBUTING.md
- ✅ LICENSE
Monorepo
Should have:
- Reference to heliosdb-nano as a project
- Link to standalone repo
- High-level overview
Migration Checklist
- Create standalone repository on GitHub
- Extract heliosdb-nano with git subtree split
- Update Cargo.toml for standalone compatibility
- Setup CI/CD for standalone repo
- Update README with standalone instructions
- Setup crates.io publishing
- Create CONTRIBUTING.md for standalone
- Add subtree remote to monorepo
- Document sync process for maintainers
- Test full workflow (PR → merge → sync)
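The checklist item on updating Cargo.toml for standalone compatibility can be spot-checked mechanically. The sketch below assumes the only thing to look for is Cargo's workspace inheritance syntax (`key.workspace = true` or `dep = { workspace = true }`), which resolves only inside a workspace. The function names `workspace_inherited_keys` and `is_standalone_ready` are hypothetical helpers, not part of any existing script in the repository.

```bash
#!/bin/sh
# Hypothetical helper: print (with line numbers) every non-comment line of a
# Cargo.toml that inherits a value from a workspace. Such lines would fail to
# resolve when the crate is built from the standalone repository.
workspace_inherited_keys() {
  grep -nE 'workspace[[:space:]]*=[[:space:]]*true' "$1" \
    | grep -vE '^[0-9]+:[[:space:]]*#'
}

# Hypothetical helper: succeed only when no inherited keys were found.
is_standalone_ready() {
  if workspace_inherited_keys "$1" >/dev/null; then
    return 1
  fi
  return 0
}
```

A maintainer could call `is_standalone_ready heliosdb-nano/Cargo.toml` before a sync and, on failure, run `workspace_inherited_keys` on the same file to see which lines need explicit values.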
Testing the Strategy
Before going live, test the workflow:
- Create test standalone repo
- Practice subtree push/pull
- Make test PR to standalone
- Sync back to monorepo
- Verify builds work in both places
- Verify examples work in both places
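For practicing subtree push and pull, a dry-run wrapper lets maintainers see exactly which command would run before touching a real remote. This is a minimal sketch: the prefix, remote name, and branch are taken from the commands in this document, while the function names and the `DRY_RUN` variable are assumptions introduced for illustration.

```bash
#!/bin/sh
# Values copied from the sync commands in this document.
PREFIX="heliosdb-nano"
REMOTE="heliosdb-nano-standalone"
BRANCH="main"

# Print the git subtree command for the given direction ("push" or "pull").
subtree_sync_cmd() {
  case "$1" in
    push) echo "git subtree push --prefix=$PREFIX $REMOTE $BRANCH" ;;
    pull) echo "git subtree pull --prefix=$PREFIX $REMOTE $BRANCH --squash" ;;
    *)    echo "usage: subtree_sync_cmd push|pull" >&2; return 2 ;;
  esac
}

# Run the command, or only print it when DRY_RUN=1 (hypothetical variable).
subtree_sync() {
  cmd=$(subtree_sync_cmd "$1") || return $?
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "[dry-run] $cmd"
  else
    eval "$cmd"
  fi
}
```

Running `DRY_RUN=1 subtree_sync push` from the monorepo root only prints the command; dropping `DRY_RUN` executes it, which assumes the `heliosdb-nano-standalone` remote has already been added.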
FAQ
Q: Do I need to do anything special when working in the monorepo?
A: No, work normally. Subtree sync is done by maintainers.
Q: Can I contribute directly to the standalone repo?
A: Yes! PRs are welcome on the standalone repo.
Q: How often should we sync?
A: After every significant feature or bug fix.
Q: What if there are conflicts during sync?
A: Resolve them like normal git conflicts, favoring the newer changes.
Q: Can we have different versions in monorepo vs standalone?
A: Not recommended. Keep versions synchronized.
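The advice to keep versions synchronized could be enforced with a small check before tagging a release. The sketch below assumes both Cargo.toml files keep a literal `version = "..."` line in their `[package]` table, as in the example manifest in this document. The helper names `cargo_version`, `versions_match`, and `tag_matches_manifest` are hypothetical.

```bash
#!/bin/sh
# Hypothetical helper: print the version from the [package] table of a Cargo.toml.
cargo_version() {
  awk -F'"' '
    /^\[/ { in_pkg = ($0 ~ /^\[package\]/) }            # track the current table
    in_pkg && $1 ~ /^version[ \t]*=/ { print $2; exit } # text between the quotes
  ' "$1"
}

# Succeed when both manifests declare the same, non-empty version.
versions_match() {
  a=$(cargo_version "$1")
  b=$(cargo_version "$2")
  [ -n "$a" ] && [ "$a" = "$b" ]
}

# Succeed when a release tag such as v0.1.0 matches the manifest version.
tag_matches_manifest() {
  [ "$1" = "v$(cargo_version "$2")" ]
}
```

For example, `versions_match heliosdb-nano/Cargo.toml ../heliosdb-nano-standalone/Cargo.toml` could gate a sync, and `tag_matches_manifest v0.1.0 Cargo.toml` could run before `git tag`.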
Automation Ideas
Consider creating scripts:
scripts/sync-heliosdb-nano.sh:
    #!/bin/bash
    # Push heliosdb-nano changes to standalone repo

    set -e

    echo "Syncing heliosdb-nano to standalone repository..."

    git subtree push --prefix=heliosdb-nano \
      git@github.com:heliosdb/heliosdb-nano.git main

    echo "Sync complete!"

scripts/pull-heliosdb-nano.sh:

    #!/bin/bash
    # Pull heliosdb-nano changes from standalone repo

    set -e

    echo "Pulling heliosdb-nano from standalone repository..."

    git subtree pull --prefix=heliosdb-nano \
      git@github.com:heliosdb/heliosdb-nano.git main \
      --squash

    echo "Pull complete!"

Conclusion
The git subtree approach provides:
- ✅ Easy development in monorepo
- ✅ Clean standalone repository
- ✅ Full commit history
- ✅ Straightforward sync process
- ✅ Low friction for contributors
This strategy allows heliosdb-nano to be both:
- A well-integrated part of the HeliosDB ecosystem
- A standalone open-source project accessible to the community
Next Steps:
- Create standalone repository
- Setup CI/CD
- Test sync workflow
- Document for team
- Announce to community