HeliosDB Python Client

A complete PEP 249 (Python Database API Specification v2.0) compliant database driver for HeliosDB, providing a familiar interface for Python developers while delivering HeliosDB-specific optimizations.

Features

Core PEP 249 Compliance

  • Standard connect() factory function
  • Connection class with transaction management (commit(), rollback(), close())
  • Cursor class with execute(), fetchone(), fetchall(), fetchmany()
  • Complete exception hierarchy (Warning, Error, DatabaseError, etc.)
  • Type constructors (Binary, Date, Timestamp, etc.)
  • Parameterized queries with ? placeholders (qmark style)

HeliosDB-Specific Features

  • Shard-Aware Routing: Client-side key hashing routes queries directly to the correct shard
  • 🗺 Topology Caching: Local cache of cluster topology for low-latency routing decisions
  • 📡 Metadata Service Integration: Automatic topology refresh on cluster changes
  • 🔢 Vector Type Support: Native support for vector embeddings via numpy arrays
  • 🔗 PostgreSQL Wire Protocol: Compatible with PostgreSQL protocol (port 5432)
  • 🔄 Connection Pooling Ready: Thread-safe design for future connection pooling

Installation

# Install from source
cd python-client
pip install -e .
# Or with development dependencies
pip install -e ".[dev]"

Requirements

  • Python 3.8+
  • numpy >= 1.20.0 (for vector operations)

Quick Start

Basic Usage

import heliosdb
# Connect to HeliosDB cluster
conn = heliosdb.connect("heliosdb://demo:demo@localhost:5432/testdb")
# Create cursor and execute query
cur = conn.cursor()
cur.execute("SELECT * FROM users WHERE id = ?", (123,))
# Fetch results
rows = cur.fetchall()
for row in rows:
    print(row)
# Close connection
conn.close()

Using Connection String

The driver supports flexible connection strings:

# Full connection string
conn = heliosdb.connect("heliosdb://user:password@host:port/database")
# With query parameters
conn = heliosdb.connect(
    "heliosdb://user:password@localhost:5432/testdb?shard_aware=true&autocommit=false"
)
# Using keyword arguments (overrides connection string)
conn = heliosdb.connect(
    host="localhost",
    port=5432,
    database="testdb",
    user="demo",
    password="demo",
    metadata_host="localhost",
    metadata_port=50051
)
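
To make the connection-string format concrete, here is a minimal sketch of how such a URL could be split into keyword arguments with the standard library. The helper name parse_dsn and its handling of boolean options are assumptions for illustration, not the driver's actual parser.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical helper: not part of the heliosdb package.
BOOL_OPTIONS = {"shard_aware", "autocommit"}


def parse_dsn(dsn):
    """Split a heliosdb:// URL into a dict of connection parameters."""
    parts = urlsplit(dsn)
    if parts.scheme != "heliosdb":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    params = {
        "host": parts.hostname or "localhost",
        "port": parts.port or 5432,
        "database": parts.path.lstrip("/"),
        "user": parts.username,
        "password": parts.password,
    }
    # Query-string options become extra keyword arguments.
    for key, value in parse_qsl(parts.query):
        params[key] = (value.lower() == "true") if key in BOOL_OPTIONS else value
    return params


params = parse_dsn(
    "heliosdb://user:password@localhost:5432/testdb?shard_aware=true&autocommit=false"
)
print(params)
```

Keyword arguments passed to connect() could then be merged over this dict, which matches the documented rule that they override the connection string.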

Context Managers

The driver supports Python’s context manager protocol:

with heliosdb.connect("heliosdb://demo:demo@localhost:5432/testdb") as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT COUNT(*) FROM users")
        count = cur.fetchone()[0]
        print(f"Total users: {count}")
# Connection automatically commits and closes

API Reference

Connection Object

Methods

  • cursor() - Create a new cursor for executing queries
  • commit() - Commit the current transaction
  • rollback() - Rollback the current transaction
  • close() - Close the connection

Properties

  • closed - Boolean indicating if connection is closed
  • autocommit - Get/set autocommit mode

HeliosDB Extensions

  • get_topology() - Get current cluster topology
  • refresh_topology() - Force topology refresh
  • get_shard_for_key(sharding_key) - Get shard info for a key

Cursor Object

Methods

  • execute(operation, parameters=None) - Execute SQL query with optional parameters
  • executemany(operation, seq_of_parameters) - Execute query for multiple parameter sets
  • fetchone() - Fetch next row or None
  • fetchmany(size=arraysize) - Fetch multiple rows
  • fetchall() - Fetch all remaining rows
  • close() - Close the cursor

Properties

  • description - Column metadata (read-only)
  • rowcount - Number of rows affected (read-only)
  • arraysize - Default fetch size (default: 1)
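
Because arraysize defaults to 1, calling fetchmany() without a size returns one row at a time. The sketch below shows one way to stream a large result set in batches through any PEP 249 cursor. The helper iter_rows is hypothetical, and sqlite3 is used only as a stand-in driver (it also uses qmark placeholders) so the example runs without a cluster.

```python
import sqlite3  # stand-in PEP 249 driver; swap in heliosdb against a real cluster


def iter_rows(cursor, batch_size=None):
    """Yield rows one at a time while fetching them from the cursor in batches."""
    size = batch_size or cursor.arraysize
    while True:
        batch = cursor.fetchmany(size)
        if not batch:
            break
        yield from batch


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (id INTEGER, name TEXT)")
cur.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(i, f"user{i}") for i in range(10)],
)

cur.execute("SELECT id, name FROM users ORDER BY id")
# description is a sequence of 7-item tuples; the first item is the column name.
column_names = [col[0] for col in cur.description]
rows = list(iter_rows(cur, batch_size=4))
print(column_names, len(rows))
conn.close()
```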

Advanced Usage

Transactions

conn = heliosdb.connect("heliosdb://demo:demo@localhost:5432/testdb")
conn.autocommit = False # Disable autocommit
cur = conn.cursor()
try:
    cur.execute("INSERT INTO accounts (user_id, balance) VALUES (?, ?)", (1, 1000))
    cur.execute("INSERT INTO accounts (user_id, balance) VALUES (?, ?)", (2, 500))
    conn.commit()
except Exception as e:
    conn.rollback()
    print(f"Transaction failed: {e}")
finally:
    cur.close()
    conn.close()
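
The try/except/finally pattern can be wrapped once and reused. This is a minimal sketch of such a wrapper. The transaction helper is hypothetical, and sqlite3 stands in for heliosdb so the rollback can be demonstrated locally.

```python
import sqlite3  # stand-in PEP 249 driver for a runnable demonstration
from contextlib import contextmanager


@contextmanager
def transaction(conn):
    """Commit on success, roll back on any exception, always close the cursor."""
    cur = conn.cursor()
    try:
        yield cur
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (user_id INTEGER PRIMARY KEY, balance INTEGER)")
conn.commit()

with transaction(conn) as cur:
    cur.execute("INSERT INTO accounts (user_id, balance) VALUES (?, ?)", (1, 1000))

try:
    with transaction(conn) as cur:
        cur.execute("INSERT INTO accounts (user_id, balance) VALUES (?, ?)", (2, 500))
        # Duplicate primary key: raises IntegrityError and undoes the row above.
        cur.execute("INSERT INTO accounts (user_id, balance) VALUES (?, ?)", (1, 0))
except sqlite3.IntegrityError:
    pass

balances = conn.execute("SELECT user_id, balance FROM accounts").fetchall()
print(balances)
conn.close()
```

Only the first insert survives, because the second block is rolled back as a unit.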

Batch Operations

# Insert multiple rows efficiently
users = [
    ("Alice", "alice@example.com"),
    ("Bob", "bob@example.com"),
    ("Charlie", "charlie@example.com")
]
cur.executemany(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    users
)
print(f"Inserted {cur.rowcount} rows")

Vector Operations

from heliosdb import Vector
import numpy as np
# Create vector embedding
embedding = Vector(np.random.rand(1536)) # 1536-dimensional vector
# Insert document with vector
cur.execute(
    "INSERT INTO documents (title, embedding) VALUES (?, ?)",
    ("My Document", embedding)
)
# Vector similarity search
query_vector = Vector(np.random.rand(1536))
cur.execute("""
    SELECT title, content
    FROM documents
    WHERE category = ?
    ORDER BY embedding <-> ?
    LIMIT 10
""", ("technology", query_vector))
results = cur.fetchall()
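
This README does not state which metric the <-> operator uses. Assuming it means Euclidean (L2) distance, as it does in PostgreSQL's pgvector extension, the ORDER BY ... LIMIT clause is equivalent to the following pure-Python ranking. The document titles and vectors are made up for illustration.

```python
import math

# Toy 3-dimensional embeddings; real embeddings would have many more dimensions.
documents = {
    "intro": [0.0, 0.0, 1.0],
    "guide": [1.0, 0.0, 0.0],
    "faq": [0.9, 0.1, 0.0],
}
query = [1.0, 0.0, 0.0]


def nearest(query_vector, docs, limit):
    """Return the titles of the `limit` documents closest to query_vector."""
    ranked = sorted(docs.items(), key=lambda item: math.dist(query_vector, item[1]))
    return [title for title, _ in ranked[:limit]]


top = nearest(query, documents, limit=2)
print(top)
```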

Shard-Aware Routing

The driver automatically routes queries to the correct shard based on the sharding key:

conn = heliosdb.connect(
    "heliosdb://demo:demo@localhost:5432/testdb",
    shard_aware=True  # Enable shard-aware routing (default)
)
cur = conn.cursor()
# Query with sharding key - routed to single shard
cur.execute("SELECT * FROM orders WHERE customer_id = ?", (12345,))
# Query without sharding key - broadcast to all shards
cur.execute("SELECT COUNT(*) FROM orders")
# Get topology information
topology = conn.get_topology()
print(f"Cluster has {len(topology.get_all_nodes())} nodes")
# Find which shard owns a key
shard = conn.get_shard_for_key(12345)
print(f"Customer 12345 is on shard: {shard.shard_id}")
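
The Architecture section says sharding.py uses consistent hashing. The following is a minimal sketch of that technique, not the driver's implementation. The HashRing class, the shard names, the virtual-node count, and the choice of SHA-256 are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Hypothetical consistent-hash ring mapping keys to shard names."""

    def __init__(self, shards, vnodes=64):
        # Each shard is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (self._hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def shard_for(self, key):
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._points, self._hash(key)) % len(self._points)
        return self._ring[idx][1]


ring = HashRing(["shard-0", "shard-1", "shard-2"])
owner = ring.shard_for(12345)

# Adding a shard only moves keys onto the new shard; other keys stay put.
bigger = HashRing(["shard-0", "shard-1", "shard-2", "shard-3"])
keys = range(1000)
moved = [k for k in keys if ring.shard_for(k) != bigger.shard_for(k)]
print(owner, len(moved))
```

The property shown at the end is the reason to prefer a ring over a plain hash(key) % n scheme, where changing n would reassign most keys.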

Type System

Type Constructors (PEP 249)

from heliosdb import Date, Time, Timestamp, Binary
# Date/Time types
date_val = Date(2025, 10, 10)
time_val = Time(14, 30, 0)
timestamp_val = Timestamp(2025, 10, 10, 14, 30, 0)
# Binary data
binary_val = Binary(b'\x00\x01\x02\x03')
# Use in queries
cur.execute(
    "INSERT INTO events (event_date, event_time, data) VALUES (?, ?, ?)",
    (date_val, time_val, binary_val)
)
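
PEP 249 specifies the constructor signatures but leaves the concrete classes to the driver. Many drivers simply alias standard-library types. The sketch below shows that common pattern. Whether heliosdb defines its constructors this way is an assumption.

```python
import datetime
import time

# Common PEP 249 pattern: type constructors are aliases of stdlib types.
Date = datetime.date
Time = datetime.time
Timestamp = datetime.datetime
Binary = bytes


def DateFromTicks(ticks):
    """Build a Date from seconds since the epoch (local time)."""
    return Date(*time.localtime(ticks)[:3])


def TimeFromTicks(ticks):
    """Build a Time from seconds since the epoch (local time)."""
    return Time(*time.localtime(ticks)[3:6])


def TimestampFromTicks(ticks):
    """Build a Timestamp from seconds since the epoch (local time)."""
    return Timestamp(*time.localtime(ticks)[:6])


date_val = Date(2025, 10, 10)
time_val = Time(14, 30, 0)
timestamp_val = Timestamp(2025, 10, 10, 14, 30, 0)
binary_val = Binary(b"\x00\x01\x02\x03")
print(date_val.isoformat(), time_val.isoformat(), timestamp_val.isoformat())
```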

Type Codes

from heliosdb import STRING, BINARY, NUMBER, DATETIME, ROWID
# Access column types from cursor description
cur.execute("SELECT id, name, created_at FROM users LIMIT 1")
for col in cur.description:
    name, type_code = col[0], col[1]
    if type_code == NUMBER:
        print(f"{name} is a numeric column")
    elif type_code == STRING:
        print(f"{name} is a string column")
    elif type_code == DATETIME:
        print(f"{name} is a datetime column")
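
The == comparison works because a PEP 249 type object compares equal to every concrete type code it covers. The sketch below follows the implementation hint given in PEP 249. The TypeObject class and the PostgreSQL-style type names are illustrative assumptions, not the contents of types.py.

```python
class TypeObject:
    """Compares equal to any of the concrete type codes it was built from."""

    def __init__(self, *type_names):
        self.type_names = frozenset(type_names)

    def __eq__(self, other):
        return other in self.type_names

    def __ne__(self, other):
        return other not in self.type_names


# Hypothetical groupings of column type names.
NUMBER = TypeObject("int4", "int8", "float8", "numeric")
STRING = TypeObject("text", "varchar")
DATETIME = TypeObject("date", "timestamp")

# Simplified (name, type_code) pairs as they might appear in cursor.description.
description = [("id", "int8"), ("name", "text"), ("created_at", "timestamp")]

kinds = {}
for name, type_code in description:
    if type_code == NUMBER:
        kinds[name] = "numeric"
    elif type_code == STRING:
        kinds[name] = "string"
    elif type_code == DATETIME:
        kinds[name] = "datetime"
print(kinds)
```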

Exception Hierarchy

from heliosdb import (
    Error,             # Base for all database errors
    InterfaceError,    # Driver/interface errors
    DatabaseError,     # Base for database-related errors
    DataError,         # Invalid data errors
    OperationalError,  # Connection/operation errors
    IntegrityError,    # Constraint violations
    InternalError,     # Internal database errors
    ProgrammingError,  # SQL syntax/logic errors
    NotSupportedError  # Unsupported operations
)
try:
    cur.execute("INVALID SQL")
except ProgrammingError as e:
    print(f"SQL syntax error: {e}")
except DatabaseError as e:
    print(f"Database error: {e}")
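
The order of the except clauses matters because ProgrammingError is a subclass of DatabaseError. The sketch below reproduces the inheritance layout defined by PEP 249 with local classes, so the effect can be checked without a database. The classify helper is hypothetical.

```python
# Inheritance layout from PEP 249, rebuilt locally for illustration.
class Error(Exception): pass
class InterfaceError(Error): pass
class DatabaseError(Error): pass
class DataError(DatabaseError): pass
class OperationalError(DatabaseError): pass
class IntegrityError(DatabaseError): pass
class InternalError(DatabaseError): pass
class ProgrammingError(DatabaseError): pass
class NotSupportedError(DatabaseError): pass


def classify(exc):
    """Show which handler catches exc when clauses go from specific to general."""
    try:
        raise exc
    except ProgrammingError:
        return "programming"
    except IntegrityError:
        return "integrity"
    except DatabaseError:
        return "database"
    except Error:
        return "driver"


labels = [
    classify(ProgrammingError("bad SQL")),
    classify(OperationalError("connection lost")),
    classify(InterfaceError("cursor closed")),
]
print(labels)
```

If the DatabaseError clause came first, it would also catch ProgrammingError and IntegrityError, and the more specific handlers would never run.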

Connection Parameters

Parameter         Type   Default       Description
----------------  -----  ------------  ----------------------------
host              str    'localhost'   Database server hostname
port              int    5432          Database server port
database          str    required      Database name
user              str    required      Username
password          str    required      Password
metadata_host     str    'localhost'   Metadata service host
metadata_port     int    50051         Metadata service port
shard_aware       bool   True          Enable shard-aware routing
autocommit        bool   False         Enable autocommit mode
connect_timeout   int    30            Connection timeout (seconds)

Module Attributes (PEP 249)

import heliosdb
print(heliosdb.apilevel) # '2.0' - DB API 2.0 compliant
print(heliosdb.threadsafety) # 2 - Threads may share module and connections
print(heliosdb.paramstyle) # 'qmark' - Question mark style (?...)

Thread Safety

The driver has threadsafety = 2, meaning:

  • Threads may share the module
  • Threads may share connections
  • ⚠ Threads should NOT share cursors (create one cursor per thread)
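
One way to follow the one-cursor-per-thread rule is to keep each thread's cursor in thread-local storage. The sketch below uses stub classes in place of a real connection so it runs anywhere. StubConnection, StubCursor, and thread_cursor are all hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class StubCursor:
    """Stand-in for a driver cursor; records which thread created it."""

    def __init__(self):
        self.owner = threading.get_ident()


class StubConnection:
    """Stand-in for a shared connection."""

    def cursor(self):
        return StubCursor()


conn = StubConnection()      # one connection shared by all threads
_local = threading.local()   # per-thread storage for cursors


def thread_cursor():
    """Return the calling thread's cursor, creating it on first use."""
    if not hasattr(_local, "cursor"):
        _local.cursor = conn.cursor()
    return _local.cursor


def worker(_):
    cur = thread_cursor()
    # The cursor belongs to this thread and is reused on later calls.
    return cur.owner == threading.get_ident() and cur is thread_cursor()


with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(worker, range(16)))
print(all(results))
```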

Architecture

Components

  1. connection.py - Connection management, DSN parsing, transaction control
  2. cursor.py - Query execution, result fetching, parameterized queries
  3. protocol.py - PostgreSQL wire protocol implementation
  4. metadata_client.py - Metadata service client for topology updates
  5. sharding.py - Consistent hashing, shard routing logic
  6. types.py - Type constructors and type code objects
  7. exceptions.py - Complete PEP 249 exception hierarchy

Data Flow

┌─────────────────┐
│   Application   │
└────────┬────────┘
         │ SQL + Parameters
         ▼
┌─────────────────┐
│     Cursor      │ ← Parameter substitution
└────────┬────────┘
         │ Prepared SQL
         ▼
┌─────────────────┐
│   Connection    │ ← Shard routing (if enabled)
└────────┬────────┘
         │ Routed query
         ▼
┌─────────────────┐
│ Protocol Client │ ← PostgreSQL wire protocol
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ HeliosDB Cluster│
└─────────────────┘
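
The parameter substitution step in the diagram can be pictured as replacing each ? with a quoted SQL literal before the text-format query is sent. This is a simplified sketch under that assumption. The functions quote_literal and substitute are hypothetical, cover only a few Python types, and are not a substitute for the driver's own escaping.

```python
import datetime


def quote_literal(value):
    """Render a Python value as a SQL literal (illustrative subset of types)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, (datetime.date, datetime.time)):
        return "'" + value.isoformat() + "'"
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    raise TypeError(f"unsupported parameter type: {type(value).__name__}")


def substitute(sql, params):
    """Replace each ? outside single-quoted strings with the next parameter."""
    params = list(params)
    out = []
    in_string = False
    index = 0
    for ch in sql:
        if ch == "'":
            in_string = not in_string
        if ch == "?" and not in_string:
            if index >= len(params):
                raise ValueError("not enough parameters")
            out.append(quote_literal(params[index]))
            index += 1
        else:
            out.append(ch)
    if index != len(params):
        raise ValueError("too many parameters")
    return "".join(out)


prepared = substitute(
    "SELECT * FROM users WHERE id = ? AND name = ?",
    (123, "O'Brien"),
)
print(prepared)
```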

Examples

See the examples/ directory for complete working examples:

  • basic_usage.py - Comprehensive examples of all features

Testing

# Run tests
pytest
# Run with coverage
pytest --cov=heliosdb --cov-report=html
# Type checking
mypy heliosdb/
# Code formatting
black heliosdb/
flake8 heliosdb/

Compatibility

Drop-in Replacement

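This driver exposes the same PEP 249 interface as the drivers below, so code written for them should port with few changes:

  • psycopg2 (PostgreSQL)
  • PyMySQL (MySQL)
  • Other PEP 249 compliant drivers

Note that psycopg2 and PyMySQL use %s placeholders (pyformat style), so existing queries need their placeholders changed to ?.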

ORM Support

Works with popular Python ORMs:

  • SQLAlchemy (with custom dialect)
  • Django ORM (with custom backend)
  • Peewee
  • Pony ORM

Performance Considerations

Shard-Aware Routing

  • Queries with equality predicates on sharding keys are routed to a single shard
  • Queries without sharding keys are broadcast to all shards
  • Topology is cached locally to minimize metadata service calls
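
A local topology cache can be sketched as a value that is reloaded only when it is missing or stale. The README does not describe the refresh policy, so the time-to-live approach, the TopologyCache class, and the fake lookup function below are assumptions for illustration.

```python
import time


class TopologyCache:
    """Hypothetical TTL cache around a callable that loads the topology."""

    def __init__(self, loader, ttl_seconds=30.0, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None

    def get(self):
        """Return the cached topology, reloading it if missing or expired."""
        if self._expires_at is None or self._clock() >= self._expires_at:
            self.refresh()
        return self._value

    def refresh(self):
        """Force a reload, similar in spirit to refresh_topology()."""
        self._value = self._loader()
        self._expires_at = self._clock() + self._ttl
        return self._value


calls = []


def fake_metadata_lookup():
    # Stands in for a round trip to the metadata service.
    calls.append(1)
    return {"shard-0": "node-a:5432", "shard-1": "node-b:5432"}


fake_now = [0.0]  # controllable clock so the example does not need to sleep
cache = TopologyCache(fake_metadata_lookup, ttl_seconds=30.0, clock=lambda: fake_now[0])

cache.get()
cache.get()                       # served from the cache
calls_after_two_gets = len(calls)

fake_now[0] = 31.0                # advance past the TTL
cache.get()                       # triggers a reload
calls_after_expiry = len(calls)
print(calls_after_two_gets, calls_after_expiry)
```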

Connection Pooling

For high-concurrency applications, consider using connection pooling:

import heliosdb
# Create a pool of connections (future feature)
# pool = heliosdb.ConnectionPool(
#     dsn="heliosdb://demo:demo@localhost:5432/testdb",
#     minconn=5,
#     maxconn=20
# )
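
Until a built-in pool exists, a small fixed-size pool can be built on queue.Queue. The sketch below is one possible shape for it. SimplePool is a hypothetical class, unrelated to the planned ConnectionPool API, and sqlite3 is used as the connection factory only so the example runs locally.

```python
import queue
import sqlite3  # stand-in connection factory for a runnable demonstration
from contextlib import contextmanager


class SimplePool:
    """Hypothetical fixed-size pool: connections are created up front and reused."""

    def __init__(self, factory, maxconn):
        self._idle = queue.Queue(maxsize=maxconn)
        for _ in range(maxconn):
            self._idle.put(factory())

    @property
    def idle(self):
        return self._idle.qsize()

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)


pool = SimplePool(lambda: sqlite3.connect(":memory:"), maxconn=2)

with pool.connection() as conn:
    value = conn.execute("SELECT 1 + 1").fetchone()[0]
idle_after = pool.idle

# With both connections checked out, a third request times out.
with pool.connection() as a, pool.connection() as b:
    try:
        with pool.connection(timeout=0.05):
            exhausted = False
    except queue.Empty:
        exhausted = True
print(value, idle_after, exhausted)
```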

Limitations

Current implementation limitations:

  • Stored procedures not yet supported (callproc())
  • Multiple result sets not supported (nextset())
  • Binary protocol format not implemented (text format only)
  • Full SQL parser not included (sharding key extraction is limited)
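
To illustrate why sharding key extraction is limited without a full SQL parser, here is a sketch of a regex-based approach that only recognizes a simple "column = ?" predicate. The function extract_sharding_key is hypothetical. It ignores string literals, OR conditions, subqueries, and aliases.

```python
import re


def extract_sharding_key(sql, params, key_column):
    """Return the parameter bound to `key_column = ?`, or None if not found."""
    pattern = re.compile(
        r"\b" + re.escape(key_column) + r"\s*=\s*\?",
        re.IGNORECASE,
    )
    match = pattern.search(sql)
    if match is None:
        return None  # no usable predicate: the query would be broadcast
    # The matched ? sits at match.end() - 1; count the placeholders before it.
    index = sql.count("?", 0, match.end() - 1)
    return params[index]


single = extract_sharding_key(
    "SELECT * FROM orders WHERE customer_id = ?", (12345,), "customer_id"
)
second_param = extract_sharding_key(
    "SELECT * FROM orders WHERE status = ? AND customer_id = ?",
    ("open", 7),
    "customer_id",
)
broadcast = extract_sharding_key("SELECT COUNT(*) FROM orders", (), "customer_id")
print(single, second_param, broadcast)
```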

Contributing

Contributions are welcome! Please follow these guidelines:

  1. Follow PEP 8 style guide
  2. Add tests for new features
  3. Update documentation
  4. Ensure type hints are complete

License

Apache License 2.0

Support

For issues, questions, or feature requests:

Acknowledgments

This driver is built following: