HeliosDB Python Client
A complete PEP 249 (Python Database API Specification v2.0) compliant database driver for HeliosDB, providing a familiar interface for Python developers while delivering HeliosDB-specific optimizations.
Features
Core PEP 249 Compliance
- Standard connect() factory function
- Connection class with transaction management (commit(), rollback(), close())
- Cursor class with execute(), fetchone(), fetchall(), fetchmany()
- Complete exception hierarchy (Warning, Error, DatabaseError, etc.)
- Type constructors (Binary, Date, Timestamp, etc.)
- Parameterized queries with ? placeholders (qmark style)
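PEP 249 fixes the inheritance layout of the exception classes mentioned above. The following is a minimal sketch of that layout as the specification describes it; it is not taken from the driver's exceptions.py, which may add attributes or helpers.

```python
# Exception layout required by PEP 249 (DB API 2.0).
# Sketch only: the real heliosdb classes may carry extra attributes.

class Warning(Exception):
    """Important warnings, such as data truncation on insert."""

class Error(Exception):
    """Base class for all other DB API errors."""

class InterfaceError(Error):
    """Errors in the driver interface rather than in the database."""

class DatabaseError(Error):
    """Errors reported by the database itself."""

class DataError(DatabaseError): ...
class OperationalError(DatabaseError): ...
class IntegrityError(DatabaseError): ...
class InternalError(DatabaseError): ...
class ProgrammingError(DatabaseError): ...
class NotSupportedError(DatabaseError): ...
```

Because every database-side error derives from DatabaseError, an `except DatabaseError` clause placed after more specific handlers acts as a catch-all, while `except Error` also covers InterfaceError.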
HeliosDB-Specific Features
- Shard-Aware Routing: Client-side key hashing routes queries directly to the correct shard
- 🗺 Topology Caching: Local cache of cluster topology for low-latency routing decisions
- 📡 Metadata Service Integration: Automatic topology refresh on cluster changes
- 🔢 Vector Type Support: Native support for vector embeddings via numpy arrays
- 🔗 PostgreSQL Wire Protocol: Compatible with PostgreSQL protocol (port 5432)
- 🔄 Connection Pooling Ready: Thread-safe design for future connection pooling
Installation
    # Install from source
    cd python-client
    pip install -e .

    # Or with development dependencies
    pip install -e ".[dev]"
Requirements
- Python 3.8+
- numpy >= 1.20.0 (for vector operations)
Quick Start
Basic Usage
    import heliosdb

    # Connect to HeliosDB cluster
    conn = heliosdb.connect("heliosdb://demo:demo@localhost:5432/testdb")

    # Create cursor and execute query
    cur = conn.cursor()
    cur.execute("SELECT * FROM users WHERE id = ?", (123,))

    # Fetch results
    rows = cur.fetchall()
    for row in rows:
        print(row)

    # Close connection
    conn.close()
Using Connection String
The driver supports flexible connection strings:
    # Full connection string
    conn = heliosdb.connect("heliosdb://user:password@host:port/database")

    # With query parameters
    conn = heliosdb.connect(
        "heliosdb://user:password@localhost:5432/testdb?shard_aware=true&autocommit=false"
    )

    # Using keyword arguments (overrides connection string)
    conn = heliosdb.connect(
        host="localhost",
        port=5432,
        database="testdb",
        user="demo",
        password="demo",
        metadata_host="localhost",
        metadata_port=50051
    )
Context Managers
The driver supports Python’s context manager protocol:
    with heliosdb.connect("heliosdb://demo:demo@localhost:5432/testdb") as conn:
        with conn.cursor() as cur:
            cur.execute("SELECT COUNT(*) FROM users")
            count = cur.fetchone()[0]
            print(f"Total users: {count}")
    # Connection automatically commits and closes
API Reference
Connection Object
Methods
- cursor() - Create a new cursor for executing queries
- commit() - Commit the current transaction
- rollback() - Roll back the current transaction
- close() - Close the connection
Properties
- closed - Boolean indicating whether the connection is closed
- autocommit - Get/set autocommit mode
HeliosDB Extensions
- get_topology() - Get current cluster topology
- refresh_topology() - Force topology refresh
- get_shard_for_key(sharding_key) - Get shard info for a key
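One way to picture how get_topology() and refresh_topology() could relate is a small cache that serves topology from memory and contacts the metadata service only when the cached copy is stale or a refresh is forced. The sketch below is an assumption about the design, not the driver's code: Topology, TopologyCache, the TTL policy, and the fake fetch function are all hypothetical names introduced for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Topology:
    """Hypothetical snapshot of the cluster layout."""
    version: int
    shard_owners: Dict[int, str]  # shard_id -> node address


class TopologyCache:
    """Serve topology from memory; refetch only when stale or forced."""

    def __init__(self, fetch: Callable[[], Topology], ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch          # stands in for a metadata service call
        self._ttl = ttl_seconds
        self._clock = clock
        self._cached: Optional[Topology] = None
        self._fetched_at = 0.0

    def get(self) -> Topology:
        """Return the cached topology, fetching it first if missing or expired."""
        expired = (self._cached is None
                   or self._clock() - self._fetched_at > self._ttl)
        if expired:
            self.refresh()
        return self._cached

    def refresh(self) -> Topology:
        """Unconditionally fetch a fresh topology."""
        self._cached = self._fetch()
        self._fetched_at = self._clock()
        return self._cached


# Fake metadata service that counts how often it is called.
calls = []

def fake_fetch() -> Topology:
    calls.append(1)
    return Topology(version=len(calls),
                    shard_owners={0: "node-a:5432", 1: "node-b:5432"})

cache = TopologyCache(fake_fetch, ttl_seconds=60.0)
cache.get()
cache.get()       # served from memory, no second fetch
cache.refresh()   # forced refresh calls the service again
```

A time-based expiry is only one possible policy; a driver that receives change notifications from the metadata service could instead invalidate the cache when the topology version changes.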
Cursor Object
Methods
- execute(operation, parameters=None) - Execute SQL query with optional parameters
- executemany(operation, seq_of_parameters) - Execute query for multiple parameter sets
- fetchone() - Fetch next row or None
- fetchmany(size=arraysize) - Fetch multiple rows
- fetchall() - Fetch all remaining rows
- close() - Close the cursor
Properties
- description - Column metadata (read-only)
- rowcount - Number of rows affected (read-only)
- arraysize - Default fetch size (default: 1)
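The cursor methods and properties above follow PEP 249, so their general behavior can be tried without a HeliosDB cluster. The sketch below uses the standard library's sqlite3 module purely as a stand-in (it is also a PEP 249 driver with qmark placeholders); heliosdb may differ in detail, for example in how rowcount is reported.

```python
import sqlite3  # stand-in driver: PEP 249 with qmark placeholders, not heliosdb

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (id INTEGER, name TEXT)")
cur.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "Alice"), (2, "Bob"), (3, "Charlie")],
)
inserted = cur.rowcount  # sqlite3 sums the rows affected by executemany()

cur.execute("SELECT id, name FROM users ORDER BY id")
columns = [col[0] for col in cur.description]  # first item of each entry is the column name

first = cur.fetchone()   # next row as a tuple
batch = cur.fetchmany()  # no size given: uses cur.arraysize, which defaults to 1
cur.arraysize = 10
rest = cur.fetchmany()   # up to 10 rows, but only one is left
done = cur.fetchone()    # None once the result set is exhausted

conn.close()
```

Raising arraysize before calling fetchmany() without arguments is the PEP 249 way to change the default batch size for a cursor.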
Advanced Usage
Transactions
    conn = heliosdb.connect("heliosdb://demo:demo@localhost:5432/testdb")
    conn.autocommit = False  # Disable autocommit

    cur = conn.cursor()

    try:
        cur.execute("INSERT INTO accounts (user_id, balance) VALUES (?, ?)", (1, 1000))
        cur.execute("INSERT INTO accounts (user_id, balance) VALUES (?, ?)", (2, 500))
        conn.commit()
    except Exception as e:
        conn.rollback()
        print(f"Transaction failed: {e}")
    finally:
        cur.close()
        conn.close()
Batch Operations
    # Insert multiple rows efficiently
    users = [
        ("Alice", "alice@example.com"),
        ("Bob", "bob@example.com"),
        ("Charlie", "charlie@example.com")
    ]

    cur.executemany(
        "INSERT INTO users (name, email) VALUES (?, ?)",
        users
    )
    print(f"Inserted {cur.rowcount} rows")
Vector Operations
    from heliosdb import Vector
    import numpy as np

    # Create vector embedding
    embedding = Vector(np.random.rand(1536))  # 1536-dimensional vector

    # Insert document with vector
    cur.execute(
        "INSERT INTO documents (title, embedding) VALUES (?, ?)",
        ("My Document", embedding)
    )

    # Vector similarity search
    query_vector = Vector(np.random.rand(1536))
    cur.execute("""
        SELECT title, content
        FROM documents
        WHERE category = ?
        ORDER BY embedding <-> ?
        LIMIT 10
    """, ("technology", query_vector))

    results = cur.fetchall()
Shard-Aware Routing
The driver automatically routes queries to the correct shard based on the sharding key:
    conn = heliosdb.connect(
        "heliosdb://demo:demo@localhost:5432/testdb",
        shard_aware=True  # Enable shard-aware routing (default)
    )
    cur = conn.cursor()

    # Query with sharding key - routed to single shard
    cur.execute("SELECT * FROM orders WHERE customer_id = ?", (12345,))

    # Query without sharding key - broadcast to all shards
    cur.execute("SELECT COUNT(*) FROM orders")

    # Get topology information
    topology = conn.get_topology()
    print(f"Cluster has {len(topology.get_all_nodes())} nodes")

    # Find which shard owns a key
    shard = conn.get_shard_for_key(12345)
    print(f"Customer 12345 is on shard: {shard.shard_id}")
Type System
Type Constructors (PEP 249)
    from heliosdb import Date, Time, Timestamp, Binary

    # Date/Time types
    date_val = Date(2025, 10, 10)
    time_val = Time(14, 30, 0)
    timestamp_val = Timestamp(2025, 10, 10, 14, 30, 0)

    # Binary data
    binary_val = Binary(b'\x00\x01\x02\x03')

    # Use in queries
    cur.execute(
        "INSERT INTO events (event_date, event_time, data) VALUES (?, ?, ?)",
        (date_val, time_val, binary_val)
    )
Type Codes
    from heliosdb import STRING, BINARY, NUMBER, DATETIME, ROWID

    # Access column types from cursor description
    cur.execute("SELECT id, name, created_at FROM users LIMIT 1")

    for col in cur.description:
        name, type_code = col[0], col[1]
        if type_code == NUMBER:
            print(f"{name} is a numeric column")
        elif type_code == STRING:
            print(f"{name} is a string column")
        elif type_code == DATETIME:
            print(f"{name} is a datetime column")
Exception Hierarchy
    from heliosdb import (
        Error,              # Base for all database errors
        InterfaceError,     # Driver/interface errors
        DatabaseError,      # Base for database-related errors
        DataError,          # Invalid data errors
        OperationalError,   # Connection/operation errors
        IntegrityError,     # Constraint violations
        InternalError,      # Internal database errors
        ProgrammingError,   # SQL syntax/logic errors
        NotSupportedError   # Unsupported operations
    )

    try:
        cur.execute("INVALID SQL")
    except ProgrammingError as e:
        print(f"SQL syntax error: {e}")
    except DatabaseError as e:
        print(f"Database error: {e}")
Connection Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| host | str | 'localhost' | Database server hostname |
| port | int | 5432 | Database server port |
| database | str | required | Database name |
| user | str | required | Username |
| password | str | required | Password |
| metadata_host | str | 'localhost' | Metadata service host |
| metadata_port | int | 50051 | Metadata service port |
| shard_aware | bool | True | Enable shard-aware routing |
| autocommit | bool | False | Enable autocommit mode |
| connect_timeout | int | 30 | Connection timeout (seconds) |
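To make the interaction between the connection string, its query parameters, keyword arguments, and the defaults in this table concrete, here is a minimal sketch of a DSN parser built on urllib.parse. The function name parse_dsn and the accepted boolean spellings are assumptions for illustration; the driver's own parsing in connection.py may differ, for instance in how it treats percent-encoded passwords or unknown query keys.

```python
from urllib.parse import parse_qsl, urlsplit

# Defaults copied from the parameter table; database, user and password are required.
DEFAULTS = {
    "host": "localhost", "port": 5432,
    "metadata_host": "localhost", "metadata_port": 50051,
    "shard_aware": True, "autocommit": False, "connect_timeout": 30,
}
BOOL_KEYS = {"shard_aware", "autocommit"}
INT_KEYS = {"port", "metadata_port", "connect_timeout"}


def parse_dsn(dsn=None, **overrides):
    """Merge defaults, DSN parts, DSN query parameters, then keyword overrides."""
    params = dict(DEFAULTS)
    if dsn:
        url = urlsplit(dsn)
        if url.scheme != "heliosdb":
            raise ValueError(f"unsupported scheme: {url.scheme!r}")
        from_url = {
            "host": url.hostname, "port": url.port,
            "user": url.username, "password": url.password,
            "database": url.path.lstrip("/") or None,
        }
        params.update({k: v for k, v in from_url.items() if v is not None})
        params.update(parse_qsl(url.query))  # e.g. ?shard_aware=true
    params.update(overrides)                 # keyword arguments win

    # Query-string values arrive as text, so coerce them to the table's types.
    for key in BOOL_KEYS:
        if isinstance(params[key], str):
            params[key] = params[key].lower() in ("1", "true", "yes")
    for key in INT_KEYS:
        params[key] = int(params[key])

    missing = [k for k in ("database", "user", "password") if not params.get(k)]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    return params


params = parse_dsn(
    "heliosdb://demo:demo@localhost:5432/testdb?shard_aware=false&connect_timeout=5",
    metadata_port=50052,
)
```

Applying the keyword arguments last is what gives them precedence over the connection string, matching the "overrides connection string" behavior described earlier.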
Module Attributes (PEP 249)
    import heliosdb

    print(heliosdb.apilevel)      # '2.0' - DB API 2.0 compliant
    print(heliosdb.threadsafety)  # 2 - Threads may share module and connections
    print(heliosdb.paramstyle)    # 'qmark' - Question mark style (?...)
Thread Safety
The driver has threadsafety = 2, meaning:
- Threads may share the module
- Threads may share connections
- ⚠ Threads should NOT share cursors (create one cursor per thread)
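One common way to honor the one-cursor-per-thread rule while sharing a single connection is to keep each cursor in thread-local storage. The sketch below is an illustration under that assumption: PerThreadCursors and StubConnection are hypothetical names, and the stub stands in for a real connection so the example runs without a server.

```python
import threading


class PerThreadCursors:
    """Hand each thread its own cursor from one shared connection."""

    def __init__(self, connection):
        self._connection = connection
        self._local = threading.local()  # attributes are private to each thread

    def cursor(self):
        cur = getattr(self._local, "cursor", None)
        if cur is None:
            cur = self._connection.cursor()  # first call in this thread
            self._local.cursor = cur
        return cur


class StubConnection:
    """Stand-in for a real connection; counts how many cursors it creates."""

    def __init__(self):
        self._lock = threading.Lock()
        self.cursors_created = 0

    def cursor(self):
        with self._lock:
            self.cursors_created += 1
        return object()


conn = StubConnection()
cursors = PerThreadCursors(conn)


def worker():
    first = cursors.cursor()
    assert cursors.cursor() is first  # same thread gets the same cursor back


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With a real connection, each worker would call cursors.cursor().execute(...) and close its cursor when the thread finishes.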
Architecture
Components
- connection.py - Connection management, DSN parsing, transaction control
- cursor.py - Query execution, result fetching, parameterized queries
- protocol.py - PostgreSQL wire protocol implementation
- metadata_client.py - Metadata service client for topology updates
- sharding.py - Consistent hashing, shard routing logic
- types.py - Type constructors and type code objects
- exceptions.py - Complete PEP 249 exception hierarchy
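The consistent hashing mentioned for sharding.py can be illustrated with a small hash ring. This is a generic sketch of the technique, not the driver's implementation: the class name HashRing, the use of SHA-256, and the choice of 64 virtual nodes per shard are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=64):
        self._vnodes = vnodes
        self._ring = []    # sorted (point, node) pairs
        self._points = []  # the points alone, for bisect lookups
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        # Map any key to a 64-bit point on the ring.
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        # Each node owns several points so load spreads more evenly.
        for i in range(self._vnodes):
            self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        if not self._ring:
            raise LookupError("ring has no nodes")
        # First point clockwise from the key's hash, wrapping past the end.
        idx = bisect.bisect_right(self._points, self._hash(key))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["shard-0", "shard-1", "shard-2"])
before = {key: ring.node_for(key) for key in range(1000)}
ring.add_node("shard-3")
moved = [key for key in before if ring.node_for(key) != before[key]]
print(f"{len(moved)} of 1000 keys moved after adding shard-3")
```

The property that matters for a sharded cluster is that adding a node relocates only the keys that the new node takes over; every other key keeps its previous owner.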
Data Flow
    ┌─────────────────┐
    │   Application   │
    └────────┬────────┘
             │ SQL + Parameters
             ▼
    ┌─────────────────┐
    │     Cursor      │ ← Parameter substitution
    └────────┬────────┘
             │ Prepared SQL
             ▼
    ┌─────────────────┐
    │   Connection    │ ← Shard routing (if enabled)
    └────────┬────────┘
             │ Routed query
             ▼
    ┌─────────────────┐
    │ Protocol Client │ ← PostgreSQL wire protocol
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │ HeliosDB Cluster│
    └─────────────────┘
Examples
See the examples/ directory for complete working examples:
- basic_usage.py - Comprehensive examples of all features
Testing
    # Run tests
    pytest

    # Run with coverage
    pytest --cov=heliosdb --cov-report=html

    # Type checking
    mypy heliosdb/

    # Code formatting
    black heliosdb/
    flake8 heliosdb/
Compatibility
Drop-in Replacement
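This driver is designed as a drop-in replacement, at the PEP 249 interface level, for:
- psycopg2 (PostgreSQL)
- PyMySQL (MySQL)
- Other PEP 249 compliant drivers
Note that psycopg2 and PyMySQL use the pyformat paramstyle (%s placeholders), so queries written for them need their placeholders changed to ?.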
ORM Support
Works with popular Python ORMs:
- SQLAlchemy (with custom dialect)
- Django ORM (with custom backend)
- Peewee
- Pony ORM
Performance Considerations
Shard-Aware Routing
- Queries with equality predicates on sharding keys are routed to a single shard
- Queries without sharding keys are broadcast to all shards
- Topology is cached locally to minimize metadata service calls
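The single-shard versus broadcast decision described above can be sketched with a deliberately naive predicate matcher. The function name routing_target and the regex approach are assumptions for illustration; the driver's actual sharding key extraction is not shown in this document. The sketch also ignores cases such as OR conditions or a ? inside a string literal, which is the kind of gap a full SQL parser would close.

```python
import re


def routing_target(sql, params, sharding_column):
    """Return ("single", key_value) or ("broadcast", None) for a qmark query.

    Only an equality predicate of the form `<sharding_column> = ?` counts;
    anything else (no predicate, ranges, IN lists) falls back to broadcast.
    """
    pattern = re.compile(
        rf"\b{re.escape(sharding_column)}\s*=\s*\?", re.IGNORECASE
    )
    match = pattern.search(sql)
    if match is None:
        return ("broadcast", None)
    # Count the placeholders up to and including the matched one to find
    # which positional parameter carries the sharding key value.
    index = sql.count("?", 0, match.end()) - 1
    return ("single", params[index])


by_key = routing_target(
    "SELECT * FROM orders WHERE customer_id = ?", (12345,), "customer_id"
)
no_key = routing_target("SELECT COUNT(*) FROM orders", (), "customer_id")
second_param = routing_target(
    "SELECT * FROM orders WHERE status = ? AND customer_id = ?",
    ("open", 7),
    "customer_id",
)
ranged = routing_target(
    "SELECT * FROM orders WHERE customer_id > ?", (100,), "customer_id"
)
```

The extracted key value would then be hashed to pick a shard, while a broadcast result means the query is sent to every shard and the results are combined.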
Connection Pooling
For high-concurrency applications, consider using connection pooling:
    import heliosdb

    # Create a pool of connections (future feature)
    # pool = heliosdb.ConnectionPool(
    #     dsn="heliosdb://demo:demo@localhost:5432/testdb",
    #     minconn=5,
    #     maxconn=20
    # )
Limitations
Current implementation limitations:
- Stored procedures not yet supported (callproc())
- Multiple result sets not supported (nextset())
- Binary protocol format not implemented (text format only)
- Full SQL parser not included (sharding key extraction is limited)
Contributing
Contributions are welcome! Please follow these guidelines:
- Follow PEP 8 style guide
- Add tests for new features
- Update documentation
- Ensure type hints are complete
License
Apache License 2.0
Support
For issues, questions, or feature requests:
- GitHub Issues: https://github.com/heliosdb/heliosdb
- Documentation: https://docs.heliosdb.io
- Email: support@heliosdb.io
Acknowledgments
This driver is built following:
- PEP 249 – Python Database API Specification v2.0
- PostgreSQL Wire Protocol v3.0
- Design principles from psycopg2, PyMySQL, and scylla-driver