
Python Client Driver Compatibility Matrix


HeliosDB Multi-Protocol Support Research

Researcher Agent Analysis Date: 2025-10-10


Table of Contents

  1. PostgreSQL Clients
  2. MySQL Clients
  3. SQL Server Clients
  4. DB2 Clients
  5. Oracle Clients
  6. HTTP API Clients
  7. Implementation Recommendations

1. PostgreSQL Clients

psycopg2 (Synchronous)

Type: C extension (libpq wrapper)
Protocol: PostgreSQL wire protocol
Install: pip install psycopg2-binary

Connection Parameters

import psycopg2

conn = psycopg2.connect(
    host="db.example.com",
    port=5432,
    dbname="demo",
    user="demo",
    password="demo",
    sslmode="require",  # disable, allow, prefer, require, verify-ca, verify-full
    connect_timeout=10,
    application_name="my_app"
)

Parameter Binding

# Format style (%s)
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))

# Named parameters (pyformat style)
cursor.execute(
    "SELECT * FROM users WHERE id = %(id)s AND name = %(name)s",
    {"id": user_id, "name": user_name}
)

Prepared Statements

# Server-side cursor (for large result sets)
cursor = conn.cursor(name='fetch_cursor')
cursor.execute("SELECT * FROM large_table")

# Iterate without loading all rows
for record in cursor:
    process(record)
cursor.close()

Connection Pooling

from psycopg2 import pool

# Simple pool
connection_pool = pool.SimpleConnectionPool(
    minconn=1,
    maxconn=10,
    host="db.example.com",
    user="demo",
    password="demo",
    database="demo"
)
conn = connection_pool.getconn()
# Use connection
connection_pool.putconn(conn)

Key Behaviors

  • Type Conversion: Automatic Python ↔ PostgreSQL type mapping
  • Transactions: Auto-BEGIN on first query, manual COMMIT/ROLLBACK
  • COPY Protocol: Full support for COPY FROM/TO
  • Two-Phase Commit: Supported via .tpc_*() methods
  • LISTEN/NOTIFY: Async notifications supported
  • Connection Status: .closed, .status attributes

Important Notes

  • Uses libpq C library (requires PostgreSQL client libs)
  • Blocking I/O (not suitable for async/await)
  • Thread-safety: Connections and cursors not thread-safe
  • Parameter style: MUST use %s (even for non-string types)

asyncpg (Asynchronous)

Type: Pure Python + Cython (no libpq)
Protocol: PostgreSQL wire protocol (native implementation)
Install: pip install asyncpg

Connection Parameters

import asyncpg

conn = await asyncpg.connect(
    host="db.example.com",
    port=5432,
    database="demo",
    user="demo",
    password="demo",
    ssl=True,  # or an ssl.SSLContext object
    timeout=10,
    server_settings={
        'application_name': 'my_app'
    }
)

Parameter Binding

# PostgreSQL native style ($1, $2, ...)
result = await conn.fetch(
    "SELECT * FROM users WHERE id = $1 AND name = $2",
    user_id, user_name
)

# Named parameters NOT supported (must use positional)
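
Since asyncpg accepts only positional arguments, a gateway (or application code migrating from psycopg2) often needs to rewrite named placeholders. A minimal sketch, assuming a simple :name convention with no casts or string literals in the SQL; the function name is illustrative:

```python
import re

def named_to_positional(sql: str, params: dict):
    """Rewrite :name placeholders as asyncpg's $1, $2, ... style.

    Returns the rewritten SQL plus the argument list in placeholder order.
    Sketch only: it does not skip Postgres ::casts or quoted literals.
    """
    order = []

    def repl(match):
        name = match.group(1)
        if name not in order:
            order.append(name)
        return f"${order.index(name) + 1}"

    new_sql = re.sub(r":(\w+)", repl, sql)
    return new_sql, [params[name] for name in order]

sql, args = named_to_positional(
    "SELECT * FROM users WHERE id = :id AND name = :name",
    {"id": 7, "name": "Alice"},
)
# sql  -> "SELECT * FROM users WHERE id = $1 AND name = $2"
# args -> [7, "Alice"]
```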

Prepared Statements

# Automatic caching for fetch(), fetchrow(), fetchval()
result = await conn.fetch("SELECT * FROM users WHERE id = $1", user_id)
# Statement is automatically prepared and cached
# Explicit prepared statement
stmt = await conn.prepare("SELECT * FROM users WHERE id = $1")
result = await stmt.fetch(user_id)
# Can be reused
result2 = await stmt.fetch(another_id)

Connection Pooling

# Built-in pool
pool = await asyncpg.create_pool(
    host="db.example.com",
    port=5432,
    user="demo",
    password="demo",
    database="demo",
    min_size=10,
    max_size=20,
    command_timeout=60
)

async with pool.acquire() as conn:
    result = await conn.fetch("SELECT 1")

Key Behaviors

  • Type Conversion: Custom codec system (fast binary protocol)
  • Prepared Statements: LRU cache (max 100 by default)
  • Transactions: Explicit via async with conn.transaction()
  • Cursors: async for iteration with .cursor()
  • COPY Protocol: Full support (async)
  • LISTEN/NOTIFY: Supported via .add_listener()

Important Notes

  • HIGH PERFORMANCE: 3-5x faster than psycopg2 in async workloads
  • Parameter style: MUST use $1, $2, ... (PostgreSQL native)
  • PgBouncer incompatibility: Prepared statements fail in transaction/statement pooling mode
  • Cursors/prepared statements invalid after connection release
  • Binary protocol by default (not text)
  • No thread support (asyncio only)

SQLAlchemy (ORM/Core)

Type: Database abstraction layer
Protocol: Uses psycopg2 or asyncpg under the hood
Install: pip install sqlalchemy psycopg2-binary or pip install sqlalchemy asyncpg

Connection Strings

# psycopg2 (sync)
engine = create_engine(
    "postgresql+psycopg2://demo:demo@db.example.com:5432/demo"
)

# asyncpg (async)
engine = create_async_engine(
    "postgresql+asyncpg://demo:demo@db.example.com:5432/demo"
)

Parameter Binding (Core)

from sqlalchemy import text

# Named parameters (SQLAlchemy style)
result = conn.execute(
    text("SELECT * FROM users WHERE id = :id"),
    {"id": user_id}
)

# Bound parameters (converted to the driver's style)
stmt = text("SELECT * FROM users WHERE id = :id AND name = :name")
result = conn.execute(stmt, {"id": 1, "name": "Alice"})

Key Behaviors

  • Dialect System: Abstracts driver differences
  • Parameter Conversion: :name → %s (psycopg2) or $1 (asyncpg)
  • Connection Pooling: Built-in QueuePool
  • Transaction Management: Automatic or explicit
  • Type System: Rich type mapping

Important Notes

  • Adds abstraction layer over psycopg2/asyncpg
  • HeliosDB must support BOTH psycopg2 and asyncpg behaviors
  • ORM layer expects PostgreSQL-specific features (RETURNING, etc.)

2. MySQL Clients

PyMySQL (Synchronous)

Type: Pure Python
Protocol: MySQL wire protocol (Python implementation)
Install: pip install pymysql

Connection Parameters

import pymysql

conn = pymysql.connect(
    host="db.example.com",
    port=3306,
    user="demo",
    password="demo",
    database="demo",
    charset="utf8mb4",
    ssl={'ssl': {}},  # Enable SSL/TLS
    connect_timeout=10,
    autocommit=False
)

Parameter Binding

# Format style (%s only, no named parameters)
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))

# Multiple parameters
cursor.execute(
    "SELECT * FROM users WHERE id = %s AND name = %s",
    (user_id, user_name)
)
# CLIENT-SIDE escaping (not true prepared statements)

Cursor Types

# Default cursor (tuple rows)
cursor = conn.cursor()
# Dictionary cursor (row as dict)
cursor = conn.cursor(pymysql.cursors.DictCursor)
# Server-side cursor (streaming)
cursor = conn.cursor(pymysql.cursors.SSCursor)

Key Behaviors

  • Type Conversion: Automatic (int, str, datetime, etc.)
  • Prepared Statements: CLIENT-SIDE only (escaping, not protocol)
  • Transactions: Manual BEGIN, COMMIT, ROLLBACK
  • Autocommit: Can be enabled (affects transaction behavior)
  • Character Sets: utf8mb4 recommended for full Unicode

Important Notes

  • Pure Python (no C dependencies)
  • NOT true prepared statements (client-side escaping only)
  • Parameter style: %s only (no named, no ?)
  • Thread-safe at connection level (not cursor level)
  • MySQL-specific: cursor.lastrowid for INSERT

mysql-connector-python (Synchronous)

Type: Pure Python OR C extension
Protocol: MySQL wire protocol
Install: pip install mysql-connector-python

Connection Parameters

import mysql.connector

conn = mysql.connector.connect(
    host="db.example.com",
    port=3306,
    user="demo",
    password="demo",
    database="demo",
    ssl_ca="/path/to/ca.pem",  # TLS cert
    ssl_verify_cert=True,
    use_pure=False,  # False = C extension, True = pure Python
    autocommit=False
)

Parameter Binding (Regular Cursor)

# Format style (%s)
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))

# Named parameters (pyformat style)
cursor.execute(
    "SELECT * FROM users WHERE id = %(id)s AND name = %(name)s",
    {"id": user_id, "name": user_name}
)

Prepared Statements (MySQLCursorPrepared)

# Server-side prepared statement
cursor = conn.cursor(prepared=True)
# ONLY %s or ? allowed (no named parameters for prepared)
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
# Statement is prepared on first execute, cached for subsequent calls
cursor.execute("SELECT * FROM users WHERE id = %s", (another_id,))
# Uses cached prepared statement
# Qmark style also supported
cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))

Connection Pooling

from mysql.connector import pooling

# Create pool
pool = pooling.MySQLConnectionPool(
    pool_name="my_pool",
    pool_size=5,
    host="db.example.com",
    user="demo",
    password="demo",
    database="demo"
)

# Get connection from pool
conn = pool.get_connection()
# Use connection
conn.close()  # Returns to pool

Key Behaviors

  • Type Conversion: Automatic with type hints
  • Prepared Statements: TRUE server-side (binary protocol)
  • Statement Caching: Automatic for prepared cursors
  • Transactions: Explicit or autocommit
  • Character Sets: Full support for utf8mb4
  • Multi-Statements: Supported with multi=True

Important Notes

  • Official Oracle driver
  • Server-side prepared statements via cursor(prepared=True)
  • Prepared cursors: %s or ? ONLY (no named parameters)
  • Regular cursors: %s or %(name)s (named supported)
  • use_pure=False (C extension) is faster
  • Binary protocol for prepared statements

SQLAlchemy (MySQL Dialect)

Type: Database abstraction layer
Protocol: Uses PyMySQL or mysql-connector-python
Install: pip install sqlalchemy pymysql

Connection Strings

# PyMySQL
engine = create_engine(
    "mysql+pymysql://demo:demo@db.example.com:3306/demo"
)

# mysql-connector-python
engine = create_engine(
    "mysql+mysqlconnector://demo:demo@db.example.com:3306/demo"
)

Key Behaviors

  • Dialect Translation: Handles MySQL-specific SQL
  • LIMIT/OFFSET: Native MySQL syntax
  • AUTO_INCREMENT: Supports LAST_INSERT_ID()
  • Type Mapping: MySQL-specific types (TINYINT, MEDIUMINT, etc.)

3. SQL Server Clients

pyodbc (ODBC Wrapper)

Type: C extension (ODBC wrapper)
Protocol: TDS via ODBC driver
Install: pip install pyodbc (requires ODBC Driver 18 for SQL Server)

Connection Parameters

import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=db.example.com,1433;"
    "DATABASE=demo;"
    "UID=demo;"
    "PWD=demo;"
    "Encrypt=yes;"
    "TrustServerCertificate=no;"
    "Connection Timeout=30;"
)

Parameter Binding

# Qmark style (? only)
cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))

# Multiple parameters
cursor.execute(
    "SELECT * FROM users WHERE id = ? AND name = ?",
    (user_id, user_name)
)
# NO named parameters supported

Prepared Statements

# Prepared automatically by ODBC driver
cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))
# Reusing with different parameters
cursor.execute("SELECT * FROM users WHERE id = ?", (another_id,))
# Driver may cache prepared statement

Key Behaviors

  • Type Conversion: ODBC driver handles conversions
  • Transactions: Auto-BEGIN, manual COMMIT/ROLLBACK
  • Encoding: UTF-16LE for SQL Server (ODBC handles conversion)
  • MARS: Multiple Active Result Sets (if enabled)
  • OUTPUT Parameters: Supported for stored procedures

Important Notes

  • Requires ODBC driver installation
  • Works with multiple databases (not just SQL Server)
  • Parameter style: ? ONLY (qmark)
  • Fast execution for parameterized queries
  • Connection string can be complex (many options)
  • No native asyncio support; pair with a wrapper such as aioodbc for async use

pymssql (FreeTDS Wrapper)

Type: C extension (FreeTDS wrapper)
Protocol: TDS via FreeTDS
Install: pip install pymssql

Connection Parameters

import pymssql

conn = pymssql.connect(
    server="db.example.com",
    port=1433,
    user="demo",
    password="demo",
    database="demo",
    tds_version="7.4",
    timeout=30
)

Parameter Binding

# Format style (%s)
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))

# Named parameters (pyformat style)
cursor.execute(
    "SELECT * FROM users WHERE id = %(id)s AND name = %(name)s",
    {"id": user_id, "name": user_name}
)

Key Behaviors

  • Type Conversion: Automatic
  • Transactions: Manual control
  • Prepared Statements: Via FreeTDS
  • Stored Procedures: Full support

Important Notes

  • Uses FreeTDS library (not official Microsoft driver)
  • Simpler API than pyodbc
  • Parameter style: %s or %(name)s (like psycopg2)
  • TDS version configurable (7.0, 7.1, 7.2, 7.3, 7.4)

4. DB2 Clients

ibm_db (IBM Data Server Driver)

Type: C extension (IBM driver wrapper)
Protocol: DRDA via IBM Data Server Driver
Install: pip install ibm_db (requires IBM Data Server Driver)

Connection Parameters

import ibm_db

# Connection string
conn = ibm_db.connect(
    "DATABASE=SAMPLE;"
    "HOSTNAME=db.example.com;"
    "PORT=50000;"
    "PROTOCOL=TCPIP;"
    "UID=demo;"
    "PWD=demo;"
    "SECURITY=SSL;",
    "", ""
)

SQL Execution

# Immediate execution (non-prepared)
stmt = ibm_db.exec_immediate(conn, "SELECT 1 FROM sysibm.sysdummy1")
row = ibm_db.fetch_tuple(stmt)
print(row)
# Prepared statement
stmt = ibm_db.prepare(conn, "SELECT * FROM users WHERE id = ?")
ibm_db.bind_param(stmt, 1, user_id)
ibm_db.execute(stmt)
row = ibm_db.fetch_tuple(stmt)

Parameter Binding

# Qmark style (?)
stmt = ibm_db.prepare(conn, "SELECT * FROM users WHERE id = ?")
ibm_db.bind_param(stmt, 1, user_id) # 1-indexed
ibm_db.execute(stmt)
# Multiple parameters
stmt = ibm_db.prepare(conn, "SELECT * FROM users WHERE id = ? AND name = ?")
ibm_db.bind_param(stmt, 1, user_id)
ibm_db.bind_param(stmt, 2, user_name)
ibm_db.execute(stmt)

Key Behaviors

  • Type Conversion: Automatic (int, str, datetime, etc.)
  • Transactions: Manual COMMIT/ROLLBACK
  • Prepared Statements: Via prepare() + bind_param() + execute()
  • Connection Test: “SELECT 1 FROM sysibm.sysdummy1”
  • SSL/TLS: Supported via SECURITY=SSL

Important Notes

  • Requires IBM Data Server Driver installation (large download)
  • Two APIs: ibm_db (native) and ibm_db_dbi (DB-API 2.0 compliant)
  • Parameter style: ? (qmark)
  • Connection string format specific to IBM
  • DRDA protocol under the hood
  • EBCDIC handling (driver converts to/from)

ibm_db_dbi (DB-API 2.0)

import ibm_db_dbi as dbi

# More Pythonic interface
conn = dbi.connect(
    "DATABASE=SAMPLE;HOSTNAME=db.example.com;PORT=50000;"
    "PROTOCOL=TCPIP;UID=demo;PWD=demo;SECURITY=SSL;"
)
cursor = conn.cursor()
cursor.execute("SELECT 1 FROM sysibm.sysdummy1")
row = cursor.fetchone()

5. Oracle Clients

oracledb (Thin Mode)

Type: Pure Python (Thin mode) OR C extension (Thick mode)
Protocol: Oracle Net/TTC (Thin: Python implementation)
Install: pip install oracledb

Connection Parameters (Thin Mode)

import oracledb

# Thin mode is the default: no Oracle Client installation, no mode flag,
# and no config_dir required
conn = oracledb.connect(
    user="demo",
    password="demo",
    dsn="db.example.com:1521/DEMO"  # host:port/service_name
)

Connection Parameters (Thick Mode)

import oracledb

# Initialize Thick mode (requires Oracle Instant Client)
oracledb.init_oracle_client(lib_dir="/path/to/instantclient")

conn = oracledb.connect(
    user="demo",
    password="demo",
    dsn="db.example.com:1521/DEMO"
    # Thick mode auto-detected once init_oracle_client() has been called
)

Parameter Binding

# Named parameters (:name)
cursor.execute(
    "SELECT * FROM users WHERE id = :id AND name = :name",
    {"id": user_id, "name": user_name}
)

# Numeric parameters (:1, :2)
cursor.execute(
    "SELECT * FROM users WHERE id = :1 AND name = :2",
    [user_id, user_name]
)

# Bind variables passed explicitly as keyword arguments
cursor.execute(
    "SELECT * FROM users WHERE id = :id",
    id=user_id
)

Prepared Statements

# Automatic preparation and caching
cursor.execute("SELECT * FROM users WHERE id = :1", [user_id])
# Statement is prepared on first execute, cached for reuse
cursor.execute("SELECT * FROM users WHERE id = :1", [another_id])

Connection Pooling

# Built-in connection pool
pool = oracledb.create_pool(
    user="demo",
    password="demo",
    dsn="db.example.com:1521/DEMO",
    min=2,
    max=10,
    increment=1
)
conn = pool.acquire()
# Use connection
pool.release(conn)

Key Behaviors

  • Type Conversion: Rich type system (NUMBER, VARCHAR2, CLOB, BLOB, etc.)
  • Prepared Statements: Automatic caching
  • Transactions: Auto-BEGIN, manual COMMIT/ROLLBACK
  • Connection Test: “SELECT 1 FROM DUAL”
  • LOBs: Full support for CLOB, BLOB, NCLOB
  • Object Types: User-defined types supported (Thick mode)
  • Collections: VARRAY, nested tables (Thick mode)

Important Notes

  • Thin mode: No Oracle Client installation required (pure Python)
  • Thick mode: Requires Oracle Instant Client (full features)
  • Parameter style: :name (named) or :1, :2 (numeric)
  • “SELECT 1 FROM DUAL” for connection testing
  • Bind variables use : prefix (Oracle convention)
  • Thin mode limitations: No advanced features (external auth, LDAP, etc.)

6. HTTP API Clients

snowflake-connector-python

Type: Pure Python
Protocol: HTTPS/REST + Arrow (optional)
Install: pip install snowflake-connector-python

Connection Parameters

import snowflake.connector

conn = snowflake.connector.connect(
    account="local",
    user="demo",
    password="demo",
    database="demo",
    schema="public",
    warehouse="compute_wh",
    role="accountadmin",
    # Alternative: JWT auth
    # authenticator="SNOWFLAKE_JWT",
    # private_key_file="/path/to/key.pem"
)

Query Execution

cursor = conn.cursor()

# Simple query
cursor.execute("SELECT 1")
print(cursor.fetchone())

# Parameterized query (qmark style; requires
# snowflake.connector.paramstyle = "qmark")
cursor.execute(
    "SELECT * FROM users WHERE id = ? AND name = ?",
    (user_id, user_name)
)

# Named parameters (pyformat style, the default)
cursor.execute(
    "SELECT * FROM users WHERE id = %(id)s AND name = %(name)s",
    {"id": user_id, "name": user_name}
)

Key Behaviors

  • Authentication: Password, JWT, OAuth, SSO
  • Result Formats: JSON, Arrow (high performance)
  • Session Management: Multi-statement, session parameters
  • Async Execution: Via cursor.execute_async()
  • Result Caching: Query result caching
  • Compression: Gzip for data transfer

Important Notes

  • HTTPS/REST based (not wire protocol)
  • JWT authentication for automation
  • Arrow format for large result sets
  • Session parameters (warehouse, schema, etc.) per connection
  • Multi-statement execution supported
  • Async query execution (long-running queries)

databricks-sql-connector

Type: Thrift-based (no ODBC/JDBC)
Protocol: HTTP/Thrift over TLS
Install: pip install databricks-sql-connector

Connection Parameters

from databricks import sql

conn = sql.connect(
    server_hostname="db.example.com",
    http_path="/sql/1.0/warehouses/abc123",
    access_token="dapi...",  # Personal Access Token
    # Or OAuth:
    # auth_type="oauth",
    # client_id="...",
    # client_secret="..."
)

Query Execution

cursor = conn.cursor()

# Simple query
cursor.execute("SELECT 1")
print(cursor.fetchone())

# Parameterized query (named parameters)
cursor.execute(
    "SELECT * FROM users WHERE id = :id AND name = :name",
    {"id": user_id, "name": user_name}
)

# Parameters as dict
params = {"id": 123, "name": "Alice"}
cursor.execute("SELECT * FROM users WHERE id = :id", params)

Key Behaviors

  • Authentication: PAT, OAuth, Microsoft Entra ID
  • Result Formats: JSON, Arrow
  • Async Execution: Built-in (via Thrift)
  • Result Chunking: Large results paginated
  • Unity Catalog: Full support (catalog.schema.table)
  • Warehouse Management: SQL warehouse routing

Important Notes

  • Thrift-based protocol (not ODBC/JDBC)
  • PAT or OAuth authentication
  • Named parameters only (:name style)
  • Unity Catalog namespace support
  • Arrow format for performance
  • HTTP/2 support

pinecone-client

Type: Pure Python
Protocol: HTTP/REST or gRPC
Install: pip install pinecone-client or pip install pinecone-client[grpc]

Initialization

from pinecone import Pinecone

# HTTP/REST (default)
pc = Pinecone(api_key="your_api_key")

# gRPC (for performance; requires the [grpc] extra)
from pinecone.grpc import PineconeGRPC
pc = PineconeGRPC(api_key="your_api_key")

# Index operations
index = pc.Index("my-index")

Vector Operations

# Upsert vectors
index.upsert(vectors=[
    {
        "id": "vec1",
        "values": [0.1, 0.2, 0.3, ..., 0.768],
        "metadata": {"category": "books", "title": "Example"}
    }
])

# Query vectors
results = index.query(
    vector=[0.1, 0.2, 0.3, ..., 0.768],
    top_k=10,
    filter={"category": {"$eq": "books"}},
    include_values=True,
    include_metadata=True
)

# Delete vectors
index.delete(ids=["vec1", "vec2"])

# Fetch vectors
vectors = index.fetch(ids=["vec1", "vec2"])

# Update metadata
index.update(id="vec1", set_metadata={"updated": True})

Key Behaviors

  • API Authentication: API key header
  • Namespaces: Logical isolation within index
  • Metadata Filtering: Complex boolean expressions
  • Sparse Vectors: Hybrid dense + sparse support
  • gRPC: Optional for performance
  • Freshness: Upserted vectors become queryable within seconds (not strictly immediate)

Important Notes

  • Not SQL-based (vector operations)
  • HTTP/REST or gRPC protocols
  • API key authentication
  • Metadata filtering with MongoDB-like syntax
  • gRPC provides modest performance improvement
  • No connection pooling (stateless HTTP)

7. Implementation Recommendations

Unified Parameter Handling

Challenge: Different parameter styles across drivers

| Driver | Parameter Style | Example |
| --- | --- | --- |
| psycopg2 | %s, %(name)s | WHERE id = %s |
| asyncpg | $1, $2, ... | WHERE id = $1 |
| PyMySQL | %s only | WHERE id = %s |
| mysql-connector (prepared) | %s or ? | WHERE id = %s |
| pyodbc | ? only | WHERE id = ? |
| ibm_db | ? only | WHERE id = ? |
| oracledb | :name or :1 | WHERE id = :id |
| snowflake-connector | ? or %(name)s | WHERE id = ? |
| databricks-sql | :name | WHERE id = :name |

Recommendation:

class ParameterAdapter:
    def adapt(self, sql, params, target_style):
        if target_style == "postgres_positional":
            # Convert to $1, $2, ...
            return self._to_postgres_positional(sql, params)
        elif target_style == "qmark":
            # Convert to ?
            return self._to_qmark(sql, params)
        elif target_style == "named":
            # Convert to :name
            return self._to_named(sql, params)
        # etc.
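
As a concrete instance of one branch of the adapter above, here is a sketch of the pyformat-to-qmark path (useful when bridging psycopg2-style clients to a ?-only backend such as pyodbc). The function name and regex are illustrative; a production version would also skip placeholders inside string literals and comments:

```python
import re

def pyformat_to_qmark(sql: str, params: dict):
    """Rewrite %(name)s placeholders as ? and order the arguments to match.

    Returns the rewritten SQL plus a positional argument list.
    """
    args = []

    def repl(match):
        args.append(params[match.group(1)])
        return "?"

    new_sql = re.sub(r"%\((\w+)\)s", repl, sql)
    return new_sql, args

sql, args = pyformat_to_qmark(
    "SELECT * FROM users WHERE id = %(id)s AND name = %(name)s",
    {"id": 1, "name": "Alice"},
)
# sql  -> "SELECT * FROM users WHERE id = ? AND name = ?"
# args -> [1, "Alice"]
```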

Prepared Statement Handling

Challenge: Client-side vs server-side prepared statements

PyMySQL: Client-side escaping (not true prepared statements)

# PyMySQL: String escaping, not binary protocol
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
# HeliosDB: Can optimize with server-side cache

mysql-connector-python: Server-side prepared (binary protocol)

cursor = conn.cursor(prepared=True)
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
# HeliosDB: MUST support COM_STMT_PREPARE, COM_STMT_EXECUTE

asyncpg: Automatic caching

await conn.fetch("SELECT * FROM users WHERE id = $1", user_id)
# HeliosDB: LRU cache for Parse messages

Recommendation:

  • Implement prepared statement cache per protocol
  • For PyMySQL: Accept text protocol, optionally cache internally
  • For mysql-connector-python: Support binary protocol (COM_STMT_*)
  • For asyncpg: Cache Parse/Bind messages (max 100 by default)
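
The per-protocol cache in the first bullet can be sketched as a small LRU keyed by SQL text; the 100-entry default mirrors asyncpg's. The class and the string "statement handles" are illustrative stand-ins:

```python
from collections import OrderedDict

class StatementCache:
    """LRU cache of prepared statements keyed by SQL text (sketch)."""

    def __init__(self, max_size: int = 100):
        self.max_size = max_size
        self._cache = OrderedDict()  # sql -> prepared-statement handle

    def get(self, sql):
        if sql in self._cache:
            self._cache.move_to_end(sql)  # mark as most recently used
            return self._cache[sql]
        return None

    def put(self, sql, statement):
        self._cache[sql] = statement
        self._cache.move_to_end(sql)
        if len(self._cache) > self.max_size:
            evicted_sql, evicted_stmt = self._cache.popitem(last=False)
            # A real implementation would also close evicted_stmt server-side

cache = StatementCache(max_size=2)
cache.put("SELECT 1", "stmt-1")
cache.put("SELECT 2", "stmt-2")
cache.get("SELECT 1")            # refresh "SELECT 1"
cache.put("SELECT 3", "stmt-3")  # evicts the least recently used, "SELECT 2"
```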

Connection Pooling

Challenge: Some clients have built-in pools, others don’t

Built-in Pools:

  • asyncpg: create_pool()
  • mysql-connector-python: pooling.MySQLConnectionPool()
  • oracledb: create_pool()

External Pools:

  • psycopg2: Use psycopg2.pool or SQLAlchemy
  • PyMySQL: Use SQLAlchemy or DBUtils
  • pyodbc: Use SQLAlchemy

Recommendation:

  • HeliosDB should implement server-side connection pooling
  • Track active connections per protocol
  • Support concurrent connections (asyncio-safe)
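
An asyncio-safe server-side pool along these lines can be sketched with an asyncio.Queue; `factory` is a stand-in for whatever creates a real backend connection, so the sketch runs without a database:

```python
import asyncio

class ConnectionPool:
    """Minimal asyncio-safe pool sketch for server-side connection reuse."""

    def __init__(self, factory, max_size: int = 10):
        self._factory = factory
        self._queue = asyncio.Queue(maxsize=max_size)
        self._created = 0
        self._max_size = max_size

    async def acquire(self):
        # Lazily create connections up to max_size, then wait for a free one
        if self._queue.empty() and self._created < self._max_size:
            self._created += 1
            return await self._factory()
        return await self._queue.get()

    async def release(self, conn):
        await self._queue.put(conn)

async def demo():
    async def factory():
        return object()  # stand-in for a real backend connection

    pool = ConnectionPool(factory, max_size=2)
    conn = await pool.acquire()
    await pool.release(conn)
    # The released connection is reused rather than recreated
    return conn is await pool.acquire()

asyncio.run(demo())
```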

Transaction Semantics

Challenge: Different transaction modes

Auto-BEGIN:

  • psycopg2: First query starts transaction
  • oracledb: First query starts transaction

Manual BEGIN:

  • MySQL: Requires explicit BEGIN or START TRANSACTION
  • SQL Server: Requires explicit BEGIN TRANSACTION

Autocommit:

  • MySQL: Can be enabled (no explicit transactions)
  • SQL Server: Implicit transactions by default

Recommendation:

class TransactionManager:
    def __init__(self, protocol):
        self.protocol = protocol
        self.in_transaction = False

    def handle_query(self, sql):
        if self.protocol == "postgresql":
            if not self.in_transaction and not sql.upper().startswith("BEGIN"):
                # Auto-BEGIN
                self.execute_internal("BEGIN")
                self.in_transaction = True
        elif self.protocol == "mysql":
            # Explicit BEGIN required
            pass
        # Execute query

Type Conversion

Challenge: Different type systems

PostgreSQL:

  • Numeric: int2, int4, int8, numeric, float4, float8
  • Text: text, varchar, char
  • Binary: bytea
  • Temporal: timestamp, timestamptz, date, time
  • JSON: json, jsonb

MySQL:

  • Numeric: TINYINT, SMALLINT, INT, BIGINT, DECIMAL, FLOAT, DOUBLE
  • Text: VARCHAR, CHAR, TEXT
  • Binary: BLOB, BINARY, VARBINARY
  • Temporal: DATETIME, TIMESTAMP, DATE, TIME

SQL Server:

  • Numeric: TINYINT, SMALLINT, INT, BIGINT, DECIMAL, FLOAT, REAL
  • Text: NVARCHAR, VARCHAR, NCHAR, CHAR
  • Binary: VARBINARY, BINARY
  • Temporal: DATETIME, DATETIME2, DATE, TIME

Recommendation:

  • Unified internal type system
  • Dialect-specific mappers (PostgreSQL types ↔ MySQL types ↔ internal)
  • Vector type exposure (BYTEA for PostgreSQL, VARBINARY for MySQL)
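
The dialect mappers can start as a plain lookup table seeded from the type lists above; the internal names ("int64", "text", ...) are illustrative, not from any spec:

```python
# Internal canonical types mapped to each dialect's type names
TYPE_MAP = {
    "int64":     {"postgresql": "int8",        "mysql": "BIGINT",   "sqlserver": "BIGINT"},
    "float64":   {"postgresql": "float8",      "mysql": "DOUBLE",   "sqlserver": "FLOAT"},
    "text":      {"postgresql": "text",        "mysql": "TEXT",     "sqlserver": "NVARCHAR"},
    "binary":    {"postgresql": "bytea",       "mysql": "BLOB",     "sqlserver": "VARBINARY"},
    "timestamp": {"postgresql": "timestamptz", "mysql": "DATETIME", "sqlserver": "DATETIME2"},
}

def dialect_type(internal: str, dialect: str) -> str:
    """Resolve an internal type name to the target dialect's type name."""
    return TYPE_MAP[internal][dialect]
```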

Error Handling

Challenge: Different error codes and formats

PostgreSQL: SQLSTATE (5-char code) + message

SQLSTATE: 42P01
Message: relation "users" does not exist

MySQL: Error number + SQLSTATE + message

Error: 1146
SQLSTATE: 42S02
Message: Table 'demo.users' doesn't exist

SQL Server: Error number + severity + state + message

Error: 208
Severity: 16
State: 1
Message: Invalid object name 'users'.

Recommendation:

class ErrorMapper:
    def map_error(self, internal_error, target_protocol):
        if target_protocol == "postgresql":
            return PostgreSQLError(
                sqlstate=self._to_sqlstate(internal_error),
                message=internal_error.message
            )
        elif target_protocol == "mysql":
            return MySQLError(
                errno=self._to_mysql_errno(internal_error),
                sqlstate=self._to_sqlstate(internal_error),
                message=internal_error.message
            )
        # etc.
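
The mapper's lookup data can be seeded directly from the "table does not exist" examples above; the internal error kind "table_not_found" is an illustrative name:

```python
# Protocol-specific codes for one internal error kind, taken from the
# examples above (PostgreSQL 42P01, MySQL 1146/42S02, SQL Server 208)
ERROR_TABLE = {
    "table_not_found": {
        "postgresql": {"sqlstate": "42P01"},
        "mysql":      {"errno": 1146, "sqlstate": "42S02"},
        "sqlserver":  {"number": 208, "severity": 16, "state": 1},
    },
}

def map_error(kind: str, protocol: str) -> dict:
    """Look up the protocol-specific representation of an internal error kind."""
    return ERROR_TABLE[kind][protocol]
```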

Async Support

Challenge: Async vs sync drivers

Async Drivers:

  • asyncpg (asyncio native)
  • databricks-sql-connector (Thrift async)

Sync Drivers:

  • psycopg2, PyMySQL, mysql-connector-python, pyodbc, ibm_db

Recommendation:

  • Implement async protocol handlers with asyncio
  • Use thread pool for blocking operations if needed
  • Support concurrent connections per protocol
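
The thread-pool fallback in the second bullet can be as small as asyncio.to_thread (Python 3.9+); here `blocking_query` is a stand-in for a synchronous driver call such as a psycopg2 cursor.execute:

```python
import asyncio

def blocking_query(sql: str):
    """Stand-in for a blocking sync-driver call."""
    return [("row", sql)]

async def run_query(sql: str):
    # Off-load the blocking call to the default thread pool so the
    # event loop keeps serving other protocol connections meanwhile
    return await asyncio.to_thread(blocking_query, sql)

rows = asyncio.run(run_query("SELECT 1"))
```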

Testing Matrix

Required Tests per Driver:

  1. Connection Establishment

    • Valid credentials
    • Invalid credentials
    • TLS/SSL negotiation
    • Timeout handling
  2. Authentication

    • All supported methods (password, JWT, OAuth, etc.)
    • Auth failure scenarios
  3. Query Execution

    • Simple query (SELECT 1)
    • Parameterized query
    • Multiple result sets
    • Empty result set
  4. Prepared Statements

    • Prepare + execute
    • Reuse with different parameters
    • Close prepared statement
  5. Transactions

    • BEGIN, COMMIT, ROLLBACK
    • Auto-commit mode
    • Transaction isolation
  6. Data Types

    • Integer, float, decimal
    • Text, varchar
    • Binary, bytea/blob
    • Timestamp, date, time
    • NULL handling
  7. Error Handling

    • Invalid SQL
    • Constraint violation
    • Connection loss
    • Timeout
  8. Cursors (where applicable)

    • Server-side cursor
    • Fetch rows
    • Cursor close
  9. Connection Pooling

    • Acquire/release
    • Max connections
    • Connection reuse
  10. Vector Operations (Pinecone)

    • Upsert, query, delete
    • Metadata filtering
    • Namespaces

CI/CD Integration:

test-matrix:
  - driver: psycopg2
    version: 2.9.9
    protocol: postgresql
  - driver: asyncpg
    version: 0.29.0
    protocol: postgresql
  - driver: PyMySQL
    version: 1.1.0
    protocol: mysql
  - driver: mysql-connector-python
    version: 8.0.33
    protocol: mysql
  # ... etc.
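
The same matrix can live as plain Python data so a test runner (pytest parametrization, a custom harness, etc.) can iterate it; a sketch with the four entries shown above:

```python
# CI matrix expressed as test parameters; a runner would execute the
# per-driver suite against HeliosDB for each entry
TEST_MATRIX = [
    {"driver": "psycopg2", "version": "2.9.9", "protocol": "postgresql"},
    {"driver": "asyncpg", "version": "0.29.0", "protocol": "postgresql"},
    {"driver": "PyMySQL", "version": "1.1.0", "protocol": "mysql"},
    {"driver": "mysql-connector-python", "version": "8.0.33", "protocol": "mysql"},
]

def drivers_for(protocol: str):
    """All matrix entries targeting one wire protocol."""
    return [case["driver"] for case in TEST_MATRIX if case["protocol"] == protocol]
```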

Summary & Next Steps

Key Findings

  1. Parameter Style Diversity: 5 different styles (%s, $1, ?, :name, %(name)s)

    • Requires unified parameter adapter
  2. Prepared Statement Complexity: Client-side vs server-side

    • PyMySQL: Client-side escaping
    • mysql-connector-python, asyncpg, oracledb: Server-side caching
    • HeliosDB must support both paradigms
  3. Connection Pooling: Mixed built-in and external

    • Server-side pooling recommended for consistency
  4. Transaction Semantics: Auto-BEGIN vs manual BEGIN

    • Protocol-specific transaction managers needed
  5. Type Systems: Rich but incompatible

    • Unified internal type system with dialect mappers
  6. Async/Sync Split: asyncpg vs psycopg2

    • Async handlers with thread pool fallback

Implementation Priorities

Phase 1 (PostgreSQL + MySQL):

  1. Implement psycopg2 compatibility (%s parameters)
  2. Implement asyncpg compatibility ($1 parameters)
  3. Implement PyMySQL compatibility (client-side escaping)
  4. Implement mysql-connector-python compatibility (server-side prepared)
  5. Unified parameter adapter
  6. Server-side connection pooling

Phase 1.5 (HTTP APIs):

  1. Implement snowflake-connector-python compatibility (JWT auth)
  2. Implement databricks-sql-connector compatibility (Thrift)
  3. Implement pinecone-client compatibility (vector ops → SQL translation)

Phase 2 (SQL Server + DB2):

  1. Implement pyodbc compatibility (? parameters)
  2. Implement ibm_db compatibility (DRDA + EBCDIC handling)

Phase 3 (Oracle):

  1. Implement oracledb compatibility (:name parameters)
  2. Thin mode focus (pure Python reference)

Testing Strategy

Per-Driver Integration Tests:

  • Connection → Auth → Query → Prepared Statement → Transaction → Cursor
  • Run against each Python client library
  • Validate parameter binding, type conversion, error mapping

Compatibility Matrix:

  • Track feature support per driver
  • Mark unsupported features clearly in docs
  • Provide migration guides (expected differences)

Document Status: Complete Coverage: All target Python clients with behavioral analysis Next Steps: Coordinate with architect on unified parameter/type/error handling design