HTTP/REST Gateway for HeliosDB Protocols

Overview

The HTTP gateway (http_gateway.rs) implements Silver-level compatibility for three major cloud database REST APIs:

  1. Snowflake REST API - Session management and async query execution
  2. Databricks SQL API - SQL statement execution with polling
  3. Pinecone Vector API - Vector operations (upsert, query, fetch, delete)

Architecture

┌─────────────────────────────────────────────────────┐
│                HTTP Gateway (Hyper)                 │
│                                                     │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────┐   │
│  │  Snowflake   │  │  Databricks  │  │ Pinecone │   │
│  │   Handler    │  │   Handler    │  │ Handler  │   │
│  └──────┬───────┘  └──────┬───────┘  └────┬─────┘   │
│         │                 │               │         │
│  ┌──────┴─────────────────┴───────────────┴─────┐   │
│  │             Authentication Layer             │   │
│  │        (JWT / Bearer Token / API Key)        │   │
│  └─────────┬───────────────────────┬────────────┘   │
│            │                       │                │
│  ┌─────────┴────────┐    ┌─────────┴────────┐       │
│  │   Query Store    │    │   Vector Store   │       │
│  │   (Async Exec)   │    │   (ANN Search)   │       │
│  └─────────┬────────┘    └─────────┬────────┘       │
└────────────┼───────────────────────┼────────────────┘
             │                       │
             ├───────────────────────┘
┌─────────────────────────────────────────────────────┐
│               HeliosDB Compute Engine               │
│          (Query Execution & Vector Search)          │
└─────────────────────────────────────────────────────┘

Snowflake REST API

Endpoints

POST /snowflake/session

Create a new Snowflake session.

Request:

{
"database": "mydb",
"schema": "public",
"warehouse": "compute_wh",
"role": "analyst"
}

Response (200 OK):

{
"session_id": "550e8400-e29b-41d4-a716-446655440000",
"database": "mydb",
"schema": "public",
"warehouse": "compute_wh",
"role": "analyst"
}

POST /snowflake/queries

Submit a query for async execution.

Request:

{
"sql": "SELECT * FROM users WHERE active = true"
}

Response (202 Accepted):

{
"query_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
"status": "QUEUED"
}

GET /snowflake/queries/{id}

Poll query execution status.

Response (200 OK):

{
"query_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
"status": "RUNNING",
"progress": 50
}

Status values: QUEUED, RUNNING, COMPLETED, CANCELLED, FAILED
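A client typically polls this endpoint until the query reaches a status that will not change again. The sketch below shows one way to structure that loop. `QueryStatus`, `poll_until_done`, and the `fetch_status` closure are hypothetical names, not part of the gateway; the closure stands in for an HTTP GET to /snowflake/queries/{id}, and treating COMPLETED, CANCELLED, and FAILED as terminal is an assumption.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Status values documented for GET /snowflake/queries/{id}.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum QueryStatus {
    Queued,
    Running,
    Completed,
    Cancelled,
    Failed,
}

impl QueryStatus {
    /// Parses the wire string; returns None for anything unrecognized.
    pub fn parse(s: &str) -> Option<Self> {
        match s {
            "QUEUED" => Some(Self::Queued),
            "RUNNING" => Some(Self::Running),
            "COMPLETED" => Some(Self::Completed),
            "CANCELLED" => Some(Self::Cancelled),
            "FAILED" => Some(Self::Failed),
            _ => None,
        }
    }

    /// Assumption: these three statuses never change once reached.
    pub fn is_terminal(self) -> bool {
        matches!(self, Self::Completed | Self::Cancelled | Self::Failed)
    }
}

/// Calls `fetch_status` until it reports a terminal status.
/// Returns None if `max_attempts` polls pass without one.
pub fn poll_until_done<F>(
    mut fetch_status: F,
    max_attempts: usize,
    delay: Duration,
) -> Option<QueryStatus>
where
    F: FnMut() -> QueryStatus,
{
    for _ in 0..max_attempts {
        let status = fetch_status();
        if status.is_terminal() {
            return Some(status);
        }
        sleep(delay);
    }
    None
}
```

Only a `Some(QueryStatus::Completed)` result means /snowflake/queries/{id}/result is worth calling.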

GET /snowflake/queries/{id}/result

Fetch query results. Results are available only once the status is COMPLETED; earlier requests return QUERY_NOT_READY (400).

Response (200 OK):

{
  "query_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
  "columns": [
    {"name": "id", "type": "NUMBER"},
    {"name": "name", "type": "VARCHAR"}
  ],
  "rows": [
    [1, "Alice"],
    [2, "Bob"]
  ],
  "row_count": 2
}

DELETE /snowflake/queries/{id}

Cancel a running query.

Response (200 OK):

{
"query_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
"status": "CANCELLED"
}

Authentication

Snowflake endpoints expect a JWT in the Authorization header:

Authorization: Bearer <jwt_token>

The JWT must contain:

  • sub (subject/user_id)
  • exp (expiration timestamp)
  • iat (issued at timestamp)
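The claim checks can be sketched as follows, assuming the token payload has already been decoded into a struct. `Claims`, `ClaimsError`, `check_claims`, and `unix_now` are hypothetical names. The sketch does not verify the signature, and rejecting a token whose iat lies in the future is an assumption rather than documented gateway behavior.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical struct holding the three required claims.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Claims {
    pub sub: String, // subject / user id
    pub exp: u64,    // expiration, seconds since the Unix epoch
    pub iat: u64,    // issued at, seconds since the Unix epoch
}

#[derive(Debug, PartialEq, Eq)]
pub enum ClaimsError {
    EmptySubject,
    Expired,
    IssuedInFuture,
}

/// Checks already-decoded claims against `now` (Unix seconds) and
/// returns the user id on success. Signature verification is out of scope.
pub fn check_claims(claims: &Claims, now: u64) -> Result<&str, ClaimsError> {
    if claims.sub.is_empty() {
        return Err(ClaimsError::EmptySubject);
    }
    if claims.exp <= now {
        return Err(ClaimsError::Expired);
    }
    // Assumption: a token issued "in the future" is treated as invalid.
    if claims.iat > now {
        return Err(ClaimsError::IssuedInFuture);
    }
    Ok(&claims.sub)
}

/// Current time as Unix seconds; falls back to 0 if the clock is before the epoch.
pub fn unix_now() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0)
}
```

Passing `now` as a parameter keeps the check deterministic in tests; production code would call `check_claims(&claims, unix_now())`.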

Databricks SQL API

Endpoints

POST /dbsql/sql/statements

Execute a SQL statement.

Request:

{
"statement": "SELECT * FROM delta.`/path/to/table`",
"warehouse_id": "abc123"
}

Response (200 OK):

{
  "statement_id": "01234567-89ab-cdef-0123-456789abcdef",
  "status": {
    "state": "PENDING"
  }
}

State values: PENDING, RUNNING, SUCCEEDED, CANCELED, FAILED
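The Databricks state names differ from the Snowflake status names (note CANCELED with one L here versus CANCELLED there). If both handlers share one internal execution state, the translation could look like the sketch below. `ExecState` and its methods are hypothetical, and the pairing of QUEUED with PENDING and COMPLETED with SUCCEEDED is inferred from the two lists, not confirmed by the gateway source.

```rust
/// Hypothetical internal execution state shared by both handlers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecState {
    Waiting,
    Running,
    Done,
    Cancelled,
    Failed,
}

impl ExecState {
    /// Name used in Snowflake-style responses ("status").
    pub fn snowflake_status(self) -> &'static str {
        match self {
            Self::Waiting => "QUEUED",
            Self::Running => "RUNNING",
            Self::Done => "COMPLETED",
            Self::Cancelled => "CANCELLED", // two Ls
            Self::Failed => "FAILED",
        }
    }

    /// Name used in Databricks-style responses ("status.state").
    pub fn databricks_state(self) -> &'static str {
        match self {
            Self::Waiting => "PENDING",
            Self::Running => "RUNNING",
            Self::Done => "SUCCEEDED",
            Self::Cancelled => "CANCELED", // one L
            Self::Failed => "FAILED",
        }
    }
}
```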

GET /dbsql/sql/statements/{id}

Get statement status and results.

Response (200 OK - Running):

{
  "statement_id": "01234567-89ab-cdef-0123-456789abcdef",
  "status": {
    "state": "RUNNING"
  }
}

Response (200 OK - Completed):

{
  "statement_id": "01234567-89ab-cdef-0123-456789abcdef",
  "status": {
    "state": "SUCCEEDED"
  },
  "manifest": {
    "schema": {
      "columns": [
        {"name": "id", "type_name": "INT"},
        {"name": "value", "type_name": "STRING"}
      ]
    },
    "total_row_count": 2
  },
  "result": {
    "data_array": [
      [1, "test1"],
      [2, "test2"]
    ]
  }
}

POST /dbsql/sql/statements/{id}/cancel

Cancel a running statement.

Response (200 OK):

{
"status": "CANCELED"
}

Authentication

Uses personal access tokens (PAT) or OAuth bearer tokens:

Authorization: Bearer <access_token>

Databricks PATs typically start with dapi.

Pinecone Vector API

Endpoints

POST /pinecone/indexes

Create a vector index.

Request:

{
"name": "my-index",
"dimension": 128,
"metric": "cosine"
}

Response (201 Created):

{
"name": "my-index",
"dimension": 128,
"metric": "cosine"
}

POST /pinecone/vectors/upsert

Insert or update vectors.

Request:

{
  "vectors": [
    {
      "id": "vec1",
      "values": [0.1, 0.2, 0.3, ...],
      "metadata": {"category": "A"}
    },
    {
      "id": "vec2",
      "values": [0.4, 0.5, 0.6, ...],
      "metadata": {"category": "B"}
    }
  ]
}

Response (200 OK):

{
"upserted_count": 2
}

POST /pinecone/query

Query for similar vectors using ANN search.

Request:

{
"vector": [0.1, 0.2, 0.3, ...],
"top_k": 10,
"namespace": "default",
"filter": {"category": "A"}
}

Response (200 OK):

{
  "matches": [
    {
      "id": "vec1",
      "score": 0.95,
      "values": [0.1, 0.2, 0.3, ...],
      "metadata": {"category": "A"}
    }
  ],
  "namespace": "default"
}

GET /pinecone/vectors/fetch?ids=vec1,vec2

Fetch vectors by ID.

Response (200 OK):

{
  "vectors": {
    "vec1": {
      "id": "vec1",
      "values": [0.1, 0.2, 0.3, ...],
      "metadata": {"category": "A"}
    },
    "vec2": {
      "id": "vec2",
      "values": [0.4, 0.5, 0.6, ...],
      "metadata": {"category": "B"}
    }
  }
}

DELETE /pinecone/vectors/delete

Delete vectors by ID.

Request:

{
"ids": ["vec1", "vec2"]
}

Response (200 OK):

{
"deleted_count": 2
}

Authentication

Uses API keys in the x-api-key header:

x-api-key: <your_api_key>

API keys must be at least 32 characters long to pass validation.
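A minimal sketch of the length rule combined with a key-to-user lookup is shown below. `KeyTable` and `MIN_API_KEY_LEN` are hypothetical names; the gateway's actual validator may be structured differently.

```rust
use std::collections::HashMap;

/// Minimum key length stated above.
pub const MIN_API_KEY_LEN: usize = 32;

/// Hypothetical key table: API key -> user id.
#[derive(Default)]
pub struct KeyTable {
    users: HashMap<String, String>,
}

impl KeyTable {
    /// Registers a key for a user; refuses keys that are too short.
    pub fn register(&mut self, key: &str, user: &str) -> bool {
        if key.chars().count() < MIN_API_KEY_LEN {
            return false;
        }
        self.users.insert(key.to_string(), user.to_string());
        true
    }

    /// Returns the user for a key, or None if the key is too short or unknown.
    pub fn validate(&self, key: &str) -> Option<&str> {
        if key.chars().count() < MIN_API_KEY_LEN {
            return None;
        }
        self.users.get(key).map(String::as_str)
    }
}
```

The length check runs before the lookup so that obviously malformed keys are rejected without touching the table.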

Error Responses

All endpoints return errors in a consistent format:

{
"error": "Detailed error message",
"code": "ERROR_CODE"
}

Common error codes:

  • UNAUTHORIZED (401) - Missing or invalid authentication
  • NOT_FOUND (404) - Resource not found
  • INVALID_REQUEST (400) - Malformed request
  • QUERY_NOT_READY (400) - Query not yet completed
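The code-to-status table above can be kept in one place with an enum, as in this sketch. `ApiError` and its methods are hypothetical names, and the hand-written JSON escaping is a simplification; a real implementation would more likely use a JSON library.

```rust
/// Hypothetical enum covering the documented error codes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApiError {
    Unauthorized,
    NotFound,
    InvalidRequest,
    QueryNotReady,
}

impl ApiError {
    /// Value of the "code" field.
    pub fn code(self) -> &'static str {
        match self {
            Self::Unauthorized => "UNAUTHORIZED",
            Self::NotFound => "NOT_FOUND",
            Self::InvalidRequest => "INVALID_REQUEST",
            Self::QueryNotReady => "QUERY_NOT_READY",
        }
    }

    /// HTTP status documented for each code.
    pub fn http_status(self) -> u16 {
        match self {
            Self::Unauthorized => 401,
            Self::NotFound => 404,
            Self::InvalidRequest | Self::QueryNotReady => 400,
        }
    }

    /// Renders the body in the documented {"error", "code"} shape.
    pub fn to_json(self, message: &str) -> String {
        format!(
            "{{\"error\":\"{}\",\"code\":\"{}\"}}",
            escape_json(message),
            self.code()
        )
    }
}

/// Escapes quotes, backslashes, and control characters for a JSON string.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}
```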

Implementation Details

Authentication Layer

Three authentication mechanisms:

  1. JwtValidator - Validates JWT tokens for Snowflake

    • Decodes and validates token structure
    • Checks expiration timestamps
    • Extracts user identity from claims
  2. BearerTokenValidator - Validates PAT tokens for Databricks

    • Supports custom token validation
    • Recognizes dapi* token format
  3. ApiKeyValidator - Validates API keys for Pinecone

    • Length-based validation (32+ chars)
    • Simple key-to-user mapping
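Because each API reads its credential from a different place, the gateway has to pick the mechanism from the request path before validating anything. The sketch below illustrates that dispatch step only. `Credential` and `extract_credential` are hypothetical names, headers are modeled as a plain slice of name/value pairs rather than Hyper's header map, and choosing the mechanism by path prefix is an assumption based on the endpoint paths listed earlier.

```rust
/// Hypothetical credential extracted from a request, tagged by mechanism.
#[derive(Debug, PartialEq, Eq)]
pub enum Credential<'a> {
    Jwt(&'a str),         // Snowflake routes
    BearerToken(&'a str), // Databricks routes
    ApiKey(&'a str),      // Pinecone routes
}

/// Case-insensitive header lookup over (name, value) pairs.
fn header<'a>(headers: &[(&str, &'a str)], name: &str) -> Option<&'a str> {
    for (k, v) in headers {
        if k.eq_ignore_ascii_case(name) {
            return Some(*v);
        }
    }
    None
}

/// Returns the token from an "Authorization: Bearer <token>" header.
fn bearer<'a>(headers: &[(&str, &'a str)]) -> Option<&'a str> {
    header(headers, "authorization")?.strip_prefix("Bearer ")
}

/// Picks the credential source from the path prefix.
/// Returns None for unknown routes or when the expected header is missing.
pub fn extract_credential<'a>(
    path: &str,
    headers: &[(&str, &'a str)],
) -> Option<Credential<'a>> {
    if path.starts_with("/snowflake/") {
        bearer(headers).map(Credential::Jwt)
    } else if path.starts_with("/dbsql/") {
        bearer(headers).map(Credential::BearerToken)
    } else if path.starts_with("/pinecone/") {
        header(headers, "x-api-key").map(Credential::ApiKey)
    } else {
        None
    }
}
```

A `None` result would map to the UNAUTHORIZED (401) error; the extracted value would then be handed to the matching validator.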

Query Store

Manages async query execution:

  • Stores query metadata (SQL, user, status)
  • Supports status polling (QUEUED → RUNNING → COMPLETED)
  • Enables query cancellation
  • Thread-safe with RwLock
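The behavior listed above can be sketched with a shared map behind an RwLock. `QueryStoreSketch`, `QueryRecord`, and `Status` are hypothetical names, sequential u64 ids stand in for the UUIDs shown in the API responses, and allowing cancellation only from QUEUED or RUNNING is an assumption.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, RwLock};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Queued,
    Running,
    Completed,
    Cancelled,
    Failed,
}

/// Metadata kept per query.
#[derive(Debug, Clone)]
pub struct QueryRecord {
    pub sql: String,
    pub user: String,
    pub status: Status,
}

/// Cloning shares the same underlying map, so handlers can hold copies.
#[derive(Clone, Default)]
pub struct QueryStoreSketch {
    inner: Arc<RwLock<HashMap<u64, QueryRecord>>>,
    next_id: Arc<AtomicU64>,
}

impl QueryStoreSketch {
    /// Stores a new query in the Queued state and returns its id.
    pub fn submit(&self, sql: &str, user: &str) -> u64 {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        let record = QueryRecord {
            sql: sql.to_string(),
            user: user.to_string(),
            status: Status::Queued,
        };
        self.inner.write().unwrap().insert(id, record);
        id
    }

    /// Status polling takes only a read lock.
    pub fn status(&self, id: u64) -> Option<Status> {
        self.inner.read().unwrap().get(&id).map(|r| r.status)
    }

    /// Used by the executor to advance a query; false if the id is unknown.
    pub fn set_status(&self, id: u64, status: Status) -> bool {
        let mut map = self.inner.write().unwrap();
        match map.get_mut(&id) {
            Some(record) => {
                record.status = status;
                true
            }
            None => false,
        }
    }

    /// Cancels a query that has not finished yet; false otherwise.
    pub fn cancel(&self, id: u64) -> bool {
        let mut map = self.inner.write().unwrap();
        match map.get_mut(&id) {
            Some(r) if matches!(r.status, Status::Queued | Status::Running) => {
                r.status = Status::Cancelled;
                true
            }
            _ => false,
        }
    }
}
```

An RwLock suits this workload because status polls (reads) are expected to outnumber submissions and state changes (writes).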

Vector Store

In-memory vector storage:

  • Multiple indexes with configurable dimensions
  • Cosine similarity search
  • Metadata filtering support
  • CRUD operations (upsert, query, fetch, delete)
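Cosine similarity search with metadata filtering can be sketched as a brute-force scan. This is exact search, not an approximate (ANN) index, and it is not taken from the gateway source. `StoredVector`, `cosine`, and `query_top_k` are hypothetical names, and the filter is assumed to mean exact equality on every listed metadata key.

```rust
use std::collections::HashMap;

/// Hypothetical stored vector with string metadata.
pub struct StoredVector {
    pub id: String,
    pub values: Vec<f32>,
    pub metadata: HashMap<String, String>,
}

/// Cosine similarity: dot(a, b) / (|a| * |b|).
/// Returns None for mismatched dimensions, empty input, or zero-length vectors.
pub fn cosine(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Scores every vector that passes the filter and returns the best `top_k`
/// as (id, score) pairs, highest score first.
pub fn query_top_k<'a>(
    vectors: &'a [StoredVector],
    query: &[f32],
    top_k: usize,
    filter: &HashMap<String, String>,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = vectors
        .iter()
        // Keep vectors whose metadata matches every filter entry.
        .filter(|v| filter.iter().all(|(k, want)| v.metadata.get(k) == Some(want)))
        // Skip vectors whose dimension does not match the query.
        .filter_map(|v| cosine(query, &v.values).map(|s| (v.id.as_str(), s)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(top_k);
    scored
}
```

A full scan costs time proportional to the number of stored vectors per query, which is acceptable for a small in-memory store but is the part an ANN index would replace.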

CORS Support

To enable CORS, add the following to response headers:

.header("access-control-allow-origin", "*")
.header("access-control-allow-methods", "GET, POST, DELETE, OPTIONS")
.header("access-control-allow-headers", "content-type, authorization, x-api-key")

Usage Example

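use heliosdb_protocols::http_gateway::HttpGateway;
use hyper::server::conn::http1;
use tokio::net::TcpListener;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // The gateway is cloned per connection, so it must be cheap to clone.
    let gateway = HttpGateway::new();
    let listener = TcpListener::bind("127.0.0.1:8080").await?;

    loop {
        let (stream, _) = listener.accept().await?;
        let gateway = gateway.clone();

        // Serve each connection on its own task so one slow client
        // does not block the accept loop.
        // Note: on hyper 1.x, a tokio TcpStream must first be wrapped with
        // hyper_util::rt::TokioIo::new(stream) to satisfy hyper's I/O traits.
        tokio::spawn(async move {
            if let Err(e) = http1::Builder::new()
                .serve_connection(stream, gateway)
                .await
            {
                eprintln!("Error serving connection: {}", e);
            }
        });
    }
}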

Testing

Manual Testing with curl

Snowflake Session:

curl -X POST http://localhost:8080/snowflake/session \
-H "Authorization: Bearer <jwt_token>" \
-H "Content-Type: application/json" \
-d '{"database":"mydb","schema":"public","warehouse":"wh","role":"user"}'

Databricks Query:

curl -X POST http://localhost:8080/dbsql/sql/statements \
-H "Authorization: Bearer dapi1234567890ab" \
-H "Content-Type: application/json" \
-d '{"statement":"SELECT 1"}'

Pinecone Upsert:

curl -X POST http://localhost:8080/pinecone/vectors/upsert \
-H "x-api-key: 12345678901234567890123456789012" \
-H "Content-Type: application/json" \
-d '{"vectors":[{"id":"v1","values":[0.1,0.2,0.3]}]}'

Future Enhancements

  1. Integration with heliosdb-compute

    • Replace mock query execution with actual compute engine
    • Support distributed query execution
    • Real result streaming
  2. Production Authentication

    • Proper JWT signature validation (RS256, HS256)
    • OAuth 2.0 integration for Databricks
    • API key rotation and management
  3. Performance Optimization

    • Connection pooling
    • Response caching
    • Query result pagination
    • Streaming large results
  4. Observability

    • Prometheus metrics
    • OpenTelemetry tracing
    • Access logs
  5. Security

    • Rate limiting per user
    • Request size limits
    • SQL injection prevention
    • TLS/HTTPS support

Silver-Level Compatibility

This implementation achieves Silver-level compatibility:

  • Core CRUD operations for all three APIs
  • Authentication mechanisms (JWT, Bearer, API Key)
  • Async query execution with polling
  • Vector similarity search
  • Proper HTTP status codes and error handling
  • JSON request/response format matching
  • ⚠ Limited to common use cases (no advanced features)
  • ⚠ Mock results (integration with compute engine needed)

For Gold-level compatibility, implement:

  • Full query result pagination
  • Advanced filtering and metadata queries
  • Streaming result delivery
  • All optional parameters and edge cases
  • Performance optimizations for production scale