Skip to content

Feature 12: GraphQL-to-SQL Gateway

Feature 12: GraphQL-to-SQL Gateway

Priority: Medium | Complexity: High | Phase: 4 (Differentiation)


Overview

Problem Statement

Modern applications increasingly use GraphQL:

  • Frontend developers prefer GraphQL’s flexibility
  • Reduces over-fetching and under-fetching
  • Strong typing and introspection
  • Unified API for multiple backends

But database access still requires SQL:

  • ORMs add latency and complexity
  • Hand-written resolvers are error-prone
  • N+1 query problems in naive implementations
  • Separate GraphQL server needed

Solution

Implement a GraphQL-to-SQL gateway that automatically generates efficient queries:

┌─────────────────────────────────────────────────┐
│ GRAPHQL GATEWAY │
│ │
GraphQL ─────────►│ ┌──────────────────────────────────────────┐ │
Query │ │ 1. Parse & Validate GraphQL │ │
│ │ - Schema validation │ │
│ │ - Authorization checks │ │
│ └──────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────┐ │
│ │ 2. Query Planning │ │
│ │ - Resolve relationships │ │
│ │ - Detect N+1 patterns │ │
│ │ - Batch related queries │ │
│ └──────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────┐ │
│ │ 3. SQL Generation │ │
│ │ - JOINs for relationships │ │
│ │ - Lateral subqueries │ │
│ │ - Aggregations │ │
│ └──────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────┐ │
│ │ 4. Execute & Shape Response │ │
│ │ - Execute SQL │ │
│ │ - Transform to GraphQL shape │ │
│ └──────────────────────────────────────────┘ │
└─────────────────────────────────────────────────┘

Architecture

GraphQL Engine

use async_graphql::{Schema, Object, Context, Result};
pub struct GraphQLEngine {
/// GraphQL schema
schema: Schema<Query, Mutation, Subscription>,
/// Schema introspection
introspector: SchemaIntrospector,
/// SQL generator
sql_generator: SqlGenerator,
/// Query executor
executor: Arc<QueryExecutor>,
/// Cache
cache: Arc<QueryCache>,
}
impl GraphQLEngine {
pub async fn execute(&self, request: GraphQLRequest) -> GraphQLResponse {
// 1. Parse and validate
let document = self.parse(&request.query)?;
self.validate(&document)?;
// 2. Check authorization
self.authorize(&document, &request.context)?;
// 3. Plan query execution
let plan = self.plan(&document)?;
// 4. Generate SQL
let sql_queries = self.sql_generator.generate(&plan)?;
// 5. Execute (with batching)
let results = self.execute_batch(&sql_queries).await?;
// 6. Shape response
self.shape_response(&plan, &results)
}
}

Schema Introspector

pub struct SchemaIntrospector {
/// Database connection
db: Arc<DatabasePool>,
}
impl SchemaIntrospector {
/// Introspect database schema and generate GraphQL schema
pub async fn introspect(&self) -> Result<GraphQLSchema> {
// 1. Get tables
let tables = self.get_tables().await?;
// 2. Get columns
let columns = self.get_columns().await?;
// 3. Get relationships (foreign keys)
let relationships = self.get_relationships().await?;
// 4. Generate GraphQL types
let types = self.generate_types(&tables, &columns)?;
// 5. Generate queries
let queries = self.generate_queries(&tables, &relationships)?;
// 6. Generate mutations
let mutations = self.generate_mutations(&tables)?;
Ok(GraphQLSchema {
types,
queries,
mutations,
relationships,
})
}
fn generate_types(&self, tables: &[Table], columns: &[Column]) -> Vec<GraphQLType> {
tables.iter().map(|table| {
let fields: Vec<_> = columns.iter()
.filter(|c| c.table == table.name)
.map(|c| GraphQLField {
name: c.name.to_camel_case(),
graphql_type: self.sql_to_graphql_type(&c.data_type),
nullable: c.nullable,
})
.collect();
GraphQLType {
name: table.name.to_pascal_case(),
fields,
}
}).collect()
}
}

SQL Generator

pub struct SqlGenerator {
/// Relationship resolver
relationships: Arc<RelationshipResolver>,
}
impl SqlGenerator {
/// Generate optimized SQL from GraphQL query
pub fn generate(&self, plan: &QueryPlan) -> Vec<SqlQuery> {
match plan {
QueryPlan::Single { selection, filters, limit, offset } => {
vec![self.generate_single(selection, filters, limit, offset)]
}
QueryPlan::Relationship { parent, child, relation_type } => {
// Use JOIN or lateral subquery based on cardinality
match relation_type {
RelationType::OneToOne | RelationType::ManyToOne => {
vec![self.generate_with_join(parent, child)]
}
RelationType::OneToMany | RelationType::ManyToMany => {
vec![self.generate_with_lateral(parent, child)]
}
}
}
QueryPlan::Batch { queries } => {
// Batch multiple queries into single statement
self.generate_batch(queries)
}
}
}
fn generate_single(&self, selection: &Selection, filters: &[Filter], limit: &Option<u32>, offset: &Option<u32>) -> SqlQuery {
let columns = self.resolve_columns(selection);
let table = selection.table_name();
let where_clause = self.build_where(filters);
let mut sql = format!(
"SELECT {} FROM {}",
columns.join(", "),
table
);
if !where_clause.is_empty() {
sql.push_str(&format!(" WHERE {}", where_clause));
}
if let Some(l) = limit {
sql.push_str(&format!(" LIMIT {}", l));
}
if let Some(o) = offset {
sql.push_str(&format!(" OFFSET {}", o));
}
SqlQuery { sql, params: vec![] }
}
fn generate_with_join(&self, parent: &Selection, child: &Selection) -> SqlQuery {
// Example: user { posts { ... } }
// SELECT u.*, p.* FROM users u JOIN posts p ON p.user_id = u.id
let parent_alias = "p";
let child_alias = "c";
let join_condition = self.get_join_condition(parent, child);
SqlQuery {
sql: format!(
"SELECT {}.*, {}.* FROM {} {} JOIN {} {} ON {}",
parent_alias,
child_alias,
parent.table_name(),
parent_alias,
child.table_name(),
child_alias,
join_condition
),
params: vec![],
}
}
fn generate_with_lateral(&self, parent: &Selection, child: &Selection) -> SqlQuery {
// For one-to-many, use LATERAL to avoid cartesian explosion
// SELECT u.*, LATERAL (SELECT array_agg(p.*) FROM posts p WHERE p.user_id = u.id) AS posts
// FROM users u
SqlQuery {
sql: format!(
"SELECT {}, LATERAL (
SELECT json_agg({}.*)
FROM {} {}
WHERE {}.{} = {}.{}
) AS {}
FROM {} {}",
self.resolve_columns(&parent).join(", "),
"c",
child.table_name(), "c",
"c", child.foreign_key(), "p", parent.primary_key(),
child.field_name(),
parent.table_name(), "p"
),
params: vec![],
}
}
}

DataLoader for N+1 Prevention

pub struct DataLoader<K, V> {
/// Batch load function
loader: Arc<dyn Fn(Vec<K>) -> HashMap<K, V> + Send + Sync>,
/// Pending keys
pending: Mutex<Vec<K>>,
/// Cached results
cache: DashMap<K, V>,
/// Batch window
batch_window: Duration,
}
impl<K: Hash + Eq + Clone, V: Clone> DataLoader<K, V> {
pub async fn load(&self, key: K) -> Option<V> {
// Check cache
if let Some(value) = self.cache.get(&key) {
return Some(value.clone());
}
// Add to pending batch
self.pending.lock().push(key.clone());
// Wait for batch window
tokio::time::sleep(self.batch_window).await;
// Execute batch load
let keys: Vec<_> = self.pending.lock().drain(..).collect();
let results = (self.loader)(keys);
// Cache results
for (k, v) in &results {
self.cache.insert(k.clone(), v.clone());
}
results.get(&key).cloned()
}
}

API Specification

Configuration (heliosproxy.toml)

[graphql]
enabled = true
endpoint = "/graphql"
playground = true
introspection = true
# Schema generation
[graphql.schema]
auto_generate = true
refresh_interval = "5m"
# Table configuration
[[graphql.tables]]
name = "users"
graphql_name = "User"
exclude_columns = ["password_hash"]
max_depth = 3
[[graphql.tables]]
name = "posts"
graphql_name = "Post"
# Custom relationships
[[graphql.relationships]]
name = "author"
from_table = "posts"
to_table = "users"
from_column = "user_id"
to_column = "id"
type = "many_to_one"
# Query complexity limits
[graphql.limits]
max_depth = 10
max_complexity = 1000
max_aliases = 10
# Batching
[graphql.batching]
enabled = true
window_ms = 10
max_batch_size = 100
# Caching
[graphql.caching]
enabled = true
default_ttl = "60s"

GraphQL Endpoint

# Auto-generated schema from database
type User {
id: ID!
name: String!
email: String!
createdAt: DateTime!
posts: [Post!]!
postsCount: Int!
}
type Post {
id: ID!
title: String!
content: String!
author: User!
createdAt: DateTime!
}
type Query {
user(id: ID!): User
users(limit: Int, offset: Int, where: UserFilter): [User!]!
post(id: ID!): Post
posts(limit: Int, offset: Int, where: PostFilter): [Post!]!
}
type Mutation {
createUser(input: CreateUserInput!): User!
updateUser(id: ID!, input: UpdateUserInput!): User!
deleteUser(id: ID!): Boolean!
createPost(input: CreatePostInput!): Post!
}
input UserFilter {
id: IDFilter
name: StringFilter
email: StringFilter
createdAt: DateTimeFilter
AND: [UserFilter!]
OR: [UserFilter!]
}
input StringFilter {
eq: String
ne: String
contains: String
startsWith: String
endsWith: String
in: [String!]
}

Example Queries

# Simple query
query GetUser {
user(id: "123") {
name
email
}
}
# Generated SQL:
# SELECT name, email FROM users WHERE id = '123'
# Nested query with relationship
query GetUserWithPosts {
user(id: "123") {
name
posts {
title
content
}
}
}
# Generated SQL (using lateral):
# SELECT u.name,
# LATERAL (SELECT json_agg(p.*) FROM posts p WHERE p.user_id = u.id) AS posts
# FROM users u WHERE u.id = '123'
# Filtered query
query GetActiveUsers {
users(where: { email: { endsWith: "@example.com" } }, limit: 10) {
id
name
postsCount
}
}
# Generated SQL:
# SELECT u.id, u.name, (SELECT COUNT(*) FROM posts p WHERE p.user_id = u.id) AS posts_count
# FROM users u WHERE u.email LIKE '%@example.com' LIMIT 10

AI/Agent Innovations

1. Natural Language to GraphQL

Allow natural language queries:

# Extension for NL queries
query NaturalLanguage {
ask(question: "Show me all users who posted in the last week") {
... on User {
name
posts {
title
}
}
}
}
pub struct NLToGraphQL {
/// LLM for query generation
llm: Arc<LLMClient>,
/// Schema context
schema: Arc<GraphQLSchema>,
}
impl NLToGraphQL {
pub async fn translate(&self, question: &str) -> GraphQLQuery {
let prompt = format!(
"Given this GraphQL schema:\n{}\n\nGenerate a GraphQL query for: {}",
self.schema.to_sdl(),
question
);
let response = self.llm.complete(&prompt).await?;
self.parse_graphql(&response)
}
}

2. RAG Pipeline GraphQL

GraphQL interface for RAG operations:

# RAG-specific queries
type Query {
# Semantic search
searchDocuments(
query: String!
limit: Int = 10
threshold: Float = 0.7
): [DocumentMatch!]!
# RAG retrieval
retrieve(
question: String!
topK: Int = 5
): RetrievalResult!
}
type DocumentMatch {
document: Document!
score: Float!
chunk: String!
}
type RetrievalResult {
chunks: [Chunk!]!
context: String!
sources: [Source!]!
}

3. Agent Tool Interface

GraphQL as universal tool interface:

# Agent tool schema
type Mutation {
# Execute agent action
executeAction(
action: String!
parameters: JSON!
): ActionResult!
}
type ActionResult {
success: Boolean!
result: JSON
error: String
}
# Example usage by agent:
mutation ExecuteTool {
executeAction(
action: "database_query"
parameters: { table: "users", filter: { active: true } }
) {
success
result
}
}

4. Conversation Context API

GraphQL for conversation management:

type Query {
conversation(id: ID!): Conversation
conversations(userId: ID!, limit: Int): [Conversation!]!
}
type Mutation {
createConversation(userId: ID!): Conversation!
addTurn(conversationId: ID!, input: TurnInput!): Turn!
}
type Conversation {
id: ID!
turns: [Turn!]!
context: JSON
createdAt: DateTime!
}
type Turn {
role: Role!
content: String!
timestamp: DateTime!
}

HeliosDB-Lite Integration

1. Branch-Aware GraphQL

GraphQL across branches:

# Query specific branch
query GetBranchData {
users @branch(name: "development") {
id
name
}
}
# Compare branches
query CompareBranches {
main: users @branch(name: "main") { id name }
dev: users @branch(name: "development") { id name }
}

2. Time-Travel GraphQL

Historical queries via GraphQL:

# Query as of specific time
query GetHistoricalData {
users @asOf(timestamp: "2025-12-01T00:00:00Z") {
id
name
}
}
# Get change history
query GetHistory {
userHistory(id: "123", since: "2025-01-01") {
version
changes
timestamp
}
}

3. Vector Search GraphQL

Vector operations in GraphQL:

type Query {
# Vector similarity search
similarDocuments(
embedding: [Float!]!
limit: Int = 10
metric: DistanceMetric = COSINE
): [DocumentMatch!]!
# Combined text + vector search
hybridSearch(
text: String!
embedding: [Float!]
alpha: Float = 0.5
): [SearchResult!]!
}
enum DistanceMetric {
COSINE
EUCLIDEAN
DOT_PRODUCT
}

4. Sync Mode Directives

Control consistency via directives:

# Strong consistency read
query CriticalRead {
account(id: "123") @consistency(level: STRONG) {
balance
}
}
# Eventual consistency (faster)
query AnalyticsRead {
metrics @consistency(level: EVENTUAL) {
totalUsers
activeToday
}
}

Implementation Notes

File Locations

src/proxy/
├── graphql/
│ ├── mod.rs # Public API
│ ├── engine.rs # GraphQLEngine
│ ├── introspector.rs # SchemaIntrospector
│ ├── sql_generator.rs # SqlGenerator
│ ├── dataloader.rs # DataLoader
│ ├── resolver.rs # Field resolvers
│ ├── validation.rs # Query validation
│ └── metrics.rs # GraphQL metrics

Key Considerations

  1. N+1 Prevention: Use DataLoader pattern for batching.

  2. Query Complexity: Limit depth and complexity to prevent abuse.

  3. Caching: Cache compiled queries and schema.

  4. Subscriptions: Consider WebSocket support for real-time.

  5. Authorization: Per-field authorization checks.

Query Complexity Calculation

pub fn calculate_complexity(query: &Document, schema: &Schema) -> u32 {
let mut complexity = 0;
for field in query.fields() {
// Base cost per field
complexity += 1;
// Multiplier for lists
if field.is_list() {
let limit = field.argument("limit").unwrap_or(100);
complexity *= limit;
}
// Recursive cost for nested fields
complexity += calculate_complexity(&field.selection_set, schema);
}
complexity
}

Performance Targets

MetricTargetMeasurement
Query parsing<1msp99
SQL generation<5msp99
Response shaping<2msp99
Introspection<100msfull schema