Skip to content

Schema Generation API

Schema Generation API

Auto-generate database schemas from JSON samples - Perfect for AI agents and rapid prototyping


Overview

The Schema Generation API analyzes your JSON data samples and automatically generates:

  • SQL DDL statements ready for execution
  • Column types with smart inference (VARCHAR, INTEGER, VECTOR, JSONB, etc.)
  • Constraints (PRIMARY KEY, UNIQUE, NOT NULL)
  • Indexes recommendations
  • Confidence scores for each inference

Available on Free Tier - No credit card required.


Quick Start

1. Get Your Token

Terminal window
# Sign up for free
curl -X POST http://localhost:6543/auth/v1/signup \
-H "Content-Type: application/json" \
-d '{"email": "dev@example.com", "password": "secure123"}'
# Get token
curl -X POST http://localhost:6543/auth/v1/token \
-H "Content-Type: application/json" \
-d '{"email": "dev@example.com", "password": "secure123"}'

2. Infer Schema from JSON

Terminal window
curl -X POST http://localhost:6543/api/v1/schema/infer \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"samples": [
{
"id": 1,
"name": "Widget",
"price": 29.99,
"in_stock": true,
"tags": ["electronics", "gadget"]
},
{
"id": 2,
"name": "Gadget Pro",
"price": 49.99,
"in_stock": false,
"tags": ["electronics", "premium"]
}
],
"table_name": "products"
}'

3. Get Production-Ready DDL

{
"success": true,
"data": {
"table_name": "products",
"columns": [
{"name": "id", "sql_type": "INTEGER", "nullable": false, "confidence": 0.95},
{"name": "name", "sql_type": "VARCHAR(20)", "nullable": false, "confidence": 0.95},
{"name": "price", "sql_type": "NUMERIC", "nullable": false, "confidence": 0.95},
{"name": "in_stock", "sql_type": "BOOLEAN", "nullable": false, "confidence": 0.95},
{"name": "tags", "sql_type": "JSON", "nullable": false, "confidence": 0.90}
],
"primary_key": ["id"],
"ddl": "CREATE TABLE products (\n id INTEGER NOT NULL,\n name VARCHAR(20) NOT NULL,\n price NUMERIC NOT NULL,\n in_stock BOOLEAN NOT NULL,\n tags JSON NOT NULL,\n PRIMARY KEY (id)\n);",
"confidence": 0.94,
"warnings": []
}
}

4. Execute the DDL

Terminal window
curl -X POST http://localhost:6543/api/v1/query \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"sql": "CREATE TABLE products (\n id INTEGER NOT NULL,\n name VARCHAR(20) NOT NULL,\n price NUMERIC NOT NULL,\n in_stock BOOLEAN NOT NULL,\n tags JSON NOT NULL,\n PRIMARY KEY (id)\n);"}'

API Endpoints

Infer Schema

POST /api/v1/schema/infer

Infer schema from JSON samples.

Request Body:

FieldTypeRequiredDescription
samplesarrayYesArray of JSON objects (1-1000)
table_namestringNoTable name (default: “inferred_table”)
optionsobjectNoInference options

Options:

OptionTypeDefaultDescription
detect_nullablebooleantrueDetect nullable columns
detect_uniquebooleantrueDetect unique columns
detect_primary_keybooleantrueSuggest primary key
detect_foreign_keysbooleanfalseDetect foreign key patterns
suggest_indexesbooleantrueSuggest indexes
prefer_narrow_typesbooleantruePrefer INT over BIGINT
max_varchar_lengthnumber255Max VARCHAR before TEXT
detect_vectorsbooleantrueDetect vector columns
detect_jsonbooleantrueDetect JSON columns

Example with Options:

{
"samples": [...],
"table_name": "events",
"options": {
"prefer_narrow_types": false,
"max_varchar_length": 500,
"detect_vectors": true
}
}

Batch Inference

POST /api/v1/schema/infer/batch

Infer schemas for multiple tables at once.

Request:

{
"tables": [
{
"name": "users",
"samples": [
{"id": 1, "name": "Alice", "email": "alice@example.com"},
{"id": 2, "name": "Bob", "email": "bob@example.com"}
]
},
{
"name": "orders",
"samples": [
{"id": 1, "user_id": 1, "total": 99.99},
{"id": 2, "user_id": 2, "total": 149.99}
]
}
],
"detect_relationships": true
}

Response:

{
"success": true,
"data": {
"schemas": [
{"table_name": "users", "ddl": "...", "confidence": 0.9},
{"table_name": "orders", "ddl": "...", "confidence": 0.9}
],
"relationships": [
{
"from_table": "orders",
"from_column": "user_id",
"to_table": "users",
"to_column": "id",
"relationship_type": "many-to-one",
"confidence": 0.85
}
],
"combined_ddl": "CREATE TABLE users (...);\n\nCREATE TABLE orders (...);\n\nALTER TABLE orders ADD CONSTRAINT fk_user FOREIGN KEY (user_id) REFERENCES users(id);"
}
}

Validate Schema

POST /api/v1/schema/validate

Validate DDL syntax and best practices.

{
"ddl": "CREATE TABLE users (id INT, name TEXT)"
}

Response:

{
"success": true,
"data": {
"valid": true,
"errors": [],
"warnings": [
{"code": "MISSING_PK", "message": "No PRIMARY KEY defined"},
{"code": "MISSING_NOT_NULL", "message": "Consider adding NOT NULL constraints"}
],
"suggestions": [
"Add PRIMARY KEY constraint to 'id' column",
"Consider VARCHAR(n) instead of TEXT for bounded strings"
]
}
}

Compare Schemas

POST /api/v1/schema/compare

Compare two schema versions and generate migration.

{
"source": "CREATE TABLE users (id INT, name TEXT)",
"target": "CREATE TABLE users (id INT, name VARCHAR(100), email TEXT)",
"generate_migration": true
}

Response:

{
"success": true,
"data": {
"differences": [
{"diff_type": "column_modified", "object": "name", "source_state": "TEXT", "target_state": "VARCHAR(100)"},
{"diff_type": "column_added", "object": "email", "target_state": "TEXT"}
],
"forward_migration": "ALTER TABLE users ALTER COLUMN name TYPE VARCHAR(100);\nALTER TABLE users ADD COLUMN email TEXT;",
"backward_migration": "ALTER TABLE users DROP COLUMN email;\nALTER TABLE users ALTER COLUMN name TYPE TEXT;",
"is_compatible": true
}
}

Type Detection

Supported Types

JSON ValueDetected SQL Type
true/falseBOOLEAN
123 (integer)INTEGER or BIGINT
123.45 (decimal)NUMERIC
"hello" (short)VARCHAR(n)
"very long..."TEXT
"2024-01-15"DATE*
"2024-01-15T10:00:00Z"TIMESTAMP*
[0.1, 0.2, 0.3] (floats)VECTOR(n)
["a", "b"] (mixed)JSON
{"key": "value"}JSONB
nullmarks column nullable

*Date/timestamp detection based on pattern matching

Vector Detection

Arrays of numbers are detected as vectors:

{
"samples": [
{
"id": 1,
"content": "Hello world",
"embedding": [0.1, 0.2, 0.3, 0.4, 0.5]
}
]
}

Result:

CREATE TABLE documents (
id INTEGER NOT NULL,
content VARCHAR(22) NOT NULL,
embedding VECTOR(5) NOT NULL,
PRIMARY KEY (id)
);

For AI Agents

Structured Output for LLMs

Schema inference is designed for AI coding assistants:

# AI Agent workflow
import requests
def create_table_from_data(data_samples, table_name):
"""Let AI infer schema from example data."""
# 1. Infer schema
response = requests.post(
"http://localhost:6543/api/v1/schema/infer",
headers={"Authorization": f"Bearer {TOKEN}"},
json={
"samples": data_samples,
"table_name": table_name
}
)
schema = response.json()["data"]
# 2. Review confidence
if schema["confidence"] < 0.7:
print(f"Low confidence: {schema['warnings']}")
return None
# 3. Execute DDL
requests.post(
"http://localhost:6543/api/v1/query",
headers={"Authorization": f"Bearer {TOKEN}"},
json={"sql": schema["ddl"]}
)
return schema
# Usage by AI agent
samples = [
{"user_id": 1, "action": "click", "timestamp": "2024-01-15T10:00:00Z"},
{"user_id": 2, "action": "purchase", "amount": 99.99}
]
create_table_from_data(samples, "user_events")

Agent-Friendly Response Format

{
"success": true,
"data": {
"table_name": "user_events",
"ddl": "...",
"columns": [...],
"confidence": 0.85,
"warnings": [],
"_agent_hints": {
"ready_to_execute": true,
"suggested_indexes": ["user_id"],
"potential_issues": []
}
}
}

Free Tier Limits

FeatureFree TierStarterPro
Schema inferences/day1001,000Unlimited
Samples per request1005001,000
Batch tables52050
Relationship detectionYesYesYes
Migration generationYesYesYes

Best Practices

1. Provide Diverse Samples

// Good: Shows variety
{
"samples": [
{"price": 29.99, "discount": null},
{"price": 0.99, "discount": 0.10},
{"price": 999.99, "discount": 0.50}
]
}
// Bad: Single sample
{
"samples": [
{"price": 29.99}
]
}

2. Include Edge Cases

{
"samples": [
{"name": "A", "tags": ["a"]}, // Short string, single tag
{"name": "Very Long Product Name Here", "tags": ["a", "b", "c"]}, // Long string, multiple tags
{"name": null, "tags": []} // Null case
]
}

3. Use Meaningful Table Names

// Good
{"table_name": "user_sessions"}
// Bad
{"table_name": "table1"}

Examples

E-commerce Product Catalog

Terminal window
curl -X POST http://localhost:6543/api/v1/schema/infer \
-H "Authorization: Bearer $TOKEN" \
-d '{
"samples": [
{
"sku": "PROD-001",
"name": "Wireless Headphones",
"description": "High-quality wireless headphones with noise cancellation",
"price": 149.99,
"currency": "USD",
"in_stock": true,
"stock_count": 250,
"categories": ["electronics", "audio"],
"embedding": [0.1, 0.2, 0.3, 0.4, 0.5],
"created_at": "2024-01-15T00:00:00Z"
}
],
"table_name": "products"
}'

User Analytics Events

Terminal window
curl -X POST http://localhost:6543/api/v1/schema/infer \
-H "Authorization: Bearer $TOKEN" \
-d '{
"samples": [
{
"event_id": "evt_123abc",
"user_id": "usr_456def",
"event_type": "page_view",
"properties": {"page": "/home", "referrer": "google.com"},
"timestamp": "2024-01-15T10:30:00Z",
"session_id": "sess_789ghi"
}
],
"table_name": "analytics_events"
}'

RAG Document Store

Terminal window
curl -X POST http://localhost:6543/api/v1/schema/infer \
-H "Authorization: Bearer $TOKEN" \
-d '{
"samples": [
{
"doc_id": "doc-001",
"title": "Introduction to Machine Learning",
"content": "Machine learning is a subset of artificial intelligence...",
"embedding": [/* 1536 dimensions for OpenAI embeddings */],
"metadata": {
"source": "textbook",
"chapter": 1,
"author": "Dr. Smith"
},
"chunk_index": 0,
"total_chunks": 15
}
],
"table_name": "documents",
"options": {
"detect_vectors": true
}
}'

Error Handling

Common Errors

ErrorCauseSolution
EMPTY_SAMPLESNo samples providedAdd at least 1 sample
INVALID_JSONMalformed JSONValidate JSON syntax
TYPE_CONFLICTInconsistent typesEnsure consistent types across samples
TOO_MANY_SAMPLESExceeded limitReduce to plan limit

Example Error Response

{
"success": false,
"error": {
"code": "TYPE_CONFLICT",
"message": "Column 'price' has conflicting types: INTEGER, TEXT",
"details": {
"column": "price",
"sample_1_type": "INTEGER",
"sample_3_type": "TEXT"
}
}
}

Next Steps


Questions? API Reference | Join Discord