
	---
	license: cc-by-4.0
	library_name: transformers
	tags:
	- text-to-sql
	- code
	- qwen3
	- knowledge-distillation
	datasets:
	- birdsql/bird_mini_dev # Links to the official BIRD Mini-dev dataset
	- craterlabs/struct-sql-data # REPLACE this with your actual dataset ID
	base_model:
	- Qwen/Qwen3-4B-Instruct-2507
	language:
	- en
	---

	# Struct-SQL-4B: Knowledge Distillation with Structured Chain-of-Thought

	Struct-SQL is a specialized Text-to-SQL model based on [Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507). It was trained using a novel Knowledge Distillation (KD) framework that transfers structured reasoning (Query Execution Plans) from a state-of-the-art teacher LLM (GPT-4o) to a smaller student model.

	Unlike standard distillation methods that rely on unstructured Chain-of-Thought (CoT), Struct-SQL learns to generate a formal, logical blueprint (a query plan) before generating the final SQL. This approach significantly reduces syntactic errors and schema hallucinations.

	📄 Paper: [Knowledge Distillation with Structured Chain-of-Thought for Text-to-SQL](https://arxiv.org/abs/2512.17053)
	(Accepted at Canadian AI Conference 2026)


	## Performance

	On the BIRD mini-dev benchmark, Struct-SQL achieves an Execution Accuracy (EX) of 45.0%, which is 8.1 percentage points above the unstructured CoT distillation baseline (36.9%).

	| Model | Distillation Method | Execution Accuracy (EX) |
	|:---|:---|:---|
	| Struct-SQL (Ours) | Structured QP-CoT | 45.0% |
	| ReasonSQL Baseline | Unstructured CoT | 36.9% |
	| FN-Gold Baseline | No Reasoning (SQL Only) | 34.3% |
	| Base Student (Zero-shot) | None | 17.0% |
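
For readers unfamiliar with the metric, the following is a simplified sketch of how execution accuracy can be computed against a SQLite database. It is not the official BIRD evaluation script: the function names are hypothetical, and comparing result rows as sets (ignoring order and duplicates) is an assumption made here for brevity.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if the predicted query runs and yields the same set of rows as the gold query."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A prediction that fails to execute counts as incorrect.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Assumption: rows are compared as sets, so ordering and duplicates are ignored.
    return set(predicted_rows) == set(gold_rows)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted_sql, gold_sql) pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, predicted, gold) for predicted, gold in pairs)
    return hits / len(pairs)
```

A query with a hallucinated column name raises `sqlite3.Error` and is scored as a miss, which is why schema hallucinations directly lower EX.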

	---
	## Methodology

	The model was trained on a curated dataset of 1,000 samples generated by GPT-4o. The training data consists of:
	1. Input: Natural Language Question + Database Schema.
	2. Output: A structured Query Execution Plan (Reasoning) + Final SQL Query.

	By forcing the model to explicitly plan the query execution (e.g., "Scan Table", "Filter by...", "Join with..."), the model learns the logical structure of SQL generation rather than just memorizing patterns.
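
To make the input/output pairing concrete, here is a minimal sketch of how one training sample might be laid out. The field names, the prompt layout, and the `Final SQL:` marker are all assumptions for illustration; the actual serialization used to train Struct-SQL is not specified in this card.

```python
from dataclasses import dataclass


@dataclass
class DistillationSample:
    """Hypothetical container for one teacher-generated sample."""
    question: str          # natural language question
    schema: str            # database schema, e.g. CREATE TABLE statements
    query_plan: list[str]  # ordered plan steps such as "Scan Table ..."
    sql: str               # final SQL query


def to_training_pair(sample):
    """Build (prompt, target) strings; the layout is an assumed format."""
    prompt = f"Schema:\n{sample.schema}\n\nQuestion: {sample.question}\n"
    plan = "\n".join(f"{i}. {step}" for i, step in enumerate(sample.query_plan, start=1))
    target = f"Query plan:\n{plan}\n\nFinal SQL:\n{sample.sql}"
    return prompt, target


def extract_sql(generated, marker="Final SQL:"):
    """Return the text after the last marker, or None if the marker is absent."""
    _, found, tail = generated.rpartition(marker)
    return tail.strip() if found else None
```

Keeping the plan and the SQL in one target string means the student is trained to emit the plan first, and `extract_sql` recovers only the executable part at inference time.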

	---
	## Usage

	You can use this model with the `transformers` library. To elicit the query plan before the final SQL, the input must follow a specific system prompt or structure.

	---
	```python
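	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_id = "craterlabs/Struct-SQL"

	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForCausalLM.from_pretrained(
	    model_id,
	    torch_dtype=torch.float16,  # half precision to reduce memory use
	    device_map="auto",          # place weights on the available device(s)
	)

	# Illustrative prompt only: the exact prompt format used in training
	# is not documented in this card, so this layout is an assumption.
	prompt = (
	    "Database schema:\n"
	    "CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);\n\n"
	    "Question: How many singers are older than 30?\n"
	)

	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
	# A generous token budget leaves room for the query plan plus the SQL.
	outputs = model.generate(**inputs, max_new_tokens=1200)
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))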
	```
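
The usage snippet needs a `prompt` string that carries the database schema and the question. As a rough sketch, the schema text can be read straight from a SQLite database via its `sqlite_master` table. The helper names and the prompt layout below are assumptions, not the format the model was trained on.

```python
import sqlite3


def sqlite_schema(conn):
    """Return the CREATE TABLE statements of all user tables, one per line."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' AND sql IS NOT NULL "
        "ORDER BY name"
    ).fetchall()
    return "\n".join(f"{ddl};" for (ddl,) in rows)


def build_prompt(schema, question):
    """Combine schema and question; this layout is an assumed format."""
    return f"Database schema:\n{schema}\n\nQuestion: {question}\n"
```

For example, `build_prompt(sqlite_schema(sqlite3.connect("my.db")), "How many singers are older than 30?")` would produce a prompt for a hypothetical `my.db` file.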
	---
	## Intended Use

	Struct-SQL-4B is intended for research and academic use in tasks involving Text-to-SQL generation and semantic parsing over relational databases. The model is particularly suited for studying:

	- Knowledge distillation techniques that leverage structured intermediate representations
	- Explicit query planning as an alternative to unstructured chain-of-thought reasoning
	- Error reduction in SQL generation, including syntactic validity and schema grounding
	- Compact language models for complex reasoning under limited parameter budgets

	The model is not optimized for direct deployment in production database systems without additional validation and safety constraints.
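
As one example of such additional validation, the sketch below checks a generated query before it is ever run: it allows only a single `SELECT`/`WITH` statement and asks SQLite to compile it with `EXPLAIN QUERY PLAN`, which surfaces syntax errors and unknown tables or columns without executing the query. The function name and the specific rules are assumptions, and this is a starting point rather than a complete safety layer.

```python
import sqlite3


def check_generated_sql(conn, sql):
    """Return (ok, message) for a model-generated query without running it."""
    statement = sql.strip().rstrip(";").strip()
    # Cheap first filter; it does not replace a read-only connection.
    if not statement.lower().startswith(("select", "with")):
        return False, "only SELECT queries are allowed"
    # Conservative: also rejects semicolons that appear inside string literals.
    if ";" in statement:
        return False, "multiple statements are not allowed"
    try:
        # Compiles the statement, so bad syntax or unknown columns raise here.
        conn.execute(f"EXPLAIN QUERY PLAN {statement}")
    except sqlite3.Error as exc:
        return False, f"rejected by SQLite: {exc}"
    return True, "ok"
```

Opening the database with `sqlite3.connect("file:path/to.db?mode=ro", uri=True)` adds a second layer of protection, since a read-only connection refuses writes even if a query slips past the checks.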

	---
	## Limitations

	- Evaluation is confined to the SQLite-based BIRD benchmark
	- The model may generate logically plausible but incorrect SQL for highly complex multi-hop queries

	---
	## Citation

	```bibtex
	@article{thaker2025knowledge,
	title={Knowledge Distillation with Structured Chain-of-Thought for Text-to-SQL},
	author={Thaker, Khushboo and Bresler, Yony},
	journal={arXiv preprint arXiv:2512.17053},
	year={2025}
	}
	@inproceedings{thaker2026knowledge,
	title={Struct-SQL: Distilling Structured Reasoning for Small Text-to-SQL Models},
	author={Thaker, Khushboo and Bresler, Yony},
	booktitle={Proceedings of the 39th Canadian Conference on Artificial Intelligence},
	year={2026},
	note={To appear}
	}
	```