DuckDB-NSQL-7B-v0.1-mlx
This is an MLX-optimized version of motherduckdb/DuckDB-NSQL-7B-v0.1, converted for efficient inference on Apple Silicon (M1/M2/M3/M4).
Model Description
DuckDB-NSQL-7B is a 7-billion parameter language model fine-tuned for generating DuckDB SQL queries from natural language questions. This MLX conversion provides significant performance improvements on Apple Silicon Macs compared to PyTorch CPU inference.
Conversion Details
- Base Model: motherduckdb/DuckDB-NSQL-7B-v0.1
- Precision: Float16 (FP16)
- Framework: MLX
- Optimized for: Apple Silicon (M1/M2/M3/M4)
- Model Size: ~13.5 GB
- Converted by: aikhan1
Installation
```bash
pip install mlx-lm
```
Usage
Basic Inference
```python
from mlx_lm import load, generate

# Load the model
model, tokenizer = load("aikhan1/DuckDB-NSQL-7B-v0.1-mlx")

# Example schema
schema = """
CREATE TABLE hospitals (
    hospital_id BIGINT PRIMARY KEY,
    hospital_name VARCHAR,
    region VARCHAR,
    bed_capacity INTEGER
);

CREATE TABLE patients (
    patient_id BIGINT PRIMARY KEY,
    full_name VARCHAR,
    gender VARCHAR,
    date_of_birth DATE,
    region VARCHAR
);
"""

# Example question
question = "How many patients are there in each region?"

# Build prompt
prompt = f"""You are an assistant that writes valid DuckDB SQL queries.

### Schema:
{schema}

### Question:
{question}

### Response (DuckDB SQL only):"""

# Generate SQL with greedy decoding. Note: recent mlx-lm releases
# removed the `temp` keyword from generate(); on newer versions, pass
# a sampler built with mlx_lm.sample_utils.make_sampler(temp=0.0).
response = generate(model, tokenizer, prompt=prompt, max_tokens=200, temp=0.0)
print(response)
```
Using MLX Server
```bash
# Start the server
mlx_lm.server --model aikhan1/DuckDB-NSQL-7B-v0.1-mlx --port 8080
```

```bash
# In another terminal, make requests
curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "CREATE TABLE patients(...)\n\nQuestion: Count patients by region\n\nSQL:",
    "max_tokens": 200,
    "temperature": 0
  }'
```
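The same endpoint can also be called from Python using only the standard library. The sketch below mirrors the curl request above; `query_server` is a hypothetical helper name, and the response shape assumes the server's OpenAI-compatible `choices[0]["text"]` format:

```python
import json
import urllib.request


def build_payload(prompt: str, max_tokens: int = 200, temperature: float = 0.0) -> bytes:
    """Encode an OpenAI-style /v1/completions request body."""
    return json.dumps({
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }).encode("utf-8")


def query_server(prompt: str, url: str = "http://localhost:8080/v1/completions") -> str:
    """POST a prompt to a running mlx_lm.server and return the completion text."""
    req = urllib.request.Request(
        url,
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
    # OpenAI-compatible servers return the text under choices[0]["text"]
    return body["choices"][0]["text"]


# Usage (requires the server from above to be running):
# sql = query_server("CREATE TABLE patients(...)\n\nQuestion: Count patients by region\n\nSQL:")
```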
Performance Comparison
Speed
On Apple Silicon M-series chips:
| Model | M1 Pro/Max | M2/M3 Series |
|---|---|---|
| FP16 | ~30-50 tok/s | ~50-80 tok/s |
| 8-bit | ~60-120 tok/s | ~120-200 tok/s |
| 4-bit | ~90-180 tok/s | ~180-300 tok/s |
Memory Usage
- FP16 model: ~13.5 GB
- 8-bit model: ~3.5 GB (74% reduction)
- 4-bit model: ~2 GB (85% reduction)
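The reduction percentages quoted above follow directly from the model sizes; a quick check:

```python
# Approximate on-disk/in-memory sizes from the table above (GB)
fp16_gb, eight_bit_gb, four_bit_gb = 13.5, 3.5, 2.0


def reduction_pct(original: float, quantized: float) -> int:
    """Percentage of memory saved relative to the FP16 model."""
    return round((1 - quantized / original) * 100)


print(reduction_pct(fp16_gb, eight_bit_gb))  # 74
print(reduction_pct(fp16_gb, four_bit_gb))   # 85
```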
Quality
The FP16 version provides 100% accuracy relative to the original model, with no quantization loss. This is the reference version for maximum quality.
Why FP16?
The FP16 version is ideal for:
- ✅ **Maximum Accuracy**: No quantization, full model precision
- ✅ **Reference Quality**: 100% of original model capability
- ✅ **MLX Optimization**: Still faster than PyTorch CPU inference
- ✅ **Production Critical**: When accuracy is paramount
**Recommended for:** systems with sufficient memory (16 GB+ RAM) where maximum accuracy is required

**Trade-offs:** larger and slower than the quantized versions, but with no quality loss
Prompt Format
The model expects prompts in this format:
```
You are an assistant that writes valid DuckDB SQL queries.

### Schema:
CREATE TABLE table_name (
    column1 TYPE,
    column2 TYPE
);

### Question:
[Your natural language question]

### Response (DuckDB SQL only):
```
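A small helper that assembles this template (the `build_prompt` function is an illustration, not part of the model's API):

```python
def build_prompt(schema: str, question: str) -> str:
    """Assemble the prompt template the model expects."""
    return (
        "You are an assistant that writes valid DuckDB SQL queries.\n\n"
        f"### Schema:\n{schema.strip()}\n\n"
        f"### Question:\n{question.strip()}\n\n"
        "### Response (DuckDB SQL only):"
    )


schema = "CREATE TABLE patients (patient_id BIGINT, region VARCHAR);"
prompt = build_prompt(schema, "How many patients are there in each region?")
print(prompt)
```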
Limitations
- The model is trained specifically for DuckDB SQL syntax
- Complex queries may require post-processing
- The model may occasionally generate invalid SQL for complex schemas
- Best performance on well-defined schemas with clear column names
- Requires ~16GB+ RAM for comfortable inference
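Since the model can emit trailing text or markdown fences around the query, a light post-processing pass helps before executing its output. A minimal sketch (the `extract_sql` helper is an illustration, not part of the model):

```python
import re


def extract_sql(model_output: str) -> str:
    """Pull the first SQL statement out of raw model output.

    Strips markdown code fences, cuts at the next '###' section
    marker, and keeps only text up to the first semicolon.
    """
    text = model_output.strip()
    # Drop markdown code fences such as ```sql ... ```
    text = re.sub(r"```(?:sql)?", "", text)
    # Stop at the next prompt-style section marker, if any
    text = text.split("###")[0]
    # Keep only the first statement
    if ";" in text:
        text = text[: text.index(";") + 1]
    return text.strip()


raw = "```sql\nSELECT region, COUNT(*) FROM patients GROUP BY region;\n```\n### Question:"
print(extract_sql(raw))  # SELECT region, COUNT(*) FROM patients GROUP BY region;
```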
Model Versions
| Version | Size | Speed | Quality | Use Case |
|---|---|---|---|---|
| FP16 | 13.5 GB | 1x | 100% | Maximum accuracy |
| 8-bit | 3.5 GB | 2-3x | ~99% | Production (recommended) |
| 4-bit | 2 GB | 3-4x | ~97% | Resource-constrained |
Which Version Should I Use?
- FP16: You need absolute maximum accuracy and have 16GB+ RAM
- 8-bit: Best balance for production (recommended for most users) - nuxera/duckdb-nsql-7b-mlx-8bit
- 4-bit: You're running on limited hardware or need maximum speed - nuxera/duckdb-nsql-7b-mlx-4bit
License
This model inherits the Llama 2 Community License Agreement from the base model.
Citation
```bibtex
@misc{duckdb-nsql-mlx,
  title={DuckDB-NSQL-7B MLX Conversion},
  author={aikhan1},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/aikhan1/DuckDB-NSQL-7B-v0.1-mlx}}
}
```
Original model:
```bibtex
@misc{duckdb-nsql,
  title={DuckDB-NSQL-7B: Natural Language to SQL for DuckDB},
  author={MotherDuck},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/motherduckdb/DuckDB-NSQL-7B-v0.1}}
}
```
Acknowledgments
- Original model by MotherDuck
- MLX framework by Apple ML Research
- Converted using mlx-lm
- Nuxera AI
Model tree for Nuxera/duckdb-nsql-7b-mlx
- Base model: meta-llama/Llama-2-7b