DuckDB-NSQL-7B-v0.1-mlx

This is an MLX-optimized version of motherduckdb/DuckDB-NSQL-7B-v0.1, converted for efficient inference on Apple Silicon (M1/M2/M3/M4).

Model Description

DuckDB-NSQL-7B is a 7-billion parameter language model fine-tuned for generating DuckDB SQL queries from natural language questions. This MLX conversion provides significant performance improvements on Apple Silicon Macs compared to PyTorch CPU inference.

Conversion Details

  • Base Model: motherduckdb/DuckDB-NSQL-7B-v0.1
  • Precision: Float16 (FP16)
  • Framework: MLX
  • Optimized for: Apple Silicon (M1/M2/M3/M4)
  • Model Size: ~13.5 GB
  • Converted by: aikhan1

Installation

pip install mlx-lm

Usage

Basic Inference

from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

# Load the model
model, tokenizer = load("aikhan1/DuckDB-NSQL-7B-v0.1-mlx")

# Example schema
schema = """
CREATE TABLE hospitals (
    hospital_id BIGINT PRIMARY KEY,
    hospital_name VARCHAR,
    region VARCHAR,
    bed_capacity INTEGER
);

CREATE TABLE patients (
    patient_id BIGINT PRIMARY KEY,
    full_name VARCHAR,
    gender VARCHAR,
    date_of_birth DATE,
    region VARCHAR
);
"""

# Example question
question = "How many patients are there in each region?"

# Build prompt
prompt = f"""You are an assistant that writes valid DuckDB SQL queries.

### Schema:
{schema}

### Question:
{question}

### Response (DuckDB SQL only):"""

# Generate SQL (temperature 0.0 = greedy, deterministic decoding)
# Note: recent mlx-lm versions take a sampler; older versions accepted temp=0.0 directly.
sampler = make_sampler(temp=0.0)
response = generate(model, tokenizer, prompt=prompt, max_tokens=200, sampler=sampler)
print(response)
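
With a temperature of 0.0, decoding is deterministic. For the example question above, a correct completion looks something like SELECT region, count(*) FROM patients GROUP BY region; (an illustration of the expected shape, not guaranteed output).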

Using MLX Server

# Start the server
mlx_lm.server --model aikhan1/DuckDB-NSQL-7B-v0.1-mlx --port 8080

# In another terminal, make requests
curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "CREATE TABLE patients(...)\n\nQuestion: Count patients by region\n\nSQL:",
    "max_tokens": 200,
    "temperature": 0
  }'
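
The server exposes an OpenAI-compatible completions endpoint, so it can also be called from Python. Below is a minimal client sketch using only the standard library; it assumes the server from the snippet above is running locally on port 8080 and that responses follow the OpenAI completions shape (choices[0].text):

import json
import urllib.request

payload = {
    "prompt": "CREATE TABLE patients(...)\n\nQuestion: Count patients by region\n\nSQL:",
    "max_tokens": 200,
    "temperature": 0,
}

req = urllib.request.Request(
    "http://localhost:8080/v1/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# Extract the generated SQL from the OpenAI-style response
print(result["choices"][0]["text"])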

Performance Comparison

Speed

On Apple Silicon M-series chips:

Precision   M1 Pro/Max        M2/M3 Series
FP16        ~30-50 tok/s      ~50-80 tok/s
8-bit       ~60-120 tok/s     ~120-200 tok/s
4-bit       ~90-180 tok/s     ~180-300 tok/s
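
These figures are rough estimates; throughput varies with context length and system load. To measure it on your own machine, pass verbose=True to generate (reusing model, tokenizer, and prompt from the Basic Inference example), which prints prompt and generation tokens-per-second in recent mlx-lm versions:

# Prints generation statistics (tokens/sec) alongside the output
response = generate(model, tokenizer, prompt=prompt, max_tokens=200, verbose=True)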

Memory Usage

  • FP16 model: ~13.5 GB
  • 8-bit model: ~7 GB (~50% reduction)
  • 4-bit model: ~4 GB (~70% reduction)
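
The quantized variants referenced above can be produced from the base model with mlx-lm's conversion CLI. A sketch with hypothetical output paths; verify the flags against your installed mlx-lm version:

# 8-bit conversion
mlx_lm.convert --hf-path motherduckdb/DuckDB-NSQL-7B-v0.1 -q --q-bits 8 --mlx-path ./duckdb-nsql-8bit

# 4-bit conversion
mlx_lm.convert --hf-path motherduckdb/DuckDB-NSQL-7B-v0.1 -q --q-bits 4 --mlx-path ./duckdb-nsql-4bit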

Quality

Because no quantization is applied, the FP16 conversion preserves the original model's output quality. It is the reference version when maximum accuracy matters.

Why FP16?

The FP16 version is ideal for:

✅ Maximum Accuracy: No quantization, full model precision
✅ Reference Quality: 100% of original model capability
✅ MLX Optimization: Still faster than PyTorch CPU
✅ Production Critical: When accuracy is paramount

Recommended for: systems with 16GB+ RAM where maximum accuracy is required.

Trade-offs: larger and slower than the quantized versions, but with no quality loss.

Prompt Format

The model expects prompts in this format:

You are an assistant that writes valid DuckDB SQL queries.

### Schema:
CREATE TABLE table_name (
  column1 TYPE,
  column2 TYPE
);

### Question:
[Your natural language question]

### Response (DuckDB SQL only):
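
If you build many prompts, it is convenient to assemble this format programmatically. The build_prompt helper below is a hypothetical sketch, not part of mlx-lm or the original model release:

def build_prompt(schema: str, question: str) -> str:
    """Assemble a prompt in the format the model expects."""
    return (
        "You are an assistant that writes valid DuckDB SQL queries.\n\n"
        f"### Schema:\n{schema}\n\n"
        f"### Question:\n{question}\n\n"
        "### Response (DuckDB SQL only):"
    )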

Limitations

  • The model is trained specifically for DuckDB SQL syntax
  • Complex queries may require post-processing
  • The model may occasionally generate invalid SQL for complex schemas (see the validation sketch after this list)
  • Best performance on well-defined schemas with clear column names
  • Requires ~16GB+ RAM for comfortable inference
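
Since generated SQL is occasionally invalid, one cheap safeguard is to let DuckDB itself plan the query before running it. A minimal sketch (requires pip install duckdb; is_valid_sql is a hypothetical helper):

import duckdb

def is_valid_sql(schema: str, sql: str) -> bool:
    """Return True if DuckDB can plan the generated query against the schema."""
    con = duckdb.connect()  # throwaway in-memory database
    try:
        # Naive statement split; adequate for simple CREATE TABLE schemas
        for stmt in schema.split(";"):
            if stmt.strip():
                con.execute(stmt)
        con.execute(f"EXPLAIN {sql}")  # plan the query without executing it
        return True
    except duckdb.Error:
        return False
    finally:
        con.close()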

Model Versions

Version   Size       Speed   Quality   Use Case
FP16      13.5 GB    1x      100%      Maximum accuracy
8-bit     ~7 GB      2-3x    ~99%      Production (recommended)
4-bit     ~4 GB      3-4x    ~97%      Resource-constrained

Which Version Should I Use?

  • FP16: when you have 16GB+ RAM and accuracy is paramount
  • 8-bit: the recommended production default, with near-full quality at roughly half the memory
  • 4-bit: for memory-constrained machines where speed and footprint outweigh the small quality drop

License

This model inherits the Llama 2 Community License Agreement from the base model.

Citation

@misc{duckdb-nsql-mlx,
  title={DuckDB-NSQL-7B MLX Conversion},
  author={aikhan1},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/aikhan1/DuckDB-NSQL-7B-v0.1-mlx}}
}

Original model:

@misc{duckdb-nsql,
  title={DuckDB-NSQL-7B: Natural Language to SQL for DuckDB},
  author={MotherDuck},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/motherduckdb/DuckDB-NSQL-7B-v0.1}}
}

Acknowledgments

Thanks to MotherDuck for releasing the original DuckDB-NSQL-7B model, and to Apple's MLX team for the framework that makes efficient Apple Silicon inference possible.
