---
title: Zen VL Training
emoji: 🧘
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: apache-2.0
hardware: a10g-large
---

# 🧘 Zen VL Training Space

Train zen-vl vision-language models on the combined ADP + xLAM datasets using HuggingFace Pro GPUs.

## Features

- **Multi-Size Support**: Train 4B, 8B, or 30B parameter models
- **GPU Options**: A10G (24GB), A100-Large (40GB), A100 (80GB)
- **Combined Datasets**: Agent Data Protocol (ADP) + xLAM Function Calling
- **Auto-Upload**: Trained models are automatically uploaded to the HuggingFace Hub
- **Real-time Monitoring**: Live training logs and progress tracking

## Datasets

- Agent Data Protocol (ADP)
- xLAM Function Calling 60k
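
For local experimentation, both sources can be loaded with the `datasets` library. This is a minimal sketch: the ADP repository id below is a placeholder (substitute the actual path), and the xLAM id refers to Salesforce's public release, which is gated and therefore needs an HF token.

```python
# Sketch: load both training sources with the `datasets` library.
# NOTE: "your-org/agent-data-protocol" is a placeholder id, not a confirmed repo path;
# the xLAM dataset is gated, so `huggingface-cli login` or HF_TOKEN is required.
from datasets import load_dataset

adp = load_dataset("your-org/agent-data-protocol", split="train")           # placeholder id
xlam = load_dataset("Salesforce/xlam-function-calling-60k", split="train")  # gated dataset

print(f"ADP examples: {len(adp)}, xLAM examples: {len(xlam)}")
```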

## Training Configuration

### 4B Model (A10G - 24GB)

- Batch size: 1
- Gradient accumulation: 8
- Max samples: 30,000
- Estimated time: 6-8 hours

### 8B Model (A100-Large - 40GB)

- Batch size: 2
- Gradient accumulation: 8
- Max samples: 50,000
- Estimated time: 10-12 hours

### 30B Model (A100 - 80GB)

- Batch size: 4
- Gradient accumulation: 8
- Max samples: 100,000
- Estimated time: 20-24 hours
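
The three presets above boil down to a small configuration table. Here is a sketch of how they might be expressed in code; the dict keys and helper are illustrative, not the Space's actual variables:

```python
# Illustrative per-size presets mirroring the tables above.
# Names and structure are assumptions, not the Space's real configuration code.
TRAIN_PRESETS = {
    "4b":  {"gpu": "a10g",       "batch_size": 1, "grad_accum": 8, "max_samples": 30_000},
    "8b":  {"gpu": "a100-large", "batch_size": 2, "grad_accum": 8, "max_samples": 50_000},
    "30b": {"gpu": "a100",       "batch_size": 4, "grad_accum": 8, "max_samples": 100_000},
}

def effective_batch_size(size: str) -> int:
    """Per-device batch size x gradient accumulation steps (8 / 16 / 32)."""
    preset = TRAIN_PRESETS[size]
    return preset["batch_size"] * preset["grad_accum"]
```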

## Usage

1. Select a model size (4b, 8b, or 30b)
2. Choose a GPU type (a10g, a100-large, or a100)
3. Click "Start Training"
4. Monitor progress in real time
5. The trained model is automatically uploaded to `zenlm/zen-vl-{size}-agent`
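
The steps above use the Gradio UI. To trigger a run programmatically, the `gradio_client` package can call the Space; the Space id, endpoint, and argument order below are assumptions, so check the Space's "Use via API" page for the real signature.

```python
# Hypothetical programmatic call via gradio_client.
# The Space id, api_name, and argument order are assumptions -- verify them
# against the Space's "Use via API" page before relying on this.
from gradio_client import Client

client = Client("zenlm/zen-vl-training")                            # assumed Space id
result = client.predict("4b", "a10g", api_name="/start_training")  # assumed endpoint
print(result)
```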

## Requirements

- HuggingFace Pro account (for GPU access)
- `HF_TOKEN` environment variable set
- Write access to the `zenlm` organization
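
Before launching a run, you can confirm the token and org membership are in place with stable `huggingface_hub` calls; this check is illustrative and not part of the Space itself.

```python
# Sanity-check HF_TOKEN and organization membership before training.
import os
from huggingface_hub import HfApi

token = os.environ["HF_TOKEN"]   # raises KeyError if the variable is not set
api = HfApi(token=token)

me = api.whoami()                # fails if the token is invalid
orgs = [org["name"] for org in me.get("orgs", [])]
print("Authenticated as:", me["name"])
print("zenlm membership:", "zenlm" in orgs)  # write role must still be granted in the org
```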

## Output Models

Trained models will be uploaded to:

- `zenlm/zen-vl-4b-agent`
- `zenlm/zen-vl-8b-agent`
- `zenlm/zen-vl-30b-agent`
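
Once a run finishes, the checkpoint can be fetched like any other Hub repo. A minimal sketch, assuming the target repo exists and your token has read access:

```python
# Download a finished checkpoint from the Hub for local use.
from huggingface_hub import snapshot_download

local_dir = snapshot_download("zenlm/zen-vl-4b-agent")  # swap in 8b/30b as needed
print("Checkpoint downloaded to:", local_dir)
# From here, load it with the transformers classes appropriate to Qwen3-VL.
```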

## Technical Details

- **Base Architecture**: Qwen3-VL
- **Training Method**: Supervised Fine-Tuning (SFT)
- **Data Mixture**: 80% ADP, 20% xLAM
- **Precision**: bfloat16
- **Framework**: Transformers + Accelerate
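
One way to realize the 80/20 mixture with the `datasets` library (a sketch, not necessarily how the Space builds its batches) is per-example sampling via `interleave_datasets`:

```python
# Sketch: 80% ADP / 20% xLAM mixture via per-example sampling.
# `adp` and `xlam` are the splits loaded in the Datasets section above.
from datasets import interleave_datasets

mixed = interleave_datasets(
    [adp, xlam],
    probabilities=[0.8, 0.2],           # matches the 80/20 mixture stated above
    seed=42,
    stopping_strategy="all_exhausted",  # keep sampling until both sources are used up
)
```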

## License

Apache 2.0 - See LICENSE

## Citation

```bibtex
@software{zen-vl-2025,
  title={Zen VL: Vision-Language Models with Function Calling},
  author={Zen AI Team},
  year={2025},
  url={https://github.com/zenlm/zen-vl}
}
```

## Links