metadata
title: Zen VL Training
emoji: 🧘
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: apache-2.0
hardware: a10g-large
🧘 Zen VL Training Space
Train zen-vl vision-language models with combined ADP+xLAM datasets on HuggingFace Pro GPUs.
Features
- Multi-Size Support: Train 4B, 8B, or 30B parameter models
- GPU Options: A10G (24GB), A100-Large (40GB), A100 (80GB)
- Combined Datasets: Agent Data Protocol (ADP) + xLAM Function Calling
- Auto-Upload: Trained models automatically uploaded to HuggingFace Hub
- Real-time Monitoring: Live training logs and progress tracking
Datasets
Agent Data Protocol (ADP)
- Source: neulab/agent-data-collection
- Size: ~220k agent trajectories (8.4GB)
- Citation: arXiv:2510.24702
xLAM Function Calling 60k
- Source: Salesforce/xlam-function-calling-60k
- Size: 60k function calling examples (101MB)
- Citation: Salesforce Research
Training Configuration
4B Model (A10G - 24GB)
- Batch size: 1
- Gradient accumulation: 8
- Max samples: 30,000
- Estimated time: 6-8 hours
8B Model (A100-Large - 40GB)
- Batch size: 2
- Gradient accumulation: 8
- Max samples: 50,000
- Estimated time: 10-12 hours
30B Model (A100 - 80GB)
- Batch size: 4
- Gradient accumulation: 8
- Max samples: 100,000
- Estimated time: 20-24 hours
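The three presets above can be compared by their effective batch size (batch size times gradient accumulation). The sketch below is illustrative only: the `TrainingPreset` class and `PRESETS` table are hypothetical names, not taken from `app.py`, and the step count assumes a single GPU and a single pass over the capped sample count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingPreset:
    """Hypothetical container for one row of the training configuration."""
    gpu: str
    batch_size: int
    grad_accum: int
    max_samples: int

    @property
    def effective_batch_size(self) -> int:
        # Samples consumed per optimizer step (single-GPU assumption).
        return self.batch_size * self.grad_accum

    @property
    def optimizer_steps(self) -> int:
        # Ceiling division: one pass over max_samples (assumption).
        return -(-self.max_samples // self.effective_batch_size)


# Values copied from the configuration listed above.
PRESETS = {
    "4b": TrainingPreset("a10g", 1, 8, 30_000),
    "8b": TrainingPreset("a100-large", 2, 8, 50_000),
    "30b": TrainingPreset("a100", 4, 8, 100_000),
}

for size, preset in PRESETS.items():
    print(size, preset.effective_batch_size, preset.optimizer_steps)
```

Under these assumptions the 4B preset takes 3,750 optimizer steps and the 8B and 30B presets take 3,125 each.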
Usage
- Select model size (4b, 8b, or 30b)
- Choose GPU type (a10g, a100-large, or a100)
- Click "Start Training"
- Monitor progress in real-time
- The trained model uploads automatically to zenlm/zen-vl-{size}-agent
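As a rough illustration of how the selections in the steps above map to an upload target, the following sketch validates the two choices and fills in the repository name pattern. The function name `output_repo` and the validation logic are assumptions for illustration; the Space's actual code may differ.

```python
# Allowed choices, as listed in the usage steps.
VALID_SIZES = ("4b", "8b", "30b")
VALID_GPUS = ("a10g", "a100-large", "a100")


def output_repo(size: str, gpu: str = "a10g") -> str:
    """Return the Hub repository a trained model would be uploaded to.

    Hypothetical helper: it only checks the inputs against the lists
    above and fills in the zenlm/zen-vl-{size}-agent pattern.
    """
    size = size.strip().lower()
    gpu = gpu.strip().lower()
    if size not in VALID_SIZES:
        raise ValueError(f"unknown model size {size!r}; expected one of {VALID_SIZES}")
    if gpu not in VALID_GPUS:
        raise ValueError(f"unknown GPU type {gpu!r}; expected one of {VALID_GPUS}")
    return f"zenlm/zen-vl-{size}-agent"


print(output_repo("8b", "a100-large"))
```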
Requirements
- HuggingFace Pro account (for GPU access)
- HF_TOKEN environment variable set
- Write access to zenlm organization
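A missing token is cheapest to catch before a multi-hour run starts. The sketch below shows one way to fail early when HF_TOKEN is unset; `require_hf_token` is a hypothetical helper name, and it checks only that the variable is present, not that the token has write access to the zenlm organization.

```python
import os


def require_hf_token(env=None) -> str:
    """Return the HF_TOKEN value or raise if it is missing or blank.

    `env` defaults to os.environ; passing a plain dict makes the
    check easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; add it as a secret in the Space settings "
            "before starting training."
        )
    return token
```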
Output Models
Trained models will be uploaded to:
- zenlm/zen-vl-4b-agent
- zenlm/zen-vl-8b-agent
- zenlm/zen-vl-30b-agent
Technical Details
- Base Architecture: Qwen3-VL
- Training Method: Supervised Fine-Tuning (SFT)
- Data Mixture: 80% ADP, 20% xLAM
- Precision: bfloat16
- Framework: Transformers + Accelerate
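To make the 80/20 data mixture concrete, the sketch below splits a sample budget between the two datasets. It assumes the mixture is applied to the per-size "Max samples" cap; that interpretation and the `mixture_counts` helper are illustrative, not confirmed by the Space's code.

```python
def mixture_counts(max_samples: int, adp_percent: int = 80) -> dict:
    """Split a sample budget between ADP and xLAM.

    Integer arithmetic avoids floating-point rounding; any remainder
    goes to xLAM so the two counts always sum to max_samples.
    """
    if max_samples < 0 or not 0 <= adp_percent <= 100:
        raise ValueError("max_samples must be >= 0 and adp_percent in [0, 100]")
    adp = max_samples * adp_percent // 100
    return {"adp": adp, "xlam": max_samples - adp}


# Sample budgets taken from the training configuration.
for size, budget in {"4b": 30_000, "8b": 50_000, "30b": 100_000}.items():
    print(size, mixture_counts(budget))
```

Under this reading, the largest preset would draw 20,000 xLAM examples, which fits within the 60k available.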
License
Apache 2.0 - See LICENSE
Citation
@software{zen-vl-2025,
title={Zen VL: Vision-Language Models with Function Calling},
author={Zen AI Team},
year={2025},
url={https://github.com/zenlm/zen-vl}
}
Links
- Website: https://zenlm.org
- GitHub: https://github.com/zenlm/zen-vl
- Models: https://huggingface.co/zenlm
- Paper: Coming soon