koutch/qwen3-thinking-4b_train_grpo_v1_train_no_think • Text Generation • 4B • Updated 10 days ago • 84
koutch/qwen3-instruct-4b_train_grpo_v1_train_no_think • Text Generation • 4B • Updated 11 days ago • 64