Arabic LLM Checkpoints
Mingzhe Du
AI & ML interests
Code Generation / Preference Alignment
Recent Activity
upvoted a paper 2 days ago
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning
upvoted a paper 2 days ago
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs
updated a dataset 5 days ago
Elfsong/Qwen3_4B_Arabic_200-responses-Syrian