Lambent/ScenarioPlayTest-100Steps-Qwen3-4B-adapter

QLoRA adapter for Qwen3-4B, trained as a playtest of the first draft of an RLVR environment conceptualized by Mira.

The environment focuses on one-shot roleplaying scenarios, split evenly between silly and serious, and covers both narrative and problem-solving tasks.

100 steps, cosine decay, batch size 4, learning rate 1e-5, rank 16, alpha 32.
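
To make the hyperparameters above concrete, here is a minimal sketch in plain Python of what they imply. It is not the training script used for this adapter. The assumptions are that the cosine schedule has no warmup and decays to zero, that LoRA uses the standard `alpha / rank` scaling, and that the batch size of 4 is the effective batch size per step. The names `HPARAMS`, `cosine_lr`, `lora_scale`, and `examples_seen` are hypothetical and exist only for illustration.

```python
import math

# Hyperparameters as stated in the card.
HPARAMS = {
    "max_steps": 100,
    "batch_size": 4,
    "learning_rate": 1e-5,
    "lora_rank": 16,
    "lora_alpha": 32,
}


def cosine_lr(step, peak_lr=HPARAMS["learning_rate"], total_steps=HPARAMS["max_steps"]):
    """Learning rate at `step` under cosine decay.

    Assumption: no warmup, and the rate decays from peak_lr to 0.
    """
    # Clamp the step so out-of-range values do not leave the schedule.
    progress = min(max(step, 0), total_steps) / total_steps
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def lora_scale(rank=HPARAMS["lora_rank"], alpha=HPARAMS["lora_alpha"]):
    """Scaling applied to the low-rank update, assuming standard LoRA (alpha / rank)."""
    return alpha / rank


def examples_seen(steps=HPARAMS["max_steps"], batch_size=HPARAMS["batch_size"]):
    """Total examples consumed, assuming batch_size is the effective batch per step."""
    return steps * batch_size


if __name__ == "__main__":
    for step in (0, 25, 50, 75, 100):
        print(f"step {step:3d}: lr = {cosine_lr(step):.3e}")
    print("LoRA scale:", lora_scale())
    print("examples seen:", examples_seen())
```

Under these assumptions the low-rank update is scaled by 2 and the run consumes 400 examples in total, which is consistent with this being a short playtest run.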