Weight-sparse transformers have interpretable circuits Paper • 2511.13653 • Published Nov 17, 2025 • 2
Reasoning at the Edge Collection This collection traces the mathematical and empirical limits of machine reasoning. • 12 items • Updated 27 days ago • 1
Article Project SmolMoE-8x135M: From Zero to a Custom Mixture-of-Experts Model Aug 7, 2025 • 3
Co-training and Co-distillation for Quality Improvement and Compression of Language Models Paper • 2311.02849 • Published Nov 6, 2023 • 8
Small Models Struggle to Learn from Strong Reasoners Paper • 2502.12143 • Published Feb 17, 2025 • 39