Clément Castellon (Clemspace)

AI & ML interests: Reinforcement learning, Neural Architecture Search, Transformers

Organizations: none listed
bangers?
Remember...
- Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation
  Paper • 2412.06531 • Published • 72
- Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents
  Paper • 2309.17207 • Published
- Titans: Learning to Memorize at Test Time
  Paper • 2501.00663 • Published • 28
- Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
  Paper • 2502.11089 • Published • 166
Bangers 2025
- ATLAS: Adaptive Transfer Scaling Laws for Multilingual Pretraining, Finetuning, and Decoding the Curse of Multilinguality
  Paper • 2510.22037 • Published • 19
- Less is More: Recursive Reasoning with Tiny Networks
  Paper • 2510.04871 • Published • 500
- The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
  Paper • 2509.26507 • Published • 537
- Scaling Language-Centric Omnimodal Representation Learning
  Paper • 2510.11693 • Published • 100