Pyhe
pyhe
AI & ML interests
None yet
Organizations
None yet
Finans
Artikler
- FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
  Paper • 2506.20920 • Published • 76
- SmolVLM: Redefining small and efficient multimodal models
  Paper • 2504.05299 • Published • 203
- The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
  Paper • 2303.03915 • Published • 7
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  Paper • 2502.02737 • Published • 254
geemmma
Pythonic