magibu's Collections
Pretrain Datasets
Updated about 8 hours ago
Datasets we use for pretraining large language models
omarkamali/wikipedia-monthly • Updated 8 days ago • 181M • 15.9k • 45
alibayram/hukuk_soru_cevap • Updated Nov 6, 2024 • 2.08k • 91 • 12
umutertugrul/turkish-hospital-medical-articles • Updated Oct 2, 2025 • 24.6k • 207 • 6
umutertugrul/turkish-medical-articles • Updated Oct 2, 2025 • 42.8k • 53 • 3
alibayram/tr-books • Updated 17 days ago • 3.7k • 32
selimfirat/bilkent-turkish-writings-dataset • Updated May 24, 2025 • 25.1k • 166 • 8
umutertugrul/turkish-academic-theses-dataset • Updated Aug 18, 2025 • 649k • 50 • 8
alibayram/onedio_haberler • Updated Jun 18, 2024 • 66.7k • 5 • 5
habanoz/news-tr-1.8M • Updated Oct 6, 2024 • 1.85M • 369 • 7
alibayram/hepsiburada_yorumlar • Updated Jun 18, 2024 • 2.66M • 70 • 13
alibayram/kitapyurdu_yorumlar • Updated Jun 18, 2024 • 405k • 25
alibayram/beyazperde_yorumlar • Updated Jun 18, 2024 • 192k • 21 • 5
BILGEM-AI/BILGE-Synthetic-Stories • Updated Nov 20, 2025 • 2.87M • 116 • 4