Predicting the Order of Upcoming Tokens Improves Language Modeling Paper • 2508.19228 • Published Aug 26, 2025 • 23
Language Surgery in Multilingual Large Language Models Paper • 2506.12450 • Published Jun 14, 2025 • 16
JavaneseHonorifics/Unggah-Ungguh-Javanese-GPT2-Classifier Text Classification • 0.1B • Updated May 26, 2025 • 5
JavaneseHonorifics/Unggah-Ungguh-Javanese-Bert-Classifier Text Classification • 0.1B • Updated May 26, 2025 • 5
JavaneseHonorifics/Unggah-Ungguh-Javanese-Distilbert-Classifier Text Classification • 67M • Updated May 26, 2025 • 5
Do Language Models Understand Honorific Systems in Javanese? Paper • 2502.20864 • Published Feb 28, 2025 • 1
Softpick: No Attention Sink, No Massive Activations with Rectified Softmax Paper • 2504.20966 • Published Apr 29, 2025 • 31
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published Mar 10, 2025 • 101