Mehul Damani PRO

mehuldamani

https://damanimehul.github.io

AI & ML interests

Reinforcement Learning, Large Language Models

Recent Activity

updated a dataset about 1 hour ago

mehuldamani/big-math-tough

published a dataset about 1 hour ago

mehuldamani/big-math-tough

published a model about 23 hours ago

mehuldamani/rlcr_single_from_rlvr_chkpt480

Organizations

None yet

Collections 1

Papers 4

models 196

mehuldamani/rlcr_single_from_rlvr_chkpt480

Updated about 23 hours ago

mehuldamani/rlvr-base-trainRlcr-ConfMoreOne_fracAcc_entroRew_pt5Brier_pt9Temp_reasonInPrompt

Updated 1 day ago

mehuldamani/qwen3_8b_medical_rlvr_multi_k_2

Updated 2 days ago

mehuldamani/qwen3_8b_medical_rlvr_multi_k_4

Updated 2 days ago

mehuldamani/rlvr-base-train-with-rlcr-sumConfMoreOne_fractAcc_pt5Brier_pt9Temp_reasonUncertInPrompt

Updated 2 days ago

mehuldamani/rlcr-multi-from-rlvr-base-sumConfMoreThanOne_pt5Brier-pt9Temp-specifyConf_reasonUncertInPrompt

Updated 2 days ago

mehuldamani/rlcr-multi-from-rlvr-base-train-1pointBrier-point9Temp-specifyConf_reasonUncertInPrompt

Updated 3 days ago

mehuldamani/rlcr-multi-from-rlvr-base-train-point5Brier-1point4Temp-specifyConfSumLessThan1

Updated 3 days ago

mehuldamani/rlcr-multi-from-rlvr-base-train-1pointBrier-point9Temp-specifyConfSumLessThan1

Updated 3 days ago

mehuldamani/rlcr-multi-from-rlvr-base-train-point5Brier-point9Temp-specifyConfSumLessThan1

Updated 3 days ago

Collections 1

mehuldamani/big-math-digits-v2-correctness

mehuldamani/hotpot-v2-correctness-7b

mehuldamani/orm-big-math-digits-v2-correctness

mehuldamani/big-math-digits-v2-brier

datasets 50

mehuldamani/big-math-tough

mehuldamani/medTroubleshootig-rlvr-220-evaled-on-rlcr

mehuldamani/medTroubleshootig-rlvr-220-evaled-on-rlvr

mehuldamani/medDataset_25k

mehuldamani/medDataset

mehuldamani/qwen3_8b_ambigQA_rlcr_multi_analysis

mehuldamani/qwen3_8b_ambigQA_rlcr_single_passk_tryAgain

mehuldamani/ambigQA

mehuldamani/judge-new-sft-instruct

mehuldamani/judge-new-sft-base
