Two Heads Are Better Than One: Dual-Model Verbal Reflection at Inference-Time
J Li
jiazhengli
AI & ML interests
AI for Education
Recent Activity
Authored, commented on, and upvoted a paper about 1 month ago: SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space
MCTS with Preference Optimisation
Resources for EMNLP 2024 Paper: Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring
- Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring
  Paper • 2406.19949 • Published • 1
- jiazhengli/Rationale_MCTS
  Viewer • Updated • 8.71k • 42 • 2
- jiazhengli/Synthetic_Rationale
  Viewer • Updated • 32.9k • 66 • 1
- jiazhengli/deberta-v3-large-Rationale-to-Score
  Text Classification • 0.4B • Updated • 8 • 1
AERA
Resources for EMNLP 2023 Paper: Distilling ChatGPT for Explainable Automated Student Answer Assessment
RoleMRC
A Fine-Grained Composite Benchmark for Role-Playing and Instruction-Following
SamPO
Resources for EMNLP 2024 Paper: Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence
- Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence
  Paper • 2406.10957 • Published • 2
- jiazhengli/Pythia-2.8B-HH-RLHF-Iterative-SamPO
  Text Generation • 3B • Updated • 11
- jiazhengli/Pythia-2.8B-TLDR-Iterative-SamPO
  Text Generation • 3B • Updated • 6
- Junrulu/Llama-3-8B-Instruct-Iterative-SamPO
  Text Generation • 8B • Updated • 8 • 1
DARS
Two Heads Are Better Than One: Dual-Model Verbal Reflection at Inference-Time