SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space Paper • 2511.20102 • Published Nov 25, 2025 • 27
The Smol Training Playbook 📚 — The secrets to building world-class LLMs
Two Heads Are Better Than One: Dual-Model Verbal Reflection at Inference-Time Paper • 2502.19230 • Published Feb 26, 2025 • 2
EnigmaToM: Improve LLMs' Theory-of-Mind Reasoning Capabilities with Neural Knowledge Base of Entity States Paper • 2503.03340 • Published Mar 5, 2025 • 1
DARS Collection Two Heads Are Better Than One: Dual-Model Verbal Reflection at Inference-Time • 4 items • Updated Oct 22, 2025
Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States Paper • 2510.11052 • Published Oct 13, 2025 • 51
Fine-Tuning on Noisy Instructions: Effects on Generalization and Performance Paper • 2510.03528 • Published Oct 3, 2025 • 17
IntrEx: A Dataset for Modeling Engagement in Educational Conversations Paper • 2509.06652 • Published Sep 8, 2025 • 24
FineWeb: decanting the web for the finest text data at scale 🍷 — Generate high-quality text data for LLMs using FineWeb