- jaygala24/Qwen3-4B-GRPO-KL-math-reasoning
  Text Generation • 4B • Updated • 1.04k
- jaygala24/Qwen3-4B-GRPO-math-reasoning
  Text Generation • 4B • Updated • 862
- jaygala24/Qwen3-4B-ReMax-math-reasoning
  Text Generation • 4B • Updated • 805
- jaygala24/Qwen3-4B-RLOO-math-reasoning
  Text Generation • 4B • Updated • 174
Jay Gala
jaygala24
AI & ML interests
Machine Learning, Natural Language Processing, Language and Vision Intersection, Fairness and Biases
Recent Activity
- updated a collection 31 minutes ago: RL post-training
- updated a model about 1 hour ago: jaygala24/Qwen3-4B-DAPO-math-reasoning
- published a model about 1 hour ago: jaygala24/Qwen3-4B-DAPO-math-reasoning