stillarrow/qwen2.5-coder-1.5b-instruct__grpo_no_std_code_hidden_only_shortcut_guard Updated 41 minutes ago
stillarrow/qwen2.5-math-7b__math_subject_proportional_cluster-246fecfa-et_mix_lambda_no_drift_off_ratio_100 Updated about 4 hours ago
stillarrow/qwen2.5-math-7b__skill_accuracy_binning_max_entrop-0939fc56-policy_lambda_no_drift_off_ratio_100 Updated about 4 hours ago
stillarrow/qwen2.5-math-7b__skill_accuracy_binning_max_entrop-6bc47709-et_mix_lambda_no_drift_off_ratio_100 Updated 1 day ago • 21
stillarrow/qwen2.5-math-7b__skill_accuracy_binning_max_entrop-aabaf976-policy_lambda_no_drift_off_ratio_100 Updated 1 day ago • 18