stillarrow/qwen2.5-coder-1.5b-instruct__grpo_no_std_code_hidden_only_shortcut_guard Updated about 13 hours ago
stillarrow/qwen2.5-coder-1.5b-instruct__scpo_no_std_code_hidden_only_shortcut_guard Updated about 3 hours ago • 1
stillarrow/qwen2.5-coder-1.5b-instruct__jspo_no_std_code_hidden_only_shortcut_guard Updated about 6 hours ago
stillarrow/qwen2.5-math-7b__math_subject_proportional_cluster-246fecfa-et_mix_lambda_no_drift_off_ratio_100 Updated about 17 hours ago • 51
stillarrow/qwen2.5-math-7b__skill_accuracy_binning_max_entrop-0939fc56-policy_lambda_no_drift_off_ratio_100 Updated about 17 hours ago • 49
Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning Paper • 2602.10090 • Published Feb 10 • 53
stillarrow/qwen2.5-math-7b__skill_accuracy_binning_max_entrop-6bc47709-et_mix_lambda_no_drift_off_ratio_100 Updated 1 day ago • 54
stillarrow/qwen2.5-math-7b__skill_accuracy_binning_max_entrop-aabaf976-policy_lambda_no_drift_off_ratio_100 Updated 1 day ago • 40
nvidia/llama-nv-embed-reasoning-3b Feature Extraction • 3B • Updated 27 days ago • 2.21k • 18
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published 22 days ago • 117