In a Training Loop 🔄

Peng Wang

stillarrow

https://peter-peng-w.github.io/

AI & ML interests

None yet

Recent Activity

updated a model about 2 hours ago

stillarrow/qwen2.5-coder-1.5b-instruct__scpo_no_std_code_hidden_only_shortcut_guard

upvoted a paper about 4 hours ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

updated a model 1 day ago

stillarrow/qwen2.5-coder-1.5b-instruct__grpo_no_std_code_hidden_only_shortcut_guard

Organizations

None yet

stillarrow's datasets (1)

stillarrow/MATH

Updated Sep 25, 2025 • 26.5k • 38