SelectiveDPO - a glorgao Collection

SelectiveDPO

updated Mar 2

Released models trained with Selective DPO.

glorgao/SelectiveDPO-Gemma2-9B-SFT-UFBinarized

Text Generation • 9B • Updated May 15, 2025 • 6
glorgao/SelectiveDPO-Llama3-8B-SFT-UFBinarized

Text Generation • 8B • Updated May 15, 2025 • 3 • 1
glorgao/SelectiveDPO-Qwen2.5-7B-SFT-UFBinarized

Text Generation • 7B • Updated May 15, 2025 • 2 • 1
glorgao/SelectiveDPO-Mistral-7B-SFT-UFBinarized

Text Generation • 7B • Updated May 15, 2025 • 3
Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples

Paper • 2502.09650 • Published Feb 11, 2025