AgentPRMBench

community

Recent Activity

salmannyu authored a paper about 11 hours ago

When Can LLMs Learn to Reason with Weak Supervision?

Paper • 2604.18574 • Published 21 days ago • 25

salmannyu submitted a paper to Daily Papers 20 days ago

When Can LLMs Learn to Reason with Weak Supervision?

Paper • 2604.18574 • Published 21 days ago • 25

salmannyu submitted a paper to Daily Papers 5 months ago

SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning

Paper • 2512.03244 • Published Dec 2, 2025 • 17

authored a paper 10 months ago

ModelCitizens: Representing Community Voices in Online Safety

Paper • 2507.05455 • Published Jul 7, 2025 • 5

authored a paper about 1 year ago

X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents

Paper • 2504.13203 • Published Apr 15, 2025 • 35

authored 2 papers about 1 year ago

X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents

Paper • 2504.13203 • Published Apr 15, 2025 • 35

MOSAIC: Modeling Social AI for Content Dissemination and Regulation in Multi-Agent Simulations

Paper • 2504.07830 • Published Apr 10, 2025 • 18

authored 3 papers about 1 year ago

Multimodal Clinical Pseudo-notes for Emergency Department Prediction Tasks using Multiple Embedding Model for EHR (MEME)

Paper • 2402.00160 • Published Jan 31, 2024

FEET: A Framework for Evaluating Embedding Techniques

Paper • 2411.01322 • Published Nov 2, 2024

Clinical ModernBERT: An efficient and long context encoder for biomedical text

Paper • 2504.03964 • Published Apr 4, 2025 • 5

authored 2 papers about 2 years ago

Generalization in Healthcare AI: Evaluation of a Clinical Large Language Model

Paper • 2402.10965 • Published Feb 14, 2024 • 1

Understanding Disparities in Post Hoc Machine Learning Explanation

Paper • 2401.14539 • Published Jan 25, 2024

authored 4 papers over 2 years ago

Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 77

A Framework for Automated Measurement of Responsible AI Harms in Generative AI Applications

Paper • 2310.17750 • Published Oct 26, 2023 • 9

Evaluating Cognitive Maps and Planning in Large Language Models with CogEval

Paper • 2309.15129 • Published Sep 25, 2023 • 7

Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models

Paper • 2309.15098 • Published Sep 26, 2023 • 7