Q-RAG: Long Context Multi-step Retrieval via Value-based Embedder Training
Abstract
Q-RAG enables efficient multi-step retrieval for large language models through reinforcement learning fine-tuning of embedder models, achieving state-of-the-art performance on long-context benchmarks.
Retrieval-Augmented Generation (RAG) methods enhance LLM performance by efficiently filtering relevant context for LLMs, reducing hallucinations and inference cost. However, most existing RAG methods focus on single-step retrieval, which is often insufficient for answering complex questions that require multi-step search. Recently, multi-step retrieval approaches have emerged, typically involving the fine-tuning of small LLMs to perform multi-step retrieval. This type of fine-tuning is highly resource-intensive and precludes the use of larger LLMs. In this work, we propose Q-RAG, a novel approach that fine-tunes the embedder model for multi-step retrieval using reinforcement learning (RL). Q-RAG offers a competitive, resource-efficient alternative to existing multi-step retrieval methods for open-domain question answering and achieves state-of-the-art results on the popular long-context benchmarks BABILong and RULER for contexts up to 10M tokens. Code is available at https://github.com/griver/Q-RAG
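The core idea — treating the embedder's query–document score as a value function and training it with RL so that each retrieval step leads toward the answer — can be illustrated with a toy sketch. Everything below is an assumption for illustration: the corpus, the linear "embedder" `W`, the `chain` of gold hops, and the one-step reward regression (a bandit-style simplification of value-based training) are all hypothetical and are not the paper's actual data, architecture, or training objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (hypothetical): each "document" is a random feature vector, and
# answering the query requires hopping through the chain doc0 -> doc3 -> doc4.
DIM, N_DOCS, N_STEPS = 8, 5, 3
docs = rng.normal(size=(N_DOCS, DIM))
chain = [0, 3, 4]  # gold multi-hop chain; the final doc holds the answer

# Trainable linear "embedder": Q(state, doc) = doc @ W @ state.
W = rng.normal(scale=0.1, size=(DIM, DIM))

def q_values(state, W):
    # Score every candidate document against the current retrieval state.
    return docs @ (W @ state)

def train(W, lr=0.05, eps=0.2, n_episodes=500):
    for _ in range(n_episodes):
        state = docs[chain[0]].copy()  # start from the question/seed doc
        for t in range(1, N_STEPS):
            q = q_values(state, W)
            # Epsilon-greedy action: pick the next document to retrieve.
            a = rng.integers(N_DOCS) if rng.random() < eps else int(np.argmax(q))
            reward = 1.0 if a == chain[t] else 0.0
            # Bandit-style per-hop update: regress Q(s, a) toward the reward.
            td_err = reward - q[a]
            # d/dW of Q(s, a) = docs[a] @ W @ state is outer(docs[a], state).
            W = W + lr * td_err * np.outer(docs[a], state)
            state = state + docs[a]  # fold the retrieved doc into the state
    return W

def episode(W, eps=0.0):
    # Greedy multi-step retrieval rollout with the trained embedder.
    state = docs[chain[0]].copy()
    picked = []
    for _ in range(N_STEPS - 1):
        q = q_values(state, W)
        a = rng.integers(N_DOCS) if rng.random() < eps else int(np.argmax(q))
        picked.append(a)
        state = state + docs[a]
    return picked

W = train(W)
print(episode(W))  # documents retrieved across the two hops
```

The point of the sketch is that only the small scoring model is updated by RL, while the (frozen, arbitrarily large) generator LLM would simply consume the retrieved context — which is where the resource savings over fine-tuning an LLM retriever come from.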
Community
The following similar papers were recommended by the Semantic Scholar API:
- Retrieval from Within: An Intrinsic Capability of Attention-Based Models (2026)
- Latent Abstraction for Retrieval-Augmented Generation (2026)
- OThink-SRR1: Search, Refine and Reasoning with Reinforced Learning for Large Language Models (2026)
- CoSearch: Joint Training of Reasoning and Document Ranking via Reinforcement Learning for Agentic Search (2026)
- LatentRAG: Latent Reasoning and Retrieval for Efficient Agentic RAG (2026)
- Optimizing RAG Rerankers with LLM Feedback via Reinforcement Learning (2026)
- MM-Doc-R1: Training Agents for Long Document Visual Question Answering through Multi-turn Reinforcement Learning (2026)
