SOTAVerified

RAG

Retrieval-Augmented Generation (RAG) combines the strengths of retrieval-based and generation-based models. A retrieval system selects relevant documents or passages from a large corpus, and a generation model, typically a neural language model, conditions on the retrieved text to produce a response. This improves the accuracy and coherence of generated text, especially in tasks that require detailed knowledge or long-context handling.
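The retrieve-then-generate flow described above can be illustrated with a minimal sketch. It assumes a toy in-memory corpus and bag-of-words cosine similarity as the retriever; real systems typically use BM25 or dense embeddings. The names `tokenize`, `cosine`, `retrieve`, and `build_prompt` are hypothetical, and the generation step is left as a prompt string that would be passed to a language model of your choice.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    """Return up to k passages ranked by similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(doc))), i) for i, doc in enumerate(corpus)]
    # Highest score first; ties are broken by original corpus order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [corpus[i] for score, i in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Assemble the retrieved passages and the question into one prompt."""
    context = "\n".join(f"[{n}] {p}" for n, p in enumerate(passages, 1))
    return (
        "Answer using only the context.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Toy corpus (illustrative data, not from any benchmark).
corpus = [
    "The Eiffel Tower is located in Paris.",
    "Mount Everest is the highest mountain on Earth.",
    "Paris is the capital of France.",
]
question = "Where is the Eiffel Tower?"
passages = retrieve(question, corpus, k=2)
prompt = build_prompt(question, passages)  # hand this string to a generator model
```

Keeping retrieval and prompt construction as separate functions mirrors the two-stage structure of RAG: the retriever can be swapped out without touching the generation side.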

RAG is particularly useful in open-domain question answering, knowledge-grounded dialogue, and summarization tasks. The retrieval step helps the model to access and incorporate external information, making it less reliant on memorized knowledge and better suited for generating responses based on the latest or domain-specific information.

RAG systems are usually evaluated with metrics such as precision, recall, F1 score, BLEU score, and exact match. Popular datasets for evaluating RAG models include Natural Questions, MS MARCO, TriviaQA, and SQuAD.
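Two of the metrics named above, exact match and F1, can be sketched for short-answer question answering as follows. This is a minimal sketch that assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles) and token-level overlap for F1; individual benchmarks may define these differently. The function names are hypothetical.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction, gold):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, the prediction "in Paris France" against the gold answer "Paris" scores 0 on exact match but 0.5 on token F1 (precision 1/3, recall 1), which is why the two metrics are commonly reported together.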

Title	Date	Tasks	Status	Hype
ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems	Nov 16, 2023	RAG, Retrieval	Code Available	2
Benchmarking Large Language Models in Retrieval-Augmented Generation	Sep 4, 2023	Benchmarking, counterfactual	Code Available	2
Huatuo-26M, a Large-scale Chinese Medical QA Dataset	May 2, 2023	Language Modeling, Language Modelling	Code Available	2
SimpleDoc: Multi-Modal Document Understanding with Dual-Cue Page Retrieval and Iterative Refinement	Jun 16, 2025	document understanding, Question Answering	Code Available	1
Constructing and Evaluating Declarative RAG Pipelines in PyTerrier	Jun 12, 2025	Natural Questions, RAG	Code Available	1
FaithfulRAG: Fact-Level Conflict Modeling for Context-Faithful Retrieval-Augmented Generation	Jun 10, 2025	RAG, Retrieval	Code Available	1
DRAGged into Conflicts: Detecting and Addressing Conflicting Sources in Search-Augmented LLMs	Jun 10, 2025	RAG, Retrieval-augmented Generation	Code Available	1
SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design	Jun 9, 2025	Code Generation, RAG	Code Available	1
Joint-GCG: Unified Gradient-Based Poisoning Attacks on Retrieval-Augmented Generation Systems	Jun 6, 2025	RAG, Retrieval	Code Available	1
LotusFilter: Fast Diverse Nearest Neighbor Search via a Learned Cutoff Table	Jun 5, 2025	RAG	Code Available	1
