SOTAVerified

RAG

Retrieval-Augmented Generation (RAG) combines a retrieval component with a generative model. A retriever selects relevant documents or passages from a large corpus, and a generator, typically a neural language model, conditions on the retrieved text to produce a response. Grounding generation in retrieved passages tends to improve the accuracy and coherence of the output, especially on tasks that require detailed knowledge or long-context handling.
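The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes a toy in-memory corpus, uses bag-of-words cosine similarity in place of a learned retriever, and takes the generator as a caller-supplied function. The names `retrieve`, `build_prompt`, `answer`, and `generate` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, corpus, k=2):
    """Return up to k passages ranked by similarity to the query.

    Assumption: a real system would use a sparse or dense index
    instead of scoring every passage on each call.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(doc))), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Place the retrieved passages ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


def answer(query, corpus, generate, k=2):
    """Retrieve passages, then hand the grounded prompt to the generator.

    `generate` is a hypothetical callable (prompt -> str) standing in
    for whatever language model is used.
    """
    return generate(build_prompt(query, retrieve(query, corpus, k)))


corpus = [
    "Paris is the capital of France.",
    "The Nile is a river in Africa.",
    "Mount Everest is the tallest mountain.",
]
top = retrieve("What is the capital of France?", corpus, k=1)
# Echo the prompt back so the example runs without a model.
prompt = answer("What is the capital of France?", corpus, lambda p: p, k=1)
```

Passing the generator in as a function keeps the sketch independent of any particular model API; swapping the retriever for a vector index would leave `answer` unchanged.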

RAG is particularly useful for open-domain question answering, knowledge-grounded dialogue, and summarization. Because the retrieval step supplies external information at inference time, the model relies less on memorized knowledge and is better suited to generating responses from recent or domain-specific sources.

The performance of RAG systems is usually measured with precision, recall, and F1 score (for retrieval quality or answer overlap), BLEU (for generated text), and exact match (for short answers). Popular datasets for evaluating RAG models include Natural Questions, MS MARCO, TriviaQA, and SQuAD.
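Two of the answer-level metrics above, exact match and F1, can be sketched as follows. This is an illustrative version that assumes the SQuAD-style convention of normalizing answers (lowercasing, stripping punctuation and English articles) and computing F1 over tokens; other benchmarks may normalize differently. The function names are hypothetical.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction, gold):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


em = exact_match("The Eiffel Tower", "eiffel tower.")
f1 = token_f1("Eiffel Tower in Paris", "Eiffel Tower")
```

Here the over-long prediction still receives partial credit under F1 (precision 0.5, recall 1.0) while it would score zero under exact match, which is why the two metrics are commonly reported together.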

Papers

Showing 301–310 of 2111 papers

Title | Status | Hype
JuDGE: Benchmarking Judgment Document Generation for Chinese Legal System | Code | 1
Logic-RAG: Augmenting Large Multimodal Models with Visual-Spatial Knowledge for Road Scene Understanding | Code | 1
SePer: Measure Retrieval Utility Through The Lens Of Semantic Perplexity Reduction | Code | 1
GPIoT: Tailoring Small Language Models for IoT Program Synthesis and Development | Code | 1
SafeAuto: Knowledge-Enhanced Safe Autonomous Driving with Multimodal Foundation Models | Code | 1
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking | Code | 1
LexRAG: Benchmarking Retrieval-Augmented Generation in Multi-Turn Legal Consultation Conversation | Code | 1
Bridging Legal Knowledge and AI: Retrieval-Augmented Generation with Vector Stores, Knowledge Graphs, and Hierarchical Non-negative Matrix Factorization | Code | 1
EgoNormia: Benchmarking Physical Social Norm Understanding | Code | 1
ChineseEcomQA: A Scalable E-commerce Concept Evaluation Benchmark for Large Language Models | Code | 1
