SOTAVerified

TriviaQA

Papers

Showing 51-75 of 124 papers

| Title | Status | Hype |
| --- | --- | --- |
| From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data | Code | 0 |
| Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges | Code | 0 |
| CrAM: Credibility-Aware Attention Modification in LLMs for Combating Misinformation in RAG | Code | 0 |
| RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation | Code | 0 |
| LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models | Code | 0 |
| Accurate and Nuanced Open-QA Evaluation Through Textual Entailment | Code | 0 |
| KS-LLM: Knowledge Selection of Large Language Models with Evidence Document for Question Answering | | 0 |
| Mitigating LLM Hallucinations via Conformal Abstention | | 0 |
| FIT-RAG: Black-Box RAG with Factual Information and Token Reduction | | 0 |
| Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for LLM Web Agents | | 0 |
| Fine-Grained Self-Endorsement Improves Factuality and Reasoning | | 0 |
| The Generative AI Paradox on Evaluation: What It Can Solve, It May Not Evaluate | | 0 |
| Attendre: Wait To Attend By Retrieval With Evicted Queries in Memory-Based Transformers for Long Context Processing | | 0 |
| Efficient Transformer Knowledge Distillation: A Performance Review | | 0 |
| Noisy Pair Corrector for Dense Retrieval | | 0 |
| A Bias-Variance-Covariance Decomposition of Kernel Scores for Generative Models | Code | 0 |
| Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference | | 0 |
| When to Read Documents or QA History: On Unified and Selective Open-domain QA | | 0 |
| RFiD: Towards Rational Fusion-in-Decoder for Open-Domain Question Answering | Code | 0 |
| Allies: Prompting Large Language Model with Beam Search | Code | 0 |
| Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback | Code | 0 |
| CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing | Code | 0 |
| Quick Dense Retrievers Consume KALE: Post Training Kullback Leibler Alignment of Embeddings for Asymmetrical dual encoders | | 0 |
| Dense Sparse Retrieval: Using Sparse Language Models for Inference Efficient Dense Retrieval | | 0 |
| CLAM: Selective Clarification for Ambiguous Questions with Generative Language Models | | 0 |

No leaderboard results yet.