SOTAVerified

TriviaQA

Papers

Showing 1-50 of 124 papers

Title | Status | Hype
Retrieval-Augmented Generation as Noisy In-Context Learning: A Unified Theory and Risk Bounds | | 0
GenKI: Enhancing Open-Domain Question Answering with Knowledge Integration and Controllable Generation in Large Language Models | Code | 0
HASH-RAG: Bridging Deep Hashing with Retriever for Efficient, Fine Retrieval and Augmented Generation | | 0
Semantic Caching of Contextual Summaries for Efficient Question-Answering with Language Models | | 0
DYNAMAX: Dynamic computing for Transformers and Mamba based architectures | | 0
ShED-HD: A Shannon Entropy Distribution Framework for Lightweight Hallucination Detection on Edge Devices | | 0
CacheFocus: Dynamic Cache Re-Positioning for Efficient Retrieval-Augmented Generation | | 0
Cost-Saving LLM Cascades with Early Abstention | | 0
Self-Training Large Language Models for Tool-Use Without Demonstrations | | 0
Vision-centric Token Compression in Large Language Model | | 0
ASRank: Zero-Shot Re-Ranking with Answer Scent for Document Retrieval | | 0
Improving Generated and Retrieved Knowledge Combination Through Zero-shot Generation | | 0
DragonVerseQA: Open-Domain Long-Form Context-Aware Question-Answering | Code | 0
Layer Importance and Hallucination Analysis in Large Language Models via Enhanced Activation Variance-Sparsity | | 0
Addressing Uncertainty in LLMs to Enhance Reliability in Generative AI | | 0
Rethinking Data Synthesis: A Teacher Model Training Recipe with Interpretation | | 0
KV Prediction for Improved Time to First Token | Code | 0
Exploring Hint Generation Approaches in Open-Domain Question Answering | Code | 1
SFR-RAG: Towards Contextually Faithful LLMs | | 0
FastFiD: Improve Inference Efficiency of Open Domain Question Answering via Sentence Selection | Code | 1
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting | | 0
From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data | Code | 0
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges | Code | 0
CrAM: Credibility-Aware Attention Modification in LLMs for Combating Misinformation in RAG | Code | 0
RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation | Code | 0
Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs | Code | 1
LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models | Code | 0
Accurate and Nuanced Open-QA Evaluation Through Textual Entailment | Code | 0
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding | Code | 3
KS-LLM: Knowledge Selection of Large Language Models with Evidence Document for Question Answering | | 0
Mitigating LLM Hallucinations via Conformal Abstention | | 0
Multi-Granularity Guided Fusion-in-Decoder | Code | 1
FIT-RAG: Black-Box RAG with Factual Information and Token Reduction | | 0
Unfamiliar Finetuning Examples Control How Language Models Hallucinate | Code | 1
Harnessing Multi-Role Capabilities of Large Language Models for Open-Domain Question Answering | Code | 1
Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for LLM Web Agents | | 0
Fine-Grained Self-Endorsement Improves Factuality and Reasoning | | 0
The Generative AI Paradox on Evaluation: What It Can Solve, It May Not Evaluate | | 0
Attendre: Wait To Attend By Retrieval With Evicted Queries in Memory-Based Transformers for Long Context Processing | | 0
Efficient Transformer Knowledge Distillation: A Performance Review | | 0
Noisy Pair Corrector for Dense Retrieval | | 0
A Bias-Variance-Covariance Decomposition of Kernel Scores for Generative Models | Code | 0
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference | | 0
Generator-Retriever-Generator Approach for Open-Domain Question Answering | Code | 1
When to Read Documents or QA History: On Unified and Selective Open-domain QA | | 0
Exploiting Abstract Meaning Representation for Open-Domain Question Answering | Code | 1
RFiD: Towards Rational Fusion-in-Decoder for Open-Domain Question Answering | Code | 0
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback | Code | 0
Allies: Prompting Large Language Model with Beam Search | Code | 0
CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing | Code | 0

No leaderboard results yet.