SOTAVerified

TriviaQA

Papers

Showing 76-100 of 124 papers

Title | Status | Hype
RECONSIDER: Improved Re-Ranking using Span-Focused Cross-Attention for Open Domain Question Answering | - | 0
Relation-Guided Pre-Training for Open-Domain Question Answering | - | 0
Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for LLM Web Agents | - | 0
Rethinking Data Synthesis: A Teacher Model Training Recipe with Interpretation | - | 0
Self-Training Large Language Models for Tool-Use Without Demonstrations | - | 0
SFR-RAG: Towards Contextually Faithful LLMs | - | 0
ShED-HD: A Shannon Entropy Distribution Framework for Lightweight Hallucination Detection on Edge Devices | - | 0
Simple and Effective Semi-Supervised Question Answering | - | 0
SKILL: Structured Knowledge Infusion for Large Language Models | - | 0
Smarnet: Teaching Machines to Read and Comprehend Like Human | - | 0
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference | - | 0
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting | - | 0
Studying Strategically: Learning to Mask for Closed-book QA | - | 0
The Generative AI Paradox on Evaluation: What It Can Solve, It May Not Evaluate | - | 0
Tradeoffs in Sentence Selection Techniques for Open-Domain Question Answering | - | 0
UnitedQA: A Hybrid Approach for Open Domain Question Answering | - | 0
Vision-centric Token Compression in Large Language Model | - | 0
When to Read Documents or QA History: On Unified and Selective Open-domain QA | - | 0
Semantic Caching of Contextual Summaries for Efficient Question-Answering with Language Models | - | 0
Layer Importance and Hallucination Analysis in Large Language Models via Enhanced Activation Variance-Sparsity | - | 0
LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models | Code | 0
KV Prediction for Improved Time to First Token | Code | 0
RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation | Code | 0
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback | Code | 0
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges | Code | 0

No leaderboard results yet.