SOTAVerified

TruthfulQA

Papers

Showing 61-70 of 80 papers

Title | Status | Hype
NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models | Code | 0
A test suite of prompt injection attacks for LLM-based machine translation | Code | 0
Steering Without Side Effects: Improving Post-Deployment Control of Language Models | Code | 0
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics | Code | 0
When Hindsight is Not 20/20: Testing Limits on Reflective Thinking in Large Language Models | Code | 0
(WhyPHI) Fine-Tuning PHI-3 for Multiple-Choice Question Answering: Methodology, Results, and Challenges | Code | 0
Multi-Agent Reinforcement Learning with Focal Diversity Optimization | Code | 0
Measuring Reliability of Large Language Models through Semantic Consistency | Code | 0
metabench -- A Sparse Benchmark to Measure General Ability in Large Language Models | Code | 0
Instruction Tuning with Human Curriculum | Code | 0

No leaderboard results yet.