SOTAVerified

TruthfulQA

Papers

Showing 31–40 of 80 papers

| Title | Status | Hype |
| --- | --- | --- |
| metabench -- A Sparse Benchmark to Measure General Ability in Large Language Models | Code | 0 |
| NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models | Code | 0 |
| Truth Knows No Language: Evaluating Truthfulness Beyond English | Code | 0 |
| Truth Neurons | Code | 0 |
| DeLTa: A Decoding Strategy based on Logit Trajectory Prediction Improves Factuality and Reasoning Ability | Code | 0 |
| Unsupervised Elicitation of Language Models | Code | 0 |
| VarBench: Robust Language Model Benchmarking Through Dynamic Variable Perturbation | Code | 0 |
| Teaching language models to support answers with verified quotes | | 0 |
| Towards Multilingual LLM Evaluation for European Languages | | 0 |
| TruthFlow: Truthful LLM Generation via Representation Flow Correction | | 0 |

No leaderboard results yet.