SOTAVerified

TruthfulQA

Papers

Showing 31–40 of 80 papers

| Title | Status | Hype |
| --- | --- | --- |
| CHAIR -- Classifier of Hallucination as Improver | Code | 0 |
| A test suite of prompt injection attacks for LLM-based machine translation | Code | 0 |
| VarBench: Robust Language Model Benchmarking Through Dynamic Variable Perturbation | Code | 0 |
| When Hindsight is Not 20/20: Testing Limits on Reflective Thinking in Large Language Models | Code | 0 |
| Self-Evaluation Improves Selective Generation in Large Language Models | | 0 |
| Semantic Consistency for Assuring Reliability of Large Language Models | | 0 |
| Shadows in the Attention: Contextual Perturbation and Representation Drift in the Dynamics of Hallucination in LLMs | | 0 |
| SkillAggregation: Reference-free LLM-Dependent Aggregation | | 0 |
| Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs for Energy Efficiency, Output Accuracy, and Inference Latency | | 0 |
| Teaching language models to support answers with verified quotes | | 0 |

No leaderboard results yet.