TruthfulQA

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1–50 of 80 papers

Title	Date	Tasks	Status	Hype
Unsupervised Elicitation of Language Models	Jun 11, 2025	GSM8KTruthfulQA	—Unverified	0
Model Unlearning via Sparse Autoencoder Subspace Guided Projections	May 30, 2025	Adversarial Robustnessfeature selection	—Unverified	0
Shadows in the Attention: Contextual Perturbation and Representation Drift in the Dynamics of Hallucination in LLMs	May 22, 2025	HallucinationTruthfulQA	—Unverified	0
Truth Neurons	May 18, 2025	TruthfulQA	CodeCode Available	0
Elastic Weight Consolidation for Full-Parameter Continual Pre-Training of Gemma2	May 9, 2025	ARCBelebele	—Unverified	0
DYNAMAX: Dynamic computing for Transformers and Mamba based architectures	Apr 29, 2025	MambaTriviaQA	—Unverified	0
Efficient MAP Estimation of LLM Judgment Performance with Prior Transfer	Apr 17, 2025	Conformal PredictionTruthfulQA	—Unverified	0
Sample, Don't Search: Rethinking Test-Time Alignment for Language Models	Apr 4, 2025	GSM8KMathematical Reasoning	—Unverified	0
Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs for Energy Efficiency, Output Accuracy, and Inference Latency	Apr 4, 2025	BenchmarkingGSM8K	—Unverified	0
More is Less: The Pitfalls of Multi-Model Synthetic Preference Data in DPO Safety Alignment	Apr 3, 2025	ARCHellaSwag	—Unverified	0
When Persuasion Overrides Truth in Multi-Agent LLM Debates: Introducing a Confidence-Weighted Persuasion Override Rate (CW-POR)	Apr 1, 2025	Language ModelingLanguage Modelling	—Unverified	0
DeLTa: A Decoding Strategy based on Logit Trajectory Prediction Improves Factuality and Reasoning Ability	Mar 4, 2025	GSM8KLogical Reasoning	CodeCode Available	0
Obliviate: Efficient Unmemorization for Protecting Intellectual Property in Large Language Models	Feb 20, 2025	HellaSwagMemorization	—Unverified	0
Cost-Saving LLM Cascades with Early Abstention	Feb 13, 2025	GSM8KMMLU	—Unverified	0
Truth Knows No Language: Evaluating Truthfulness Beyond English	Feb 13, 2025	InformativenessMachine Translation	CodeCode Available	0
Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models	Feb 12, 2025	Mathematical ReasoningMMLU	—Unverified	0
Multi-Agent Reinforcement Learning with Focal Diversity Optimization	Feb 6, 2025	DiversityMulti-agent Reinforcement Learning	CodeCode Available	0
TruthFlow: Truthful LLM Generation via Representation Flow Correction	Feb 6, 2025	HallucinationTruthfulQA	—Unverified	0
CHAIR -- Classifier of Hallucination as Improver	Jan 5, 2025	HallucinationMMLU	CodeCode Available	0
(WhyPHI) Fine-Tuning PHI-3 for Multiple-Choice Question Answering: Methodology, Results, and Challenges	Jan 3, 2025	Multiple-choiceQuestion Answering	CodeCode Available	0
Monty Hall and Optimized Conformal Prediction to Improve Decision-Making with LLMs	Dec 31, 2024	Conformal PredictionDecision Making	—Unverified	0
Mitigating Adversarial Attacks in LLMs through Defensive Suffix Generation	Dec 18, 2024	TruthfulQA	—Unverified	0
Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low-Resource African Languages	Dec 1, 2024	ARCMultiple-choice	—Unverified	0
Layer Importance and Hallucination Analysis in Large Language Models via Enhanced Activation Variance-Sparsity	Nov 15, 2024	Contrastive LearningHallucination	—Unverified	0
Maintaining Informative Coherence: Migrating Hallucinations in Large Language Models via Absorbing Markov Chains	Oct 27, 2024	Text GenerationTruthfulQA	—Unverified	0
A Debate-Driven Experiment on LLM Hallucinations and Accuracy	Oct 25, 2024	Fact CheckingHallucination	—Unverified	0
Evaluating Consistencies in LLM responses through a Semantic Clustering of Question Answering	Oct 20, 2024	Language ModellingLarge Language Model	—Unverified	0
Iter-AHMCL: Alleviate Hallucination for Large Language Model via Iterative Model-level Contrastive Learning	Oct 16, 2024	Contrastive Learninggraph construction	—Unverified	0
SkillAggregation: Reference-free LLM-Dependent Aggregation	Oct 14, 2024	ChatbotHallucination	—Unverified	0
Benchmark Inflation: Revealing LLM Performance Gaps Using Retro-Holdouts	Oct 11, 2024	Holdout SetMisconceptions	—Unverified	0
NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models	Oct 11, 2024	Multiple-choiceTruthfulQA	CodeCode Available	0
Towards Multilingual LLM Evaluation for European Languages	Oct 11, 2024	ARCGSM8K	—Unverified	0
A test suite of prompt injection attacks for LLM-based machine translation	Oct 7, 2024	Machine TranslationTranslation	CodeCode Available	0
Efficiently Deploying LLMs with Controlled Risk	Oct 3, 2024	MMLUTruthfulQA	—Unverified	0
Integrative Decoding: Improve Factuality via Implicit Self-consistency	Oct 2, 2024	TruthfulQA	CodeCode Available	1
Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs	Sep 30, 2024	ARCDiversity	—Unverified	0
Selective Self-Rehearsal: A Fine-Tuning Approach to Improve Generalization in Large Language Models	Sep 7, 2024	MMLUTruthfulQA	—Unverified	0
Lower Layer Matters: Alleviating Hallucination via Multi-Layer Fusion Contrastive Decoding with Truthfulness Refocused	Aug 16, 2024	HallucinationTruthfulQA	—Unverified	0
LokiLM: Technical Report	Jul 10, 2024	Knowledge DistillationLanguage Modeling	—Unverified	0
metabench -- A Sparse Benchmark to Measure General Ability in Large Language Models	Jul 4, 2024	ARCGSM8K	CodeCode Available	0
VarBench: Robust Language Model Benchmarking Through Dynamic Variable Perturbation	Jun 25, 2024	ARCBenchmarking	CodeCode Available	0
Steering Without Side Effects: Improving Post-Deployment Control of Language Models	Jun 21, 2024	Red TeamingTruthfulQA	CodeCode Available	0
Enhancing Language Model Factuality via Activation-Based Confidence Calibration and Guided Decoding	Jun 19, 2024	Language ModelingLanguage Modelling	CodeCode Available	0
LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models	May 31, 2024	TriviaQATruthfulQA	CodeCode Available	0
Multi-Reference Preference Optimization for Large Language Models	May 26, 2024	GSM8KTruthfulQA	—Unverified	0
Machine Unlearning in Large Language Models	May 24, 2024	Machine UnlearningTruthfulQA	CodeCode Available	1
Instruction Tuning With Loss Over Instructions	May 23, 2024	HumanEvalMMLU	CodeCode Available	1
RLHF Workflow: From Reward Modeling to Online RLHF	May 13, 2024	ChatbotHumanEval	CodeCode Available	5
Harmonic LLMs are Trustworthy	Apr 30, 2024	HallucinationTruthfulQA	—Unverified	0
Student Data Paradox and Curious Case of Single Student-Tutor Model: Regressive Side Effects of Training LLMs for Personalized Learning	Apr 23, 2024	ARCCommon Sense Reasoning	—Unverified	0

Show:10 25 50

← PrevPage 1 of 2Next →

No leaderboard results yet.