| Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback | May 24, 2023 | TriviaQA, TruthfulQA | —Unverified | 0 | 0 |
| Layer Importance and Hallucination Analysis in Large Language Models via Enhanced Activation Variance-Sparsity | Nov 15, 2024 | Contrastive Learning, Hallucination | —Unverified | 0 | 0 |
| LokiLM: Technical Report | Jul 10, 2024 | Knowledge Distillation, Language Modeling | —Unverified | 0 | 0 |
| Lower Layer Matters: Alleviating Hallucination via Multi-Layer Fusion Contrastive Decoding with Truthfulness Refocused | Aug 16, 2024 | Hallucination, TruthfulQA | —Unverified | 0 | 0 |
| Maintaining Informative Coherence: Migrating Hallucinations in Large Language Models via Absorbing Markov Chains | Oct 27, 2024 | Text Generation, TruthfulQA | —Unverified | 0 | 0 |
| Mitigating Adversarial Attacks in LLMs through Defensive Suffix Generation | Dec 18, 2024 | TruthfulQA | —Unverified | 0 | 0 |
| Model Unlearning via Sparse Autoencoder Subspace Guided Projections | May 30, 2025 | Adversarial Robustness, feature selection | —Unverified | 0 | 0 |
| Monty Hall and Optimized Conformal Prediction to Improve Decision-Making with LLMs | Dec 31, 2024 | Conformal Prediction, Decision Making | —Unverified | 0 | 0 |
| More is Less: The Pitfalls of Multi-Model Synthetic Preference Data in DPO Safety Alignment | Apr 3, 2025 | ARC, HellaSwag | —Unverified | 0 | 0 |
| Multi-Reference Preference Optimization for Large Language Models | May 26, 2024 | GSM8K, TruthfulQA | —Unverified | 0 | 0 |