| Cost-Saving LLM Cascades with Early Abstention | Feb 13, 2025 | GSM8K, MMLU | —Unverified | 0 |
| LLMAuditor: A Framework for Auditing Large Language Models Using Human-in-the-Loop | Feb 14, 2024 | Hallucination, TruthfulQA | —Unverified | 0 |
| DYNAMAX: Dynamic computing for Transformers and Mamba based architectures | Apr 29, 2025 | Mamba, TriviaQA | —Unverified | 0 |
| A Debate-Driven Experiment on LLM Hallucinations and Accuracy | Oct 25, 2024 | Fact Checking, Hallucination | —Unverified | 0 |
| Efficient MAP Estimation of LLM Judgment Performance with Prior Transfer | Apr 17, 2025 | Conformal Prediction, TruthfulQA | —Unverified | 0 |
| Elastic Weight Consolidation for Full-Parameter Continual Pre-Training of Gemma2 | May 9, 2025 | ARC, Belebele | —Unverified | 0 |
| Evaluating Consistencies in LLM responses through a Semantic Clustering of Question Answering | Oct 20, 2024 | Language Modelling, Large Language Model | —Unverified | 0 |
| GRATH: Gradual Self-Truthifying for Large Language Models | Jan 22, 2024 | TruthfulQA | —Unverified | 0 |
| Harmonic LLMs are Trustworthy | Apr 30, 2024 | Hallucination, TruthfulQA | —Unverified | 0 |
| Instruction Tuning with Human Curriculum | Oct 14, 2023 | ARC, MMLU | —Unverified | 0 |