| Title | Date | Topics | Code |
| --- | --- | --- | --- |
| LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models | May 31, 2024 | TriviaQA, TruthfulQA | Code Available |
| Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback | May 24, 2023 | TriviaQA, TruthfulQA | Code Available |
| A test suite of prompt injection attacks for LLM-based machine translation | Oct 7, 2024 | Machine Translation, Translation | Code Available |
| Steering Without Side Effects: Improving Post-Deployment Control of Language Models | Jun 21, 2024 | Red Teaming, TruthfulQA | Code Available |
| NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models | Oct 11, 2024 | Multiple-choice, TruthfulQA | Code Available |
| SaGE: Evaluating Moral Consistency in Large Language Models | Feb 21, 2024 | Decision Making, HellaSwag | Code Available |
| PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics | Apr 6, 2024 | Benchmarking, Hallucination | Code Available |
| Instruction Tuning with Human Curriculum | Oct 14, 2023 | ARC, MMLU | Code Available |
| CHAIR -- Classifier of Hallucination as Improver | Jan 5, 2025 | Hallucination, MMLU | Code Available |
| Measuring Reliability of Large Language Models through Semantic Consistency | Nov 10, 2022 | Text Generation, TruthfulQA | Code Available |