| NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models | Oct 11, 2024 | Multiple-choice, TruthfulQA | Code Available | 0 |
| A test suite of prompt injection attacks for LLM-based machine translation | Oct 7, 2024 | Machine Translation, Translation | Code Available | 0 |
| Steering Without Side Effects: Improving Post-Deployment Control of Language Models | Jun 21, 2024 | Red Teaming, TruthfulQA | Code Available | 0 |
| PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics | Apr 6, 2024 | Benchmarking, Hallucination | Code Available | 0 |
| When Hindsight is Not 20/20: Testing Limits on Reflective Thinking in Large Language Models | Apr 14, 2024 | TruthfulQA | Code Available | 0 |
| (WhyPHI) Fine-Tuning PHI-3 for Multiple-Choice Question Answering: Methodology, Results, and Challenges | Jan 3, 2025 | Multiple-choice, Question Answering | Code Available | 0 |
| Multi-Agent Reinforcement Learning with Focal Diversity Optimization | Feb 6, 2025 | Diversity, Multi-agent Reinforcement Learning | Code Available | 0 |
| Measuring Reliability of Large Language Models through Semantic Consistency | Nov 10, 2022 | Text Generation, TruthfulQA | Code Available | 0 |
| metabench -- A Sparse Benchmark to Measure General Ability in Large Language Models | Jul 4, 2024 | ARC, GSM8K | Code Available | 0 |
| Instruction Tuning with Human Curriculum | Oct 14, 2023 | ARC, MMLU | Code Available | 0 |