| NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models | Oct 11, 2024 | Multiple-choiceTruthfulQA | CodeCode Available | 0 |
| Towards Multilingual LLM Evaluation for European Languages | Oct 11, 2024 | ARCGSM8K | —Unverified | 0 |
| A test suite of prompt injection attacks for LLM-based machine translation | Oct 7, 2024 | Machine TranslationTranslation | CodeCode Available | 0 |
| Efficiently Deploying LLMs with Controlled Risk | Oct 3, 2024 | MMLUTruthfulQA | —Unverified | 0 |
| Integrative Decoding: Improve Factuality via Implicit Self-consistency | Oct 2, 2024 | TruthfulQA | CodeCode Available | 1 |
| Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs | Sep 30, 2024 | ARCDiversity | —Unverified | 0 |
| Selective Self-Rehearsal: A Fine-Tuning Approach to Improve Generalization in Large Language Models | Sep 7, 2024 | MMLUTruthfulQA | —Unverified | 0 |
| Lower Layer Matters: Alleviating Hallucination via Multi-Layer Fusion Contrastive Decoding with Truthfulness Refocused | Aug 16, 2024 | HallucinationTruthfulQA | —Unverified | 0 |
| LokiLM: Technical Report | Jul 10, 2024 | Knowledge DistillationLanguage Modeling | —Unverified | 0 |
| metabench -- A Sparse Benchmark to Measure General Ability in Large Language Models | Jul 4, 2024 | ARCGSM8K | CodeCode Available | 0 |