| Evaluating Language Model Finetuning Techniques for Low-resource Languages | Jun 30, 2019 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| AuditWen:An Open-Source Large Language Model for Audit | Oct 9, 2024 | Answer GenerationLanguage Modeling | CodeCode Available | 1 |
| EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees | Mar 11, 2025 | ChatbotLanguage Modeling | CodeCode Available | 1 |
| An Engorgio Prompt Makes Large Language Model Babble on | Dec 27, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Evaluating Human-Language Model Interaction | Dec 19, 2022 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Mathfish: Evaluating Language Model Math Reasoning via Grounding in Educational Curricula | Aug 8, 2024 | GSM8KLanguage Modeling | CodeCode Available | 1 |
| Language Model Unalignment: Parametric Red-Teaming to Expose Hidden Harms and Biases | Oct 22, 2023 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Language Model Uncertainty Quantification with Attention Chain | Mar 24, 2025 | Computational EfficiencyLanguage Modeling | CodeCode Available | 1 |
| Estimating Contamination via Perplexity: Quantifying Memorisation in Language Model Evaluation | Sep 19, 2023 | Language Model EvaluationLanguage Modeling | CodeCode Available | 1 |
| BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding | Mar 27, 2025 | FormLanguage Modeling | CodeCode Available | 1 |