| Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models | Jul 3, 2024 | Language Modelling, Large Language Model | Code Available | 1 | 5 |
| AstroAgents: A Multi-Agent AI for Hypothesis Generation from Mass Spectrometry Data | Mar 29, 2025 | Large Language Model | Code Available | 1 | 5 |
| CityBench: Evaluating the Capabilities of Large Language Models for Urban Tasks | Jun 20, 2024 | General Knowledge, Human Dynamics | Code Available | 1 | 5 |
| DefenderBench: A Toolkit for Evaluating Language Agents in Cybersecurity Environments | May 31, 2025 | Large Language Model | Code Available | 1 | 5 |
| Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences | Jan 19, 2024 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| MEPNet: Medical Entity-balanced Prompting Network for Brain CT Report Generation | Mar 22, 2025 | Anatomy, Large Language Model | Code Available | 1 | 5 |
| Motif: Intrinsic Motivation from Artificial Intelligence Feedback | Sep 29, 2023 | Decision Making, Language Modeling | Code Available | 1 | 5 |
| CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing | Feb 4, 2025 | Collaborative Inference, Language Modeling | Code Available | 1 | 5 |
| Citekit: A Modular Toolkit for Large Language Model Citation Generation | Aug 6, 2024 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Explaining Relationships Between Scientific Documents | Feb 2, 2020 | Language Modeling, Language Modelling | Code Available | 1 | 5 |