| LLM-BABYBENCH: Understanding and Evaluating Grounded Planning and Reasoning in LLMs | May 17, 2025 | Task 2 | Code Available | 0 |
| TUMS: Enhancing Tool-use Abilities of LLMs with Multi-structure Handlers | May 13, 2025 | Natural Language Understanding, Task 2 | Unverified | 0 |
| Team ACK at SemEval-2025 Task 2: Beyond Word-for-Word Machine Translation for English-Korean Pairs | Apr 29, 2025 | Machine Translation, Task 2 | Unverified | 0 |
| Feature Fusion Revisited: Multimodal CTR Prediction for MMCTR Challenge | Apr 26, 2025 | Click-Through Rate Prediction, Information Retrieval | Code Available | 0 |
| BadMoE: Backdooring Mixture-of-Experts LLMs via Optimizing Routing Triggers and Infecting Dormant Experts | Apr 24, 2025 | Backdoor Attack, Mixture-of-Experts | Unverified | 0 |
| Data Augmentation Using Neural Acoustic Fields With Retrieval-Augmented Pre-training | Apr 19, 2025 | Data Augmentation, Retrieval | Unverified | 0 |
| HausaNLP at SemEval-2025 Task 2: Entity-Aware Fine-tuning vs. Prompt Engineering in Entity-Aware Machine Translation | Mar 25, 2025 | Machine Translation, Prompt Engineering | Unverified | 0 |
| Towards Universal Learning-based Model for Cardiac Image Reconstruction: Summary of the CMRxRecon2024 Challenge | Mar 5, 2025 | Benchmarking, Image Reconstruction | Code Available | 0 |
| Bridging vision language model (VLM) evaluation gaps with a framework for scalable and cost-effective benchmark generation | Feb 21, 2025 | Benchmarking, Language Modeling | Unverified | 0 |
| Fine-Tuning Open-Source Large Language Models to Improve Their Performance on Radiation Oncology Tasks: A Feasibility Study to Investigate Their Potential Clinical Applications in Radiation Oncology | Jan 28, 2025 | Diagnostic, Task 2 | Unverified | 0 |