| MedArabiQ: Benchmarking Large Language Models on Arabic Medical Tasks | May 6, 2025 | Benchmarking, Multiple-choice | Code Available | 0 |
| ReGraP-LLaVA: Reasoning enabled Graph-based Personalized Large Language and Vision Assistant | May 6, 2025 | Descriptive, Multiple-choice | Code Available | 0 |
| Unlearning vs. Obfuscation: Are We Truly Removing Knowledge? | May 5, 2025 | Multiple-choice | Unverified | 0 |
| Developing A Framework to Support Human Evaluation of Bias in Generated Free Response Text | May 5, 2025 | Multiple-choice | Unverified | 0 |
| LLM-based Text Simplification and its Effect on User Comprehension and Cognitive Load | May 4, 2025 | Articles, Multiple-choice | Unverified | 0 |
| LookAlike: Consistent Distractor Generation in Math MCQs | May 3, 2025 | Distractor Generation, Math | Unverified | 0 |
| Harnessing Structured Knowledge: A Concept Map-Based Approach for High-Quality Multiple Choice Question Generation with Effective Distractors | May 2, 2025 | High School Physics, Misconceptions | Code Available | 0 |
| Adaptive Wizard for Removing Cross-Tier Misconfigurations in Active Directory | May 2, 2025 | Multiple-choice | Unverified | 0 |
| SARI: Structured Audio Reasoning via Curriculum-Guided Reinforcement Learning | Apr 22, 2025 | Multiple-choice, reinforcement-learning | Unverified | 0 |
| LongPerceptualThoughts: Distilling System-2 Reasoning for System-1 Perception | Apr 21, 2025 | Math, MMLU | Unverified | 0 |