| Benchmarking MOEAs for solving continuous multi-objective RL problems | May 19, 2025 | Benchmarking, Evolutionary Algorithms | Code Available | 0 |
| SzCORE as a benchmark: report from the seizure detection challenge at the 2025 AI in Epilepsy and Neurological Disorders Conference | May 19, 2025 | Benchmarking, EEG | Unverified | 0 |
| HR-VILAGE-3K3M: A Human Respiratory Viral Immunization Longitudinal Gene Expression Dataset for Systems Immunity | May 19, 2025 | Benchmarking, feature selection | Code Available | 0 |
| Disambiguation in Conversational Question Answering in the Era of LLM: A Survey | May 18, 2025 | Benchmarking, Conversational Question Answering | Unverified | 0 |
| ChemPile: A 250GB Diverse and Curated Dataset for Chemical Foundation Models | May 18, 2025 | Articles, Benchmarking | Unverified | 0 |
| Can Large Multimodal Models Understand Agricultural Scenes? Benchmarking with AgroMind | May 18, 2025 | Benchmarking, Scene Understanding | Unverified | 0 |
| CompBench: Benchmarking Complex Instruction-guided Image Editing | May 18, 2025 | Benchmarking, Instruction Following | Unverified | 0 |
| OSS-Bench: Benchmark Generator for Coding LLMs | May 18, 2025 | Benchmarking | Code Available | 0 |
| GLOVER++: Unleashing the Potential of Affordance Learning from Human Behaviors for Robotic Manipulation | May 17, 2025 | Benchmarking | Unverified | 0 |
| SoftPQ: Robust Instance Segmentation Evaluation via Soft Matching and Tunable Thresholds | May 17, 2025 | Benchmarking, Binary Classification | Code Available | 0 |