| WCEbleedGen: A wireless capsule endoscopy dataset and its benchmarking for automatic bleeding classification, detection, and segmentation | Aug 22, 2024 | Benchmarking, Classification | Code Available | 0 |
| Advances in Preference-based Reinforcement Learning: A Review | Aug 21, 2024 | Benchmarking, reinforcement-learning | —Unverified | 0 |
| SimBench: A Rule-Based Multi-Turn Interaction Benchmark for Evaluating an LLM's Ability to Generate Digital Twins | Aug 21, 2024 | Benchmarking | Code Available | 0 |
| WeQA: A Benchmark for Retrieval Augmented Generation in Wind Energy Domain | Aug 21, 2024 | Answer Generation, Benchmarking | —Unverified | 0 |
| ISLES'24: Improving final infarct prediction in ischemic stroke using multimodal imaging and clinical data | Aug 20, 2024 | Benchmarking | —Unverified | 0 |
| UKAN: Unbound Kolmogorov-Arnold Network Accompanied with Accelerated Library | Aug 20, 2024 | Benchmarking, Computational Efficiency | —Unverified | 0 |
| Benchmarking Large Language Models for Math Reasoning Tasks | Aug 20, 2024 | Benchmarking, In-Context Learning | Code Available | 0 |
| PerturBench: Benchmarking Machine Learning Models for Cellular Perturbation Analysis | Aug 20, 2024 | Benchmarking | Code Available | 2 |
| RP1M: A Large-Scale Motion Dataset for Piano Playing with Bi-Manual Dexterous Robot Hands | Aug 20, 2024 | Benchmarking, Contact-rich Manipulation | —Unverified | 0 |
| QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning | Aug 20, 2024 | Benchmarking, Language Modelling | —Unverified | 0 |