| MarineGym: A High-Performance Reinforcement Learning Platform for Underwater Robotics | Mar 12, 2025 | Benchmarking, GPU | Unverified | 0 |
| SciHorizon: Benchmarking AI-for-Science Readiness from Scientific Data to Large Language Models | Mar 12, 2025 | Benchmarking, Fairness | Unverified | 0 |
| Ev-Layout: A Large-scale Event-based Multi-modal Dataset for Indoor Layout Estimation and Tracking | Mar 11, 2025 | Benchmarking | Unverified | 0 |
| Comprehensive Benchmarking of Machine Learning Methods for Risk Prediction Modelling from Large-Scale Survival Data: A UK Biobank Study | Mar 11, 2025 | Benchmarking | Unverified | 0 |
| Large Language Models for Outpatient Referral: Problem Definition, Benchmarking and Challenges | Mar 11, 2025 | Benchmarking | Code Available | 0 |
| Integration of nested cross-validation, automated hyperparameter optimization, high-performance computing to reduce and quantify the variance of test performance estimation of deep learning models | Mar 11, 2025 | Benchmarking, Hyperparameter Optimization | Code Available | 0 |
| ResBench: Benchmarking LLM-Generated FPGA Designs with Resource Awareness | Mar 11, 2025 | Benchmarking, Code Generation | Unverified | 0 |
| Benchmarking Chinese Medical LLMs: A Medbench-based Analysis of Performance Gaps and Hierarchical Optimization Strategies | Mar 10, 2025 | Benchmarking, Ethics | Unverified | 0 |
| Skelite: Compact Neural Networks for Efficient Iterative Skeletonization | Mar 10, 2025 | Benchmarking, Computational Efficiency | Code Available | 0 |
| Towards Large Language Models that Benefit for All: Benchmarking Group Fairness in Reward Models | Mar 10, 2025 | All, Benchmarking | Unverified | 0 |