| A Roadmap for Improving Data Reliability and Sharing in Crosslinking Mass Spectrometry | Apr 9, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Unsupervised Single Image Deraining with Self-supervised Constraints | Nov 21, 2018 | Benchmarking, Generative Adversarial Network | —Unverified | 0 | 0 |
| Robust 2D/3D Vehicle Parsing in CVIS | Mar 11, 2021 | Benchmarking, Data Augmentation | —Unverified | 0 | 0 |
| A Risk Taxonomy for Evaluating AI-Powered Psychotherapy Agents | May 21, 2025 | Benchmarking, Decompensation | —Unverified | 0 | 0 |
| A rigorous benchmarking of methods for SARS-CoV-2 lineage abundance estimation in wastewater | Sep 29, 2023 | Benchmarking | —Unverified | 0 | 0 |
| Unsupervised Spectral Demosaicing with Lightweight Spectral Attention Networks | Jul 5, 2023 | Benchmarking, Demosaicking | —Unverified | 0 | 0 |
| Are We Ready for Service Robots? The OpenLORIS-Scene Datasets for Lifelong SLAM | Nov 13, 2019 | Benchmarking, Pose Estimation | —Unverified | 0 | 0 |
| Robust measurement of innovation performances in Europe with a hierarchy of interacting composite indicators | May 18, 2019 | Benchmarking, Decision Making | —Unverified | 0 | 0 |
| Robust Medical Instrument Segmentation Challenge 2019 | Mar 23, 2020 | Benchmarking, Instance Segmentation | —Unverified | 0 | 0 |
| RobustMQ: Benchmarking Robustness of Quantized Models | Aug 4, 2023 | Adversarial Robustness, Benchmarking | —Unverified | 0 | 0 |
| Are we making progress in unlearning? Findings from the first NeurIPS unlearning competition | Jun 13, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Robustness of Reinforcement Learning-Based Traffic Signal Control under Incidents: A Comparative Study | Jun 16, 2025 | Benchmarking, Traffic Signal Control | —Unverified | 0 | 0 |
| A Review of Reinforcement Learning in Financial Applications | Nov 1, 2024 | Benchmarking, Decision Making | —Unverified | 0 | 0 |
| Robust Salient Object Detection on Compressed Images Using Convolutional Neural Networks | Sep 20, 2024 | Benchmarking, object-detection | —Unverified | 0 | 0 |
| A Review of Intelligent Music Generation Systems | Nov 16, 2022 | Benchmarking, Music Generation | —Unverified | 0 | 0 |
| RobustSpring: Benchmarking Robustness to Image Corruptions for Optical Flow, Scene Flow and Stereo | May 14, 2025 | Benchmarking, Optical Flow Estimation | —Unverified | 0 | 0 |
| Robust Vision Challenge 2020 -- 1st Place Report for Panoptic Segmentation | Aug 23, 2020 | Benchmarking, Panoptic Segmentation | —Unverified | 0 | 0 |
| A review of faithfulness metrics for hallucination assessment in Large Language Models | Dec 31, 2024 | Benchmarking, Hallucination | —Unverified | 0 | 0 |
| A Review of Deep Reinforcement Learning in Serverless Computing: Function Scheduling and Resource Auto-Scaling | Oct 5, 2023 | Benchmarking, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| Unsupervised Synthetic Image Refinement via Contrastive Learning and Consistent Semantic-Structural Constraints | Apr 25, 2023 | Benchmarking, Contrastive Learning | —Unverified | 0 | 0 |
| A Review of Bayesian Uncertainty Quantification in Deep Probabilistic Image Segmentation | Nov 25, 2024 | Active Learning, Bayesian Inference | —Unverified | 0 | 0 |
| A Review of 315 Benchmark and Test Functions for Machine Learning Optimization Algorithms and Metaheuristics with Mathematical and Visual Descriptions | Jun 13, 2024 | Benchmarking | —Unverified | 0 | 0 |
| A Retrospective on the Robot Air Hockey Challenge: Benchmarking Robust, Reliable, and Safe Learning Techniques for Real-world Robotics | Nov 8, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Are SNNs Truly Energy-efficient? - A Hardware Perspective | Sep 6, 2023 | Benchmarking | —Unverified | 0 | 0 |
| WILD: a new in-the-Wild Image Linkage Dataset for synthetic image attribution | Apr 28, 2025 | Benchmarking, Image Attribution | —Unverified | 0 | 0 |
| RP1M: A Large-Scale Motion Dataset for Piano Playing with Bi-Manual Dexterous Robot Hands | Aug 20, 2024 | Benchmarking, Contact-rich Manipulation | —Unverified | 0 | 0 |
| A Report on the 2020 Sarcasm Detection Shared Task | May 12, 2020 | Benchmarking, Sarcasm Detection | —Unverified | 0 | 0 |
| RRSIS: Referring Remote Sensing Image Segmentation | Jun 14, 2023 | Benchmarking, Image Segmentation | —Unverified | 0 | 0 |
| A Report on the 2018 VUA Metaphor Detection Shared Task | Jun 1, 2018 | Benchmarking | —Unverified | 0 | 0 |
| Arena-Web -- A Web-based Development and Benchmarking Platform for Autonomous Navigation Approaches | Feb 6, 2023 | Autonomous Navigation, Benchmarking | —Unverified | 0 | 0 |
| RT-Pose: A 4D Radar Tensor-based 3D Human Pose Estimation and Localization Benchmark | Jul 18, 2024 | 3D Human Pose Estimation, Benchmarking | —Unverified | 0 | 0 |
| Unveiling the potential of large language models in generating semantic and cross-language clones | Sep 12, 2023 | Benchmarking, Code Generation | —Unverified | 0 | 0 |
| Arena 4.0: A Comprehensive ROS2 Development and Benchmarking Platform for Human-centric Navigation Using Generative-Model-based Environment Generation | Sep 19, 2024 | Benchmarking, Social Navigation | —Unverified | 0 | 0 |
| Rule-based Data Selection for Large Language Models | Oct 7, 2024 | Benchmarking, Math | —Unverified | 0 | 0 |
| A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach | Mar 10, 2022 | Benchmarking, Sentence | —Unverified | 0 | 0 |
| Are Large Language Models Reliable Judges? A Study on the Factuality Evaluation Capabilities of LLMs | Nov 1, 2023 | Benchmarking, Question Answering | —Unverified | 0 | 0 |
| RxRx3-core: Benchmarking drug-target interactions in High-Content Microscopy | Mar 26, 2025 | Benchmarking, Representation Learning | —Unverified | 0 | 0 |
| A Reinforcement Learning Environment for Directed Quantum Circuit Synthesis | Jan 13, 2024 | Benchmarking, reinforcement-learning | —Unverified | 0 | 0 |
| UPREVE: An End-to-End Causal Discovery Benchmarking System | Jul 25, 2023 | Benchmarking, Causal Discovery | —Unverified | 0 | 0 |
| Urania: Differentially Private Insights into AI Use | Jun 5, 2025 | Benchmarking, Chatbot | —Unverified | 0 | 0 |
| Sadeed: Advancing Arabic Diacritization Through Small Language Model | Apr 30, 2025 | Arabic Text Diacritization, Benchmarking | —Unverified | 0 | 0 |
| Safe Load Balancing in Software-Defined-Networking | Oct 22, 2024 | Benchmarking, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces | Mar 8, 2025 | Benchmarking, counterfactual | —Unverified | 0 | 0 |
| A Real-time Spatio-Temporal Trajectory Planner for Autonomous Vehicles with Semantic Graph Optimization | Feb 25, 2025 | Autonomous Vehicles, Benchmarking | —Unverified | 0 | 0 |
| MAPS: Multi-Fidelity AI-Augmented Photonic Simulation and Inverse Design Infrastructure | Mar 2, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Are All Steps Equally Important? Benchmarking Essentiality Detection of Events | Oct 8, 2022 | All, Benchmarking | —Unverified | 0 | 0 |
| A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification | Jul 16, 2024 | Benchmarking, Few-Shot Learning | —Unverified | 0 | 0 |
| SAIBench: A Structural Interpretation of AI for Science Through Benchmarks | Nov 29, 2023 | Benchmarking, Computational Efficiency | —Unverified | 0 | 0 |
| SAIBench: Benchmarking AI for Science | Jun 11, 2022 | Benchmarking, Friction | —Unverified | 0 | 0 |
| Saliency Benchmarking Made Easy: Separating Models, Maps and Metrics | Apr 27, 2017 | All, Benchmarking | —Unverified | 0 | 0 |