| On the Use of Quality Diversity Algorithms for The Traveling Thief Problem | Dec 16, 2021 | Benchmarking, Diversity | —Unverified | 0 | 0 |
| On the Utility of Equivariance and Symmetry Breaking in Deep Learning Architectures on Point Clouds | Jan 1, 2025 | Benchmarking | —Unverified | 0 | 0 |
| On the Value of ML Models | Dec 13, 2021 | Benchmarking | —Unverified | 0 | 0 |
| TransLaw: Benchmarking Large Language Models in Multi-Agent Simulation of the Collaborative Translation | Jul 1, 2025 | Benchmarking, Machine Translation | —Unverified | 0 | 0 |
| ACT-Bench: Towards Action Controllable World Models for Autonomous Driving | Dec 6, 2024 | Autonomous Driving, Benchmarking | —Unverified | 0 | 0 |
| OOD-CV-v2: An extended Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images | Apr 17, 2023 | 3D Pose Estimation, Benchmarking | —Unverified | 0 | 0 |
| OODFace: Benchmarking Robustness of Face Recognition under Common Corruptions and Appearance Variations | Dec 3, 2024 | Benchmarking, Face Recognition | —Unverified | 0 | 0 |
| Benchmarking Answer Verification Methods for Question Answering-Based Summarization Evaluation Metrics | Sep 17, 2021 | Attribute, Benchmarking | —Unverified | 0 | 0 |
| OOD-Speech: A Large Bengali Speech Recognition Dataset for Out-of-Distribution Benchmarking | May 15, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Benchmarking and Validation of Sub-mW 30GHz VG-LNAs in 22nm FDSOI CMOS for 5G/6G Phased-Array Receivers | Sep 11, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Benchmarking and Pushing the Multi-Bias Elimination Boundary of LLMs via Causal Effect Estimation-guided Debiasing | May 22, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Benchmarking and Performance Modelling of MapReduce Communication Pattern | May 23, 2020 | Benchmarking | —Unverified | 0 | 0 |
| TransOpt: Transformer-based Representation Learning for Optimization Problem Classification | Nov 29, 2023 | Benchmarking, Classification | —Unverified | 0 | 0 |
| Benchmarking and Optimization of Gradient Boosting Decision Tree Algorithms | Sep 12, 2018 | Bayesian Optimization, Benchmarking | —Unverified | 0 | 0 |
| Open-CD: A Comprehensive Toolbox for Change Detection | Jul 22, 2024 | Benchmarking, Change Detection | —Unverified | 0 | 0 |
| Benchmarking and Learning Multi-Dimensional Quality Evaluator for Text-to-3D Generation | Dec 15, 2024 | 3D Generation, Benchmarking | —Unverified | 0 | 0 |
| OpenContrails: Benchmarking Contrail Detection on GOES-16 ABI | Apr 4, 2023 | Benchmarking | —Unverified | 0 | 0 |
| Open Datasets for Satellite Radio Resource Control | Apr 22, 2024 | Benchmarking, Decision Making | —Unverified | 0 | 0 |
| Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors | Sep 29, 2023 | Benchmarking, Computational Efficiency | —Unverified | 0 | 0 |
| OpenDeception: Benchmarking and Investigating AI Deceptive Behaviors via Open-ended Interaction Simulation | Apr 18, 2025 | Benchmarking | —Unverified | 0 | 0 |
| TransportationGames: Benchmarking Transportation Knowledge of (Multimodal) Large Language Models | Jan 9, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Relation Extraction Across Entire Books to Reconstruct Community Networks: The AffilKG Datasets | May 16, 2025 | Benchmarking, Knowledge Graphs | —Unverified | 0 | 0 |
| OpenDPD: An Open-Source End-to-End Learning & Benchmarking Framework for Wideband Power Amplifier Modeling and Digital Pre-Distortion | Jan 16, 2024 | Benchmarking | —Unverified | 0 | 0 |
| OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety | Mar 18, 2024 | Benchmarking, Mathematical Reasoning | —Unverified | 0 | 0 |
| Benchmarking and Improving Generator-Validator Consistency of Language Models | Oct 3, 2023 | Benchmarking, Instruction Following | —Unverified | 0 | 0 |