| Benchmarking Systematic Relational Reasoning with Large Language and Reasoning Models | Mar 30, 2025 | Benchmarking, Relational Reasoning | —Unverified | 0 |
| Benchmarking symbolic regression constant optimization schemes | Dec 3, 2024 | Benchmarking, regression | —Unverified | 0 |
| Benchmarking Surrogate-Assisted Genetic Recommender Systems | Aug 8, 2019 | Benchmarking, Evolutionary Algorithms | —Unverified | 0 |
| A Unified Framework for Provably Efficient Algorithms to Estimate Shapley Values | Jun 5, 2025 | Benchmarking | —Unverified | 0 |
| A large-scale, physically-based synthetic dataset for satellite pose estimation | Jun 15, 2025 | Benchmarking, Dataset Generation | —Unverified | 0 |
| Benchmarking Super-Resolution Algorithms on Real Data | Sep 8, 2017 | Benchmarking, Super-Resolution | —Unverified | 0 |
| A Unified Framework and Dataset for Assessing Societal Bias in Vision-Language Models | Feb 21, 2024 | Benchmarking, Image to text | —Unverified | 0 |
| A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects | Jun 16, 2025 | Benchmarking, Instance Segmentation | —Unverified | 0 |
| Stereotype Detection in LLMs: A Multiclass, Explainable, and Benchmark-Driven Approach | Apr 2, 2024 | Benchmarking, Common Sense Reasoning | —Unverified | 0 |
| Benchmarking Sub-Genre Classification For Mainstage Dance Music | Sep 10, 2024 | Benchmarking, Classification | —Unverified | 0 |
| A large-scale heterogeneous 3D magnetic resonance brain imaging dataset for self-supervised learning | Jun 17, 2025 | Benchmarking, Self-Supervised Learning | —Unverified | 0 |
| Deep Reinforcement Learning for Dynamic Order Picking in Warehouse Operations | Aug 3, 2024 | Benchmarking, Deep Reinforcement Learning | —Unverified | 0 |
| Geometry-Based Next Frame Prediction from Monocular Video | Sep 20, 2016 | Autonomous Driving, Benchmarking | —Unverified | 0 |
| Geospatial Foundation Models to Enable Progress on Sustainable Development Goals | May 30, 2025 | Benchmarking, Earth Observation | —Unverified | 0 |
| Global Rice Multi-Class Segmentation Dataset (RiceSEG): A Comprehensive and Diverse High-Resolution RGB-Annotated Images for the Development and Benchmarking of Rice Segmentation Algorithms | Apr 2, 2025 | Benchmarking, Semantic Segmentation | —Unverified | 0 |
| Variational Laplace for Bayesian neural networks | Nov 20, 2020 | Benchmarking, Variational Inference | —Unverified | 0 |
| Benchmarking state-of-the-art gradient boosting algorithms for classification | May 26, 2023 | Bayesian Optimization, Benchmarking | —Unverified | 0 |
| Audio-Visual Class-Incremental Learning for Fish Feeding intensity Assessment in Aquaculture | Apr 21, 2025 | Benchmarking, class-incremental learning | —Unverified | 0 |
| Benchmarking State-of-the-Art Deep Learning Software Tools | Aug 25, 2016 | Benchmarking, CPU | —Unverified | 0 |
| A Large-Scale Evaluation of Speech Foundation Models | Apr 15, 2024 | Benchmarking | —Unverified | 0 |
| Benchmarking Spiking Neural Network Learning Methods with Varying Locality | Feb 1, 2024 | Benchmarking | —Unverified | 0 |
| A Large-scale Evaluation of Pretraining Paradigms for the Detection of Defects in Electroluminescence Solar Cell Images | Feb 27, 2024 | Benchmarking, Defect Detection | —Unverified | 0 |
| A2Perf: Real-World Autonomous Agents Benchmark | Mar 4, 2025 | Benchmarking, Combinatorial Optimization | —Unverified | 0 |
| A 28-nm Convolutional Neuromorphic Processor Enabling Online Learning with Spike-Based Retinas | May 13, 2020 | Benchmarking, Edge-computing | —Unverified | 0 |
| Benchmarking sparse system identification with low-dimensional chaos | Feb 4, 2023 | Benchmarking | —Unverified | 0 |
| Benchmarking SMT Performance for Farsi Using the TEP++ Corpus | May 1, 2015 | Benchmarking, Machine Translation | —Unverified | 0 |
| A Two-Step Framework for Multi-Material Decomposition of Dual Energy Computed Tomography from Projection Domain | Oct 31, 2023 | Benchmarking, Diagnostic | —Unverified | 0 |
| Benchmarking Smoothness and Reducing High-Frequency Oscillations in Continuous Control Policies | Oct 22, 2024 | Benchmarking, continuous-control | —Unverified | 0 |
| A Two-Stage Neural-Filter Pareto Front Extractor and the need for Benchmarking | Sep 29, 2021 | Benchmarking, Multi-Task Learning | —Unverified | 0 |
| Benchmarking Single-Image Reflection Removal Algorithms | Oct 1, 2017 | Benchmarking, Reflection Removal | —Unverified | 0 |
| A tutorial on multi-view autoencoders using the multi-view-AE library | Mar 12, 2024 | Benchmarking | —Unverified | 0 |
| Attention versus Contrastive Learning of Tabular Data -- A Data-centric Benchmarking | Jan 8, 2024 | Benchmarking, Contrastive Learning | —Unverified | 0 |
| Benchmarking simulated and physical quantum processing units using quantum and hybrid algorithms | Nov 28, 2022 | Benchmarking | —Unverified | 0 |
| A Comprehensive Study on the Robustness of Image Classification and Object Detection in Remote Sensing: Surveying and Benchmarking | Jun 21, 2023 | Adversarial Robustness, Benchmarking | —Unverified | 0 |
| Benchmarking Shadow Removal for Facial Landmark Detection and Beyond | Nov 27, 2021 | Benchmarking, Blocking | —Unverified | 0 |
| A Large-scale Class-level Benchmark Dataset for Code Generation with LLMs | Apr 22, 2025 | Benchmarking, Class-level Code Generation | —Unverified | 0 |
| Benchmarking Sensitivity of Continual Graph Learning for Skeleton-Based Action Recognition | Jan 31, 2024 | Action Recognition, Benchmarking | —Unverified | 0 |
| GenSpace: Benchmarking Spatially-Aware Image Generation | May 30, 2025 | Benchmarking, Image Generation | —Unverified | 0 |
| A Large-Scale Analysis on Self-Supervised Video Representation Learning | Jun 9, 2023 | Benchmarking, Representation Learning | —Unverified | 0 |
| A Large-scale Benchmark on Geological Fault Delineation Models: Domain Shift, Training Dynamics, Generalizability, Evaluation and Inferential Behavior | May 13, 2025 | Benchmarking, Seismic Interpretation | —Unverified | 0 |
| On the Evaluation of Engineering Artificial General Intelligence | May 15, 2025 | Benchmarking | —Unverified | 0 |
| Genicious: Contextual Few-shot Prompting for Insights Discovery | Mar 15, 2025 | Benchmarking, Decision Making | —Unverified | 0 |
| GenTel-Safe: A Unified Benchmark and Shielding Framework for Defending Against Prompt Injection Attacks | Sep 29, 2024 | Benchmarking | —Unverified | 0 |
| Benchmarking Scientific Image Forgery Detectors | May 26, 2021 | Benchmarking | —Unverified | 0 |
| Benchmarking Scene Text Recognition in Devanagari, Telugu and Malayalam | Apr 9, 2021 | Benchmarking, Scene Text Recognition | —Unverified | 0 |
| Benchmarking Sample Selection Strategies for Batch Reinforcement Learning | Sep 29, 2021 | Benchmarking, Imitation Learning | —Unverified | 0 |
| A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking | Feb 28, 2023 | Adversarial Robustness, Benchmarking | —Unverified | 0 |
| Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models | Feb 4, 2025 | Benchmarking, Decision Making | —Unverified | 0 |
| GenzIQA: Generalized Image Quality Assessment using Prompt-Guided Latent Diffusion Models | Jun 7, 2024 | Benchmarking, Denoising | —Unverified | 0 |
| GeoGebra Tools with Proof Capabilities | Mar 3, 2016 | Automated Theorem Proving, Benchmarking | —Unverified | 0 |