| Decisions and Performance Under Bounded Rationality: A Computational Benchmarking Approach | May 26, 2020 | Benchmarking, Decision Making | —Unverified | 0 | 0 |
| Transfer of Knowledge through Reverse Annealing: A Preliminary Analysis of the Benefits and What to Share | Jan 27, 2025 | Benchmarking, Transfer Learning | —Unverified | 0 | 0 |
| What Will it Take to Fix Benchmarking in Natural Language Understanding? | Apr 5, 2021 | Benchmarking, Natural Language Understanding | —Unverified | 0 | 0 |
| Transformed Subspace Clustering | Dec 10, 2019 | Benchmarking, Clustering | —Unverified | 0 | 0 |
| On the Evaluation of Speech Foundation Models for Spoken Language Understanding | Jun 14, 2024 | Benchmarking, Prediction | —Unverified | 0 | 0 |
| On the Evaluation of User Privacy in Deep Neural Networks using Timing Side Channel | Aug 1, 2022 | Benchmarking, image-classification | —Unverified | 0 | 0 |
| Transformers in Protein: A Survey | May 26, 2025 | Benchmarking, Drug Discovery | —Unverified | 0 | 0 |
| Benchmarking Answer Verification Methods for Question Answering-Based Summarization Evaluation Metrics | Apr 21, 2022 | Attribute, Benchmarking | —Unverified | 0 | 0 |
| On the Impact of Data Heterogeneity in Federated Learning Environments with Application to Healthcare Networks | Apr 29, 2024 | Benchmarking, Federated Learning | —Unverified | 0 | 0 |
| Broadening the Scope of Neural Network Potentials through Direct Inclusion of Additional Molecular Attributes | Mar 22, 2024 | Benchmarking | —Unverified | 0 | 0 |
| On the Interaction of Belief Bias and Explanations | Jun 29, 2021 | Benchmarking | —Unverified | 0 | 0 |
| Visual Anomaly Detection under Complex View-Illumination Interplay: A Large-Scale Benchmark | May 16, 2025 | Anomaly Detection, Benchmarking | —Unverified | 0 | 0 |
| On the Performance of Multimodal Language Models | Oct 4, 2023 | Benchmarking, Binary Classification | —Unverified | 0 | 0 |
| On the Potential of Large Language Models to Solve Semantics-Aware Process Mining Tasks | Apr 29, 2025 | Anomaly Detection, Benchmarking | —Unverified | 0 | 0 |
| On the project risk baseline: integrating aleatory uncertainty into project scheduling | May 31, 2024 | Benchmarking, Scheduling | —Unverified | 0 | 0 |
| On the Real-Time Semantic Segmentation of Aphid Clusters in the Wild | Jul 17, 2023 | Benchmarking, Real-Time Semantic Segmentation | —Unverified | 0 | 0 |
| On the reduction of Linear Parameter-Varying State-Space models | Apr 2, 2024 | Benchmarking, Dimensionality Reduction | —Unverified | 0 | 0 |
| On the relationship between Benchmarking, Standards and Certification in Robotics and AI | Sep 21, 2023 | Benchmarking | —Unverified | 0 | 0 |
| On the Reliability and Validity of Detecting Approval of Political Actors in Tweets | Nov 1, 2020 | Benchmarking, Sentiment Analysis | —Unverified | 0 | 0 |
| On the Robustness of Human-Object Interaction Detection against Distribution Shift | Jun 22, 2025 | Benchmarking, Data Augmentation | —Unverified | 0 | 0 |
| On the role of benchmarking data sets and simulations in method comparison studies | Aug 2, 2022 | Benchmarking | —Unverified | 0 | 0 |
| Optimizer Benchmarking Needs to Account for Hyperparameter Tuning | Oct 25, 2019 | Benchmarking | —Unverified | 0 | 0 |
| Transformers Utilization in Chart Understanding: A Review of Recent Advances & Future Trends | Oct 5, 2024 | Benchmarking, Chart Understanding | —Unverified | 0 | 0 |
| Transforming Game Play: A Comparative Study of DCQN and DTQN Architectures in Reinforcement Learning | Oct 14, 2024 | Atari Games, Benchmarking | —Unverified | 0 | 0 |
| Translation Canvas: An Explainable Interface to Pinpoint and Analyze Translation Systems | Oct 7, 2024 | Benchmarking, Machine Translation | —Unverified | 0 | 0 |
| On the Use of Quality Diversity Algorithms for The Traveling Thief Problem | Dec 16, 2021 | Benchmarking, Diversity | —Unverified | 0 | 0 |
| On the Utility of Equivariance and Symmetry Breaking in Deep Learning Architectures on Point Clouds | Jan 1, 2025 | Benchmarking | —Unverified | 0 | 0 |
| On the Value of ML Models | Dec 13, 2021 | Benchmarking | —Unverified | 0 | 0 |
| TransLaw: Benchmarking Large Language Models in Multi-Agent Simulation of the Collaborative Translation | Jul 1, 2025 | Benchmarking, Machine Translation | —Unverified | 0 | 0 |
| ACT-Bench: Towards Action Controllable World Models for Autonomous Driving | Dec 6, 2024 | Autonomous Driving, Benchmarking | —Unverified | 0 | 0 |
| OOD-CV-v2: An extended Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images | Apr 17, 2023 | 3D Pose Estimation, Benchmarking | —Unverified | 0 | 0 |
| OODFace: Benchmarking Robustness of Face Recognition under Common Corruptions and Appearance Variations | Dec 3, 2024 | Benchmarking, Face Recognition | —Unverified | 0 | 0 |
| Benchmarking Answer Verification Methods for Question Answering-Based Summarization Evaluation Metrics | Sep 17, 2021 | Attribute, Benchmarking | —Unverified | 0 | 0 |
| OOD-Speech: A Large Bengali Speech Recognition Dataset for Out-of-Distribution Benchmarking | May 15, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Benchmarking and Validation of Sub-mW 30GHz VG-LNAs in 22nm FDSOI CMOS for 5G/6G Phased-Array Receivers | Sep 11, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Benchmarking and Pushing the Multi-Bias Elimination Boundary of LLMs via Causal Effect Estimation-guided Debiasing | May 22, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Benchmarking and Performance Modelling of MapReduce Communication Pattern | May 23, 2020 | Benchmarking | —Unverified | 0 | 0 |
| TransOpt: Transformer-based Representation Learning for Optimization Problem Classification | Nov 29, 2023 | Benchmarking, Classification | —Unverified | 0 | 0 |
| Benchmarking and Optimization of Gradient Boosting Decision Tree Algorithms | Sep 12, 2018 | Bayesian Optimization, Benchmarking | —Unverified | 0 | 0 |
| Open-CD: A Comprehensive Toolbox for Change Detection | Jul 22, 2024 | Benchmarking, Change Detection | —Unverified | 0 | 0 |
| Benchmarking and Learning Multi-Dimensional Quality Evaluator for Text-to-3D Generation | Dec 15, 2024 | 3D Generation, Benchmarking | —Unverified | 0 | 0 |
| OpenContrails: Benchmarking Contrail Detection on GOES-16 ABI | Apr 4, 2023 | Benchmarking | —Unverified | 0 | 0 |
| Open Datasets for Satellite Radio Resource Control | Apr 22, 2024 | Benchmarking, Decision Making | —Unverified | 0 | 0 |
| Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors | Sep 29, 2023 | Benchmarking, Computational Efficiency | —Unverified | 0 | 0 |
| OpenDeception: Benchmarking and Investigating AI Deceptive Behaviors via Open-ended Interaction Simulation | Apr 18, 2025 | Benchmarking | —Unverified | 0 | 0 |
| TransportationGames: Benchmarking Transportation Knowledge of (Multimodal) Large Language Models | Jan 9, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Relation Extraction Across Entire Books to Reconstruct Community Networks: The AffilKG Datasets | May 16, 2025 | Benchmarking, Knowledge Graphs | —Unverified | 0 | 0 |
| OpenDPD: An Open-Source End-to-End Learning & Benchmarking Framework for Wideband Power Amplifier Modeling and Digital Pre-Distortion | Jan 16, 2024 | Benchmarking | —Unverified | 0 | 0 |
| OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety | Mar 18, 2024 | Benchmarking, Mathematical Reasoning | —Unverified | 0 | 0 |
| Benchmarking and Improving Generator-Validator Consistency of Language Models | Oct 3, 2023 | Benchmarking, Instruction Following | —Unverified | 0 | 0 |