| Benchmarking Monocular 3D Dog Pose Estimation Using In-The-Wild Motion Capture Data | Jun 20, 2024 | Animal Pose Estimation, Benchmarking | —Unverified | 0 | 0 |
| TOTOPO: Classifying univariate and multivariate time series with Topological Data Analysis | Oct 10, 2020 | Benchmarking, Time Series | —Unverified | 0 | 0 |
| LMFormer: Lane based Motion Prediction Transformer | Apr 14, 2025 | Autonomous Driving, Benchmarking | —Unverified | 0 | 0 |
| Benchmarking Modern Named Entity Recognition Techniques for Free-text Health Record De-identification | Mar 25, 2021 | Benchmarking, Decoder | —Unverified | 0 | 0 |
| LMME3DHF: Benchmarking and Evaluating Multimodal 3D Human Face Generation with LMMs | Apr 29, 2025 | Benchmarking, Face Generation | —Unverified | 0 | 0 |
| LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models | Jul 17, 2024 | Benchmarking, Language Modelling | —Unverified | 0 | 0 |
| Load-independent Metrics for Benchmarking Force Controllers | May 13, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Benchmarking Mobile Device Control Agents across Diverse Configurations | Apr 25, 2024 | Benchmarking, Imitation Learning | —Unverified | 0 | 0 |
| Local Data Quantity-Aware Weighted Averaging for Federated Learning with Dishonest Clients | Apr 17, 2025 | Benchmarking, Federated Learning | —Unverified | 0 | 0 |
| XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis | Jun 26, 2024 | Autonomous Driving, Benchmarking | —Unverified | 0 | 0 |
| Ensuring Reliability of Curated EHR-Derived Data: The Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) Framework | Jun 9, 2025 | Benchmarking, Fairness | —Unverified | 0 | 0 |
| Benchmarking Middle-Trained Language Models for Neural Search | Jun 5, 2023 | Benchmarking, Language Modeling | —Unverified | 0 | 0 |
| Logically at Factify 2: A Multi-Modal Fact Checking System Based on Evidence Retrieval techniques and Transformer Encoder Architecture | Jan 9, 2023 | Avg, Benchmarking | —Unverified | 0 | 0 |
| Logically at Factify 2022: Multimodal Fact Verification | Dec 16, 2021 | Benchmarking, Fact Checking | —Unverified | 0 | 0 |
| Toward an ImageNet Library of Functions for Global Optimization Benchmarking | Jun 27, 2022 | Benchmarking, global-optimization | —Unverified | 0 | 0 |
| Benchmarking Meta-heuristic Optimization | Jul 27, 2020 | Benchmarking, Evolutionary Algorithms | —Unverified | 0 | 0 |
| Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models | Jun 25, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Toward end-to-end interpretable convolutional neural networks for waveform signals | May 3, 2024 | Benchmarking, Emotion Recognition | —Unverified | 0 | 0 |
| Benchmarking MedMNIST dataset on real quantum hardware | Feb 18, 2025 | Benchmarking, image-classification | —Unverified | 0 | 0 |
| Benchmarking Machine Translated Sentiment Analysis for Arabic Tweets | Jun 1, 2015 | Benchmarking, Machine Translation | —Unverified | 0 | 0 |
| Benchmarking Continuous Time Models for Predicting Multiple Sclerosis Progression | Feb 15, 2023 | Benchmarking | —Unverified | 0 | 0 |
| Benchmarking Machine Learning Robustness in Covid-19 Spike Sequence Classification | Sep 29, 2021 | Benchmarking, BIG-bench Machine Learning | —Unverified | 0 | 0 |
| Benchmarking Machine Learning Models to Predict Corporate Bankruptcy | Dec 22, 2022 | Benchmarking | —Unverified | 0 | 0 |
| LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation | Jan 9, 2025 | 2k, 8k | —Unverified | 0 | 0 |
| Long Range Arena: A Benchmark for Efficient Transformers | Jan 1, 2021 | 16k, Benchmarking | —Unverified | 0 | 0 |
| Benchmarking machine learning models for predicting aerofoil performance | Apr 22, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Benchmarking Machine Learning Models for Quantum Error Correction | Nov 18, 2023 | Benchmarking | —Unverified | 0 | 0 |
| Ad-hoc Concept Forming in the Game Codenames as a Means for Evaluating Large Language Models | Feb 17, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage | Dec 20, 2024 | Attribute, Benchmarking | —Unverified | 0 | 0 |
| Look, Read and Feel: Benchmarking Ads Understanding with Multimodal Multitask Learning | Dec 21, 2019 | Benchmarking, Prediction | —Unverified | 0 | 0 |
| WelQrate: Defining the Gold Standard in Small Molecule Drug Discovery Benchmarking | Nov 14, 2024 | Benchmarking, Drug Discovery | —Unverified | 0 | 0 |
| LOOPE: Learnable Optimal Patch Order in Positional Embeddings for Vision Transformers | Apr 19, 2025 | Benchmarking, Diagnostic | —Unverified | 0 | 0 |
| Benchmarking machine learning models for quantum state classification | Sep 14, 2023 | Benchmarking, Classification | —Unverified | 0 | 0 |
| Towards a Benchmark for Scientific Understanding in Humans and Machines | Apr 20, 2023 | Benchmarking, Information Retrieval | —Unverified | 0 | 0 |
| Benchmarking Machine Learning Methods for Distributed Acoustic Sensing | Mar 26, 2025 | Benchmarking, Data Augmentation | —Unverified | 0 | 0 |
| Benchmarking Machine Learning: How Fast Can Your Algorithms Go? | Jan 8, 2021 | Benchmarking, BIG-bench Machine Learning | —Unverified | 0 | 0 |
| Optimizing with Low Budgets: a Comparison on the Black-box Optimization Benchmarking Suite and OpenAI Gym | Sep 29, 2023 | Bayesian Optimization, Benchmarking | —Unverified | 0 | 0 |
| GradEscape: A Gradient-Based Evader Against AI-Generated Text Detectors | Jun 9, 2025 | Benchmarking, Model extraction | —Unverified | 0 | 0 |
| Low-Density 3D Point Cloud Classification | Oct 30, 2024 | 3D Point Cloud Classification, Autonomous Driving | —Unverified | 0 | 0 |
| Low Dynamic Range for RIS-aided Bistatic Integrated Sensing and Communication | Nov 9, 2024 | Benchmarking, Integrated sensing and communication | —Unverified | 0 | 0 |
| Low-resource Neural Machine Translation: Benchmarking State-of-the-art Transformer for Wolof<->French | Jun 1, 2022 | Benchmarking, Low Resource Neural Machine Translation | —Unverified | 0 | 0 |
| LSTM-based Whisper Detection | Sep 20, 2018 | Benchmarking | —Unverified | 0 | 0 |
| Benchmarking M6 Competitors: An Analysis of Financial Metrics and Discussion of Incentives | Jun 27, 2024 | Benchmarking | —Unverified | 0 | 0 |
| LucidDreaming: Controllable Object-Centric 3D Generation | Nov 30, 2023 | 3D Generation, Benchmarking | —Unverified | 0 | 0 |
| Benchmarking LLMs on the Semantic Overlap Summarization Task | Feb 26, 2024 | Benchmarking, Document Summarization | —Unverified | 0 | 0 |
| LUND-PROBE -- LUND Prostate Radiotherapy Open Benchmarking and Evaluation dataset | Feb 6, 2025 | Benchmarking, Computed Tomography (CT) | —Unverified | 0 | 0 |
| Benchmarking LLMs in Recommendation Tasks: A Comparative Evaluation with Conventional Recommenders | Mar 7, 2025 | Benchmarking, Click-Through Rate Prediction | —Unverified | 0 | 0 |
| Towards a Human-Centred Cognitive Model of Visuospatial Complexity in Everyday Driving | May 29, 2020 | Benchmarking | —Unverified | 0 | 0 |
| Benchmarking LLMs in Political Content Text-Annotation: Proof-of-Concept with Toxicity and Incivility Data | Sep 15, 2024 | Benchmarking, text annotation | —Unverified | 0 | 0 |
| M3Bench: Benchmarking Whole-body Motion Generation for Mobile Manipulation in 3D Scenes | Oct 9, 2024 | Benchmarking, Motion Generation | —Unverified | 0 | 0 |