| Yet Another ADNI Machine Learning Paper? Paving The Way Towards Fully-reproducible Research on Classification of Alzheimer's Disease | Sep 21, 2017 | Benchmarking, Classification | —Unverified | 0 | 0 |
| Understanding the Limits of Lifelong Knowledge Editing in LLMs | Mar 7, 2025 | Benchmarking, knowledge editing | —Unverified | 0 | 0 |
| Who Wins the Game of Thrones? How Sentiments Improve the Prediction of Candidate Choice | Feb 29, 2020 | Benchmarking, Holdout Set | —Unverified | 0 | 0 |
| Understanding the RoPE Extensions of Long-Context LLMs: An Attention Perspective | Jun 19, 2024 | Benchmarking, Continual Pretraining | —Unverified | 0 | 0 |
| Audio-Visual Class-Incremental Learning for Fish Feeding intensity Assessment in Aquaculture | Apr 21, 2025 | Benchmarking, class-incremental learning | —Unverified | 0 | 0 |
| A Two-Step Framework for Multi-Material Decomposition of Dual Energy Computed Tomography from Projection Domain | Oct 31, 2023 | Benchmarking, Diagnostic | —Unverified | 0 | 0 |
| R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models | Jun 3, 2024 | Benchmarking, Code Completion | —Unverified | 0 | 0 |
| R2H: Building Multimodal Navigation Helpers that Respond to Help Requests | May 23, 2023 | Benchmarking, Language Modeling | —Unverified | 0 | 0 |
| R2I-Bench: Benchmarking Reasoning-Driven Text-to-Image Generation | May 29, 2025 | Benchmarking, Image Generation | —Unverified | 0 | 0 |
| R3L: Connecting Deep Reinforcement Learning to Recurrent Neural Networks for Image Denoising via Residual Recovery | Jul 12, 2021 | Benchmarking, Deep Reinforcement Learning | —Unverified | 0 | 0 |
| A Two-Stage Neural-Filter Pareto Front Extractor and the need for Benchmarking | Sep 29, 2021 | Benchmarking, Multi-Task Learning | —Unverified | 0 | 0 |
| RadFusion: Benchmarking Performance and Fairness for Multimodal Pulmonary Embolism Detection from CT and EHR | Nov 23, 2021 | Benchmarking, Computed Tomography (CT) | —Unverified | 0 | 0 |
| A tutorial on multi-view autoencoders using the multi-view-AE library | Mar 12, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Understanding the User: An Intent-Based Ranking Dataset | Aug 30, 2024 | Benchmarking, Information Retrieval | —Unverified | 0 | 0 |
| RAGBench: Explainable Benchmark for Retrieval-Augmented Generation Systems | Jun 25, 2024 | Benchmarking, RAG | —Unverified | 0 | 0 |
| Attention versus Contrastive Learning of Tabular Data -- A Data-centric Benchmarking | Jan 8, 2024 | Benchmarking, Contrastive Learning | —Unverified | 0 | 0 |
| A Theory of Dynamic Benchmarks | Oct 6, 2022 | Benchmarking | —Unverified | 0 | 0 |
| RAG-Reward: Optimizing RAG with Reward Modeling and RLHF | Jan 22, 2025 | Benchmarking, Hallucination | —Unverified | 0 | 0 |
| Rail-5k: a Real-World Dataset for Rail Surface Defects Detection | Jun 28, 2021 | 4k, Benchmarking | —Unverified | 0 | 0 |
| On the Evaluation of Engineering Artificial General Intelligence | May 15, 2025 | Benchmarking | —Unverified | 0 | 0 |
| A Comparison of Deep Learning MOS Predictors for Speech Synthesis Quality | Apr 5, 2022 | Benchmarking, Self-Supervised Learning | —Unverified | 0 | 0 |
| RAN-GNNs: breaking the capacity limits of graph neural networks | Mar 29, 2021 | Attribute, Benchmarking | —Unverified | 0 | 0 |
| ATG: Benchmarking Automated Theorem Generation for Generative Language Models | May 5, 2024 | Automated Theorem Proving, Benchmarking | —Unverified | 0 | 0 |
| A Comparison of Cryptocurrency Volatility-benchmarking New and Mature Asset Classes | Apr 7, 2024 | Benchmarking | —Unverified | 0 | 0 |
| Atari-GPT: Benchmarking Multimodal Large Language Models as Low-Level Policies in Atari Games | Aug 28, 2024 | Atari Games, Benchmarking | —Unverified | 0 | 0 |