| Depth-Driven Geometric Prompt Learning for Laparoscopic Liver Landmark Detection | Jun 25, 2024 | BenchmarkingPrompt Learning | CodeCode Available | 1 | 5 |
| AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery | Oct 31, 2024 | BenchmarkingCloud Removal | CodeCode Available | 1 | 5 |
| Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers | Jul 3, 2020 | BenchmarkingDeep Learning | CodeCode Available | 1 | 5 |
| A Ladder of Causal Distances | May 5, 2020 | BenchmarkingCausal Discovery | CodeCode Available | 1 | 5 |
| Automatic sleep stage classification with deep residual networks in a mixed-cohort setting | Aug 21, 2020 | Automatic Sleep Stage ClassificationBenchmarking | CodeCode Available | 1 | 5 |
| Benchmarking Multimodal Variational Autoencoders: CdSprites+ Dataset and Toolkit | Sep 7, 2022 | Benchmarking | CodeCode Available | 1 | 5 |
| Benchmarking Neural Network Robustness to Common Corruptions and Surface Variations | Jul 4, 2018 | Adversarial DefenseBenchmarking | CodeCode Available | 1 | 5 |
| Benchmarking Retrieval-Augmented Multimomal Generation for Document Question Answering | May 22, 2025 | BenchmarkingEvidence Selection | CodeCode Available | 1 | 5 |
| ATOMMIC: An Advanced Toolbox for Multitask Medical Imaging Consistency to facilitate Artificial Intelligence applications from acquisition to analysis in Magnetic Resonance Imaging | Apr 30, 2024 | BenchmarkingImage Reconstruction | CodeCode Available | 1 | 5 |
| Autonomous Microscopy Experiments through Large Language Model Agents | Dec 18, 2024 | BenchmarkingExperimental Design | CodeCode Available | 1 | 5 |
| Atom-Level Optical Chemical Structure Recognition with Limited Supervision | Apr 2, 2024 | Benchmarking | CodeCode Available | 1 | 5 |
| Demystifying Learning Rate Policies for High Accuracy Training of Deep Neural Networks | Aug 18, 2019 | BenchmarkingImage Classification | CodeCode Available | 1 | 5 |
| Benchmarking Object Detectors under Real-World Distribution Shifts in Satellite Imagery | Mar 24, 2025 | BenchmarkingHumanitarian | CodeCode Available | 1 | 5 |
| Descending through a Crowded Valley — Benchmarking Deep Learning Optimizers | Jan 1, 2021 | BenchmarkingDeep Learning | CodeCode Available | 1 | 5 |
| A Critical Assessment of State-of-the-Art in Entity Alignment | Oct 30, 2020 | BenchmarkingEntity Alignment | CodeCode Available | 1 | 5 |
| Benchmarking Offline Reinforcement Learning on Real-Robot Hardware | Jul 28, 2023 | Benchmarkingreinforcement-learning | CodeCode Available | 1 | 5 |
| DFGC 2021: A DeepFake Game Competition | Jun 2, 2021 | BenchmarkingDeepFake Detection | CodeCode Available | 1 | 5 |
| DeID-GPT: Zero-shot Medical Text De-Identification by GPT-4 | Mar 20, 2023 | BenchmarkingDe-identification | CodeCode Available | 1 | 5 |
| Benchmarking Language Models for Code Syntax Understanding | Oct 26, 2022 | Benchmarking | CodeCode Available | 1 | 5 |
| Deluca -- A Differentiable Control Library: Environments, Methods, and Benchmarking | Feb 19, 2021 | BenchmarkingOpenAI Gym | CodeCode Available | 1 | 5 |
| BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models | Jun 2, 2023 | BenchmarkingLanguage Acquisition | CodeCode Available | 1 | 5 |
| Benchmarking: Past, Present and Future | Aug 1, 2021 | BenchmarkingReading Comprehension | CodeCode Available | 1 | 5 |
| Benchmarking Large Language Models for Automated Verilog RTL Code Generation | Dec 13, 2022 | BenchmarkingCode Generation | CodeCode Available | 1 | 5 |
| Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method | May 22, 2023 | BenchmarkingHallucination | CodeCode Available | 1 | 5 |
| A Japanese Dataset for Subjective and Objective Sentiment Polarity Classification in Micro Blog Domain | Jun 1, 2022 | BenchmarkingEmotion Recognition | CodeCode Available | 1 | 5 |