| Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Explanations? | Apr 29, 2024 | Answer Generation, Benchmarking | Code Available | 1 | 5 |
| Benchmarking Generated Poses: How Rational is Structure-based Drug Design with Generative Models? | Aug 14, 2023 | Benchmarking, Drug Design | Code Available | 1 | 5 |
| Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization | Nov 15, 2023 | Benchmarking, Instruction Following | Code Available | 1 | 5 |
| A multi-schematic classifier-independent oversampling approach for imbalanced datasets | Jul 15, 2021 | Benchmarking | Code Available | 1 | 5 |
| CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models | Nov 27, 2024 | Benchmarking, Earth Observation | Code Available | 1 | 5 |
| CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks | Feb 4, 2023 | Adversarial Attack, Adversarial Robustness | Code Available | 1 | 5 |
| Benchmarking for Biomedical Natural Language Processing Tasks with a Domain Specific ALBERT | Jul 9, 2021 | Benchmarking, Document Classification | Code Available | 1 | 5 |
| AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling | Nov 1, 2021 | Benchmarking, object-detection | Code Available | 1 | 5 |
| A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models | Aug 2, 2022 | Benchmarking, Synthetic Data Generation | Code Available | 1 | 5 |
| M4-SAR: A Multi-Resolution, Multi-Polarization, Multi-Scene, Multi-Source Dataset and Benchmark for Optical-SAR Fusion Object Detection | May 16, 2025 | Benchmarking, object-detection | Code Available | 1 | 5 |