| From Code to Play: Benchmarking Program Search for Games Using Large Language Models | Dec 5, 2024 | Atari Games, Benchmarking | —Unverified | 0 |
| Asynchronous Batch Bayesian Optimization with Pipelining Evaluations for Experimental Resource-constrained Conditions | Dec 5, 2024 | Bayesian Optimization, Benchmarking | Code Available | 0 |
| Uniform Discretized Integrated Gradients: An effective attribution based method for explaining large language models | Dec 5, 2024 | Benchmarking, Feature Importance | —Unverified | 0 |
| ARTeFACT: Benchmarking Segmentation Models on Diverse Analogue Media Damage | Dec 5, 2024 | Benchmarking | —Unverified | 0 |
| AdvDreamer Unveils: Are Vision-Language Models Truly Ready for Real-World 3D Variations? | Dec 4, 2024 | Benchmarking, Visual Question Answering (VQA) | —Unverified | 0 |
| Benchmarking Attention Mechanisms and Consistency Regularization Semi-Supervised Learning for Post-Flood Building Damage Assessment in Satellite Images | Dec 4, 2024 | Benchmarking, Building Damage Assessment | —Unverified | 0 |
| Benchmarking terminology building capabilities of ChatGPT on an English-Russian Fashion Corpus | Dec 4, 2024 | Benchmarking | —Unverified | 0 |
| Benchmarking Pretrained Attention-based Models for Real-Time Recognition in Robot-Assisted Esophagectomy | Dec 4, 2024 | Anatomy, Benchmarking | —Unverified | 0 |
| Benchmarking Harmonized Tariff Schedule Classification Models | Dec 4, 2024 | Benchmarking, Classification | —Unverified | 0 |
| OODFace: Benchmarking Robustness of Face Recognition under Common Corruptions and Appearance Variations | Dec 3, 2024 | Benchmarking, Face Recognition | —Unverified | 0 |