| Fast Benchmarking of Asynchronous Multi-Fidelity Optimization on Zero-Cost Benchmarks | Mar 4, 2024 | Benchmarking | CodeCode Available | 0 |
| a-DCF: an architecture agnostic metric with application to spoofing-robust speaker verification | Mar 3, 2024 | BenchmarkingSpeaker Verification | CodeCode Available | 0 |
| A Bayesian Committee Machine Potential for Oxygen-containing Organic Compounds | Mar 2, 2024 | BenchmarkingPosition | —Unverified | 0 |
| Benchmarking Segmentation Models with Mask-Preserved Attribute Editing | Mar 2, 2024 | AttributeBenchmarking | CodeCode Available | 1 |
| SINDy vs Hard Nonlinearities and Hidden Dynamics: a Benchmarking Study | Mar 1, 2024 | Benchmarking | —Unverified | 0 |
| Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms | Mar 1, 2024 | BenchmarkingStochastic Optimization | —Unverified | 0 |
| Benchmarking zero-shot stance detection with FlanT5-XXL: Insights from training data, prompting, and decoding strategies into its near-SoTA performance | Mar 1, 2024 | BenchmarkingStance Detection | —Unverified | 0 |
| Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models | Mar 1, 2024 | BenchmarkingMathematical Reasoning | —Unverified | 0 |
| TRUCE: Private Benchmarking to Prevent Contamination and Improve Comparative Evaluation of LLMs | Mar 1, 2024 | Benchmarking | CodeCode Available | 1 |
| Imitation Learning Datasets: A Toolkit For Creating Datasets, Training Agents and Benchmarking | Mar 1, 2024 | BenchmarkingImitation Learning | —Unverified | 0 |