| Advancing Histopathology with Deep Learning Under Data Scarcity: A Decade in Review | Oct 18, 2024 | Benchmarking, Deep Learning | Unverified | 0 |
| LabSafety Bench: Benchmarking LLMs on Safety Issues in Scientific Labs | Oct 18, 2024 | Benchmarking, Fairness | Unverified | 0 |
| Benchmarking Deep Reinforcement Learning for Navigation in Denied Sensor Environments | Oct 18, 2024 | Autonomous Navigation, Benchmarking | Code Available | 1 |
| MultiChartQA: Benchmarking Vision-Language Models on Multi-Chart Problems | Oct 18, 2024 | Benchmarking, Question Answering | Code Available | 1 |
| Benchmarking Transcriptomics Foundation Models for Perturbation Analysis : one PCA still rules them all | Oct 17, 2024 | All, Benchmarking | Code Available | 1 |
| UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models | Oct 17, 2024 | Benchmarking | Code Available | 0 |
| Sum Secrecy Rate Maximization for Full Duplex ISAC Systems | Oct 17, 2024 | Benchmarking, Integrated sensing and communication | Unverified | 0 |
| Ab Initio Nonparametric Variable Selection for Scalable Symbolic Regression with Large p | Oct 17, 2024 | Benchmarking, regression | Code Available | 0 |
| debiaSAE: Benchmarking and Mitigating Vision-Language Model Bias | Oct 17, 2024 | Benchmarking, Bias Detection | Code Available | 0 |
| Trust but Verify: Programmatic VLM Evaluation in the Wild | Oct 17, 2024 | Benchmarking, Language Modelling | Unverified | 0 |