| QGym: Scalable Simulation and Benchmarking of Queuing Network Controllers | Oct 8, 2024 | Benchmarking | Code Available | 0 |
| Named Clinical Entity Recognition Benchmark | Oct 7, 2024 | Benchmarking, Decoder | Code Available | 0 |
| Precise Model Benchmarking with Only a Few Observations | Oct 7, 2024 | Benchmarking, model | Unverified | 0 |
| Rule-based Data Selection for Large Language Models | Oct 7, 2024 | Benchmarking, Math | Unverified | 0 |
| TuneVLSeg: Prompt Tuning Benchmark for Vision-Language Segmentation Models | Oct 7, 2024 | Benchmarking, Segmentation | Code Available | 0 |
| Translation Canvas: An Explainable Interface to Pinpoint and Analyze Translation Systems | Oct 7, 2024 | Benchmarking, Machine Translation | Unverified | 0 |
| Adjusting Pretrained Backbones for Performativity | Oct 6, 2024 | Benchmarking, Deep Learning | Code Available | 0 |
| ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection | Oct 6, 2024 | Benchmarking, Mathematical Reasoning | Unverified | 0 |
| Implicit to Explicit Entropy Regularization: Benchmarking ViT Fine-tuning under Noisy Labels | Oct 5, 2024 | Benchmarking | Unverified | 0 |
| Transformers Utilization in Chart Understanding: A Review of Recent Advances & Future Trends | Oct 5, 2024 | Benchmarking, Chart Understanding | Unverified | 0 |