| Multilingual European Language Models: Benchmarking Approaches and Challenges | Feb 18, 2025 | Benchmarking, Question Answering | Unverified | 0 |
| STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models | Feb 18, 2025 | Benchmarking, Large Language Model | Unverified | 0 |
| A deep learning framework for efficient pathology image analysis | Feb 18, 2025 | Benchmarking, Deep Learning | Code Available | 4 |
| Benchmarking Automatic Speech Recognition coupled LLM Modules for Medical Diagnostics | Feb 18, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Text2World: Benchmarking Large Language Models for Symbolic World Model Generation | Feb 18, 2025 | Benchmarking | Unverified | 0 |
| LLMPopcorn: An Empirical Study of LLMs as Assistants for Popular Micro-video Generation | Feb 18, 2025 | Benchmarking, Text Generation | Unverified | 0 |
| Reinforcement Learning for Dynamic Resource Allocation in Optical Networks: Hype or Hope? | Feb 18, 2025 | Benchmarking, Blocking | Code Available | 1 |
| Benchmarking Post-Training Quantization in LLMs: Comprehensive Taxonomy, Unified Evaluation, and Comparative Analysis | Feb 18, 2025 | Benchmarking, Mamba | Code Available | 0 |
| EquiBench: Benchmarking Large Language Models' Understanding of Program Semantics via Equivalence Checking | Feb 18, 2025 | Benchmarking, Binary Classification | Unverified | 0 |
| Benchmarking MedMNIST dataset on real quantum hardware | Feb 18, 2025 | Benchmarking, image-classification | Unverified | 0 |