| Is Single-View Mesh Reconstruction Ready for Robotics? | May 23, 2025 | 3D Reconstruction, Benchmarking | Unverified | 0 |
| Chart-to-Experience: Benchmarking Multimodal LLMs for Predicting Experiential Impact of Charts | May 23, 2025 | Benchmarking | Unverified | 0 |
| JALMBench: Benchmarking Jailbreak Vulnerabilities in Audio Language Models | May 23, 2025 | Benchmarking, Diversity | Code Available | 0 |
| Twin-2K-500: A dataset for building digital twins of over 2,000 people based on their answers to over 500 questions | May 23, 2025 | 2k, Benchmarking | Code Available | 1 |
| SEvoBench: A C++ Framework For Evolutionary Single-Objective Optimization Benchmarking | May 23, 2025 | Benchmarking, Computational Efficiency | Unverified | 0 |
| Wildfire spread forecasting with Deep Learning | May 23, 2025 | Benchmarking, Deep Learning | Code Available | 0 |
| Benchmarking Recommendation, Classification, and Tracing Based on Hugging Face Knowledge Graph | May 23, 2025 | Benchmarking, Management | Code Available | 1 |
| Semantic Correspondence: Unified Benchmarking and a Strong Baseline | May 23, 2025 | Benchmarking, Semantic correspondence | Code Available | 1 |
| DailyQA: A Benchmark to Evaluate Web Retrieval Augmented LLMs Based on Capturing Real-World Changes | May 22, 2025 | Benchmarking, RAG | Unverified | 0 |
| Learning collective multi-cellular dynamics from temporal scRNA-seq via a transformer-enhanced Neural SDE | May 22, 2025 | Benchmarking, Time Series | Code Available | 0 |