| SydneyScapes: Image Segmentation for Australian Environments | Apr 10, 2025 | Autonomous Vehicles, Benchmarking | Unverified | 0 |
| NorEval: A Norwegian Language Understanding and Generation Evaluation Benchmark | Apr 10, 2025 | Benchmarking | Code Available | 0 |
| Benchmarking Multi-Organ Segmentation Tools for Multi-Parametric T1-weighted Abdominal MRI | Apr 10, 2025 | Benchmarking, Organ Segmentation | Unverified | 0 |
| Benchmarking Adversarial Robustness to Bias Elicitation in Large Language Models: Scalable Automated Assessment with LLM-as-a-Judge | Apr 10, 2025 | Adversarial Robustness, Benchmarking | Code Available | 0 |
| Benchmarking Image Embeddings for E-Commerce: Evaluating Off-the Shelf Foundation Models, Fine-Tuning Strategies and Practical Trade-offs | Apr 10, 2025 | Benchmarking, Contrastive Learning | Unverified | 0 |
| Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program | Apr 9, 2025 | Benchmarking | Code Available | 0 |
| TabKAN: Advancing Tabular Data Analysis using Kolmogorov-Arnold Network | Apr 9, 2025 | Benchmarking, Deep Learning | Unverified | 0 |
| Evolutionary Generation of Random Surreal Numbers for Benchmarking | Apr 9, 2025 | Benchmarking | Code Available | 1 |
| A Roadmap for Improving Data Reliability and Sharing in Crosslinking Mass Spectrometry | Apr 9, 2025 | Benchmarking | Unverified | 0 |
| RayFronts: Open-Set Semantic Ray Frontiers for Online Scene Understanding and Exploration | Apr 9, 2025 | 3D Semantic Segmentation, Benchmarking | Unverified | 0 |