| A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management | Nov 29, 2017 | Benchmarking, Deep Reinforcement Learning | —Unverified | 0 |
| Breakpoint: Scalable evaluation of system-level reasoning in LLM code agents | May 30, 2025 | Benchmarking, Code Repair | —Unverified | 0 |
| A new pathway to generative artificial intelligence by minimizing the maximum entropy | Feb 18, 2025 | Benchmarking | —Unverified | 0 |
| Exploring the Impact of a Transformer's Latent Space Geometry on Downstream Task Performance | Jun 18, 2024 | Benchmarking | —Unverified | 0 |
| BraTS-Path Challenge: Assessing Heterogeneous Histopathologic Brain Tumor Sub-regions | May 17, 2024 | Benchmarking, Prognosis | —Unverified | 0 |
| Adaptive Gradient Methods with Local Guarantees | Mar 2, 2022 | Benchmarking | —Unverified | 0 |
| Object Pose Estimation in Robotics Revisited | Jun 6, 2019 | 3D Pose Estimation, 6D Pose Estimation | —Unverified | 0 |
| BOX3D: Lightweight Camera-LiDAR Fusion for 3D Object Detection and Localization | Aug 27, 2024 | 3D Object Detection, Benchmarking | —Unverified | 0 |
| Scale MLPerf-0.6 models on Google TPU-v3 Pods | Sep 21, 2019 | Benchmarking | —Unverified | 0 |
| Boundary Detection Benchmarking: Beyond F-Measures | Jun 1, 2013 | Benchmarking, Boundary Detection | —Unverified | 0 |