| Label-Efficient Point Cloud Semantic Segmentation: An Active Learning Approach | Jan 18, 2021 | Active Learning, Benchmarking | Unverified | 0 | 0 |
| Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models | Dec 6, 2024 | Benchmarking, Dialogue Understanding | Unverified | 0 | 0 |
| AI Cyber Risk Benchmark: Automated Exploitation Capabilities | Oct 29, 2024 | Benchmarking, Vulnerability Detection | Unverified | 0 | 0 |
| λ: A Benchmark for Data-Efficiency in Long-Horizon Indoor Mobile Manipulation Robotics | Nov 28, 2024 | Benchmarking, Diversity | Unverified | 0 | 0 |
| LabSafety Bench: Benchmarking LLMs on Safety Issues in Scientific Labs | Oct 18, 2024 | Benchmarking, Fairness | Unverified | 0 | 0 |
| Time and Tokens: Benchmarking End-to-End Speech Dysfluency Detection | Sep 20, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 | 0 |
| LAG-MMLU: Benchmarking Frontier LLM Understanding in Latvian and Giriama | Mar 14, 2025 | Benchmarking, MMLU | Unverified | 0 | 0 |
| Benchmarking Online Sequence-to-Sequence and Character-based Handwriting Recognition from IMU-Enhanced Pens | Feb 14, 2022 | Benchmarking, Handwriting Recognition | Unverified | 0 | 0 |
| Time Awareness in Large Language Models: Benchmarking Fact Recall Across Time | Sep 20, 2024 | Benchmarking, World Knowledge | Unverified | 0 | 0 |
| Benchmarking Online Object Trackers for Underwater Robot Position Locking Applications | Feb 23, 2025 | Benchmarking, Object Tracking | Unverified | 0 | 0 |