| AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery | Oct 31, 2024 | Benchmarking, Cloud Removal | Code Available | 1 |
| CALE: Continuous Arcade Learning Environment | Oct 31, 2024 | Atari Games, Benchmarking | Code Available | 7 |
| Low-Density 3D Point Cloud Classification | Oct 30, 2024 | 3D Point Cloud Classification, Autonomous Driving | Unverified | 0 |
| Survey of Cultural Awareness in Language Models: Text and Beyond | Oct 30, 2024 | Benchmarking | Code Available | 1 |
| NCAdapt: Dynamic adaptation with domain-specific Neural Cellular Automata for continual hippocampus segmentation | Oct 30, 2024 | Benchmarking, Continual Learning | Code Available | 0 |
| VisAidMath: Benchmarking Visual-Aided Mathematical Reasoning | Oct 30, 2024 | Benchmarking, Hallucination | Unverified | 0 |
| DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes | Oct 30, 2024 | Benchmarking | Unverified | 0 |
| InjecGuard: Benchmarking and Mitigating Over-defense in Prompt Injection Guardrail Models | Oct 30, 2024 | Benchmarking | Code Available | 2 |
| CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation | Oct 30, 2024 | Benchmarking, Passage Retrieval | Code Available | 2 |
| Evaluating Cultural and Social Awareness of LLM Web Agents | Oct 30, 2024 | Benchmarking, Navigate | Unverified | 0 |