| LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models | Nov 1, 2024 | Benchmarking, Mixture-of-Experts | Code Available | 1 |
| LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators | Oct 31, 2024 | Benchmarking, Text Generation | Code Available | 2 |
| LLM4Mat-Bench: Benchmarking Large Language Models for Materials Property Prediction | Oct 31, 2024 | Benchmarking, Prediction | Code Available | 1 |
| IdeaBench: Benchmarking Large Language Models for Research Idea Generation | Oct 31, 2024 | Benchmarking, scientific discovery | Code Available | 0 |
| Pedestrian Trajectory Prediction with Missing Data: Datasets, Imputation, and Benchmarking | Oct 31, 2024 | Benchmarking, Imputation | Code Available | 1 |
| Benchmark Data Repositories for Better Benchmarking | Oct 31, 2024 | Benchmarking | Unverified | 0 |
| XRDSLAM: A Flexible and Modular Framework for Deep Learning based SLAM | Oct 31, 2024 | 3DGS, Benchmarking | Code Available | 3 |
| EMGBench: Benchmarking Out-of-Distribution Generalization and Adaptation for Electromyography | Oct 31, 2024 | Benchmarking, Electromyography (EMG) | Code Available | 1 |
| AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents | Oct 31, 2024 | Benchmarking | Code Available | 3 |
| AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery | Oct 31, 2024 | Benchmarking, Cloud Removal | Code Available | 1 |