| A Comprehensive Overview of Large Language Models | Jul 12, 2023 | Benchmarking | Code Available | 1 | 5 |
| Benchmarking Simulation-Based Inference | Jan 12, 2021 | Benchmarking | Code Available | 1 | 5 |
| A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models | Aug 2, 2022 | Benchmarking, Synthetic Data Generation | Code Available | 1 | 5 |
| BeHonest: Benchmarking Honesty in Large Language Models | Jun 19, 2024 | Benchmarking, Misinformation | Code Available | 1 | 5 |
| Benchmarking Large Language Models on CMExam -- A Comprehensive Chinese Medical Exam Dataset | Jun 5, 2023 | Benchmarking, Multiple-choice | Code Available | 1 | 5 |
| AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling | Nov 1, 2021 | Benchmarking, object-detection | Code Available | 1 | 5 |
| Benchmarking Skeleton-based Motion Encoder Models for Clinical Applications: Estimating Parkinson's Disease Severity in Walking Sequences | May 28, 2024 | Benchmarking, Feature Engineering | Code Available | 1 | 5 |
| Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs | Sep 18, 2021 | Benchmarking, Complex Query Answering | Code Available | 1 | 5 |
| AirSim Drone Racing Lab | Mar 12, 2020 | Benchmarking, Optical Flow Estimation | Code Available | 1 | 5 |
| A SWAT-based Reinforcement Learning Framework for Crop Management | Feb 10, 2023 | Benchmarking, Decision Making | Code Available | 1 | 5 |