| A Synthetic Benchmarking Pipeline to Compare Camera Calibration Algorithms | Jul 3, 2023 | Benchmarking, Camera Calibration | Unverified | 0 |
| Conditionally Invariant Representation Learning for Disentangling Cellular Heterogeneity | Jul 2, 2023 | Benchmarking, Data Integration | Unverified | 0 |
| SysNoise: Exploring and Benchmarking Training-Deployment System Inconsistency | Jul 1, 2023 | Benchmarking, Data Augmentation | Unverified | 0 |
| InstructEval: Systematic Evaluation of Instruction Selection Methods | Jul 1, 2023 | Benchmarking, In-Context Learning | Unverified | 0 |
| Learning Environment Models with Continuous Stochastic Dynamics | Jun 29, 2023 | Acrobot, Benchmarking | Unverified | 0 |
| Benchmarking Large Language Model Capabilities for Conditional Generation | Jun 29, 2023 | Benchmarking, Few-Shot Learning | Unverified | 0 |
| Principles and Guidelines for Evaluating Social Robot Navigation Algorithms | Jun 29, 2023 | Benchmarking, Robot Navigation | Unverified | 0 |
| Generative AI for Programming Education: Benchmarking ChatGPT, GPT-4, and Human Tutors | Jun 29, 2023 | Benchmarking | Unverified | 0 |
| Uncovering the Limits of Machine Learning for Automatic Vulnerability Detection | Jun 28, 2023 | Benchmarking, Data Augmentation | Code Available | 1 |
| Benchmarking Zero-Shot Recognition with Vision-Language Models: Challenges on Granularity and Specificity | Jun 28, 2023 | Benchmarking, Image Captioning | Unverified | 0 |