| Evaluating Robustness of Visual Representations for Object Assembly Task Requiring Spatio-Geometrical Reasoning | Oct 15, 2023 | Benchmarking, Spatial Reasoning | Unverified | 0 |
| Randomized Benchmarking of Local Zeroth-Order Optimizers for Variational Quantum Systems | Oct 14, 2023 | Benchmarking | Code Available | 0 |
| Benchmarking the Sim-to-Real Gap in Cloth Manipulation | Oct 14, 2023 | Benchmarking, MuJoCo | Unverified | 0 |
| Mirage: Model-Agnostic Graph Distillation for Graph Classification | Oct 14, 2023 | Benchmarking, Classification | Code Available | 0 |
| "Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters | Oct 13, 2023 | Benchmarking, Fairness | Code Available | 1 |
| pose-format: Library for Viewing, Augmenting, and Handling .pose Files | Oct 13, 2023 | Benchmarking, Management | Code Available | 1 |
| BanglaNLP at BLP-2023 Task 2: Benchmarking different Transformer Models for Sentiment Analysis of Bangla Social Media Posts | Oct 13, 2023 | Benchmarking, Sentiment Analysis | Code Available | 0 |
| Welfare Diplomacy: Benchmarking Language Model Cooperation | Oct 13, 2023 | Benchmarking, Language Modeling | Code Available | 1 |
| MetaBox: A Benchmark Platform for Meta-Black-Box Optimization with Reinforcement Learning | Oct 12, 2023 | Benchmarking | Code Available | 1 |
| GeSS: Benchmarking Geometric Deep Learning under Scientific Applications with Distribution Shifts | Oct 12, 2023 | Benchmarking | Code Available | 1 |