| Share, Collaborate, Benchmark: Advancing Travel Demand Research through rigorous open-source collaboration | Jun 9, 2023 | Benchmarking, Time Series | Unverified | 0 |
| Reference Matters: Benchmarking Factual Error Correction for Dialogue Summarization with Fine-grained Evaluation Framework | Jun 8, 2023 | Benchmarking | Code Available | 0 |
| FedSecurity: Benchmarking Attacks and Defenses in Federated Learning and Federated LLMs | Jun 8, 2023 | Benchmarking, Federated Learning | Code Available | 0 |
| DynamoRep: Trajectory-Based Population Dynamics for Classification of Black-box Optimization Problems | Jun 8, 2023 | Benchmarking, Descriptive | Code Available | 0 |
| FLEdge: Benchmarking Federated Machine Learning Applications in Edge Computing Systems | Jun 8, 2023 | Benchmarking, Edge-computing | Unverified | 0 |
| DLAMA: A Framework for Curating Culturally Diverse Facts for Probing the Knowledge of Pretrained Language Models | Jun 8, 2023 | Benchmarking, Fairness | Code Available | 0 |
| RD-Suite: A Benchmark for Ranking Distillation | Jun 7, 2023 | Benchmarking | Unverified | 0 |
| Self-Adjusting Weighted Expected Improvement for Bayesian Optimization | Jun 7, 2023 | Bayesian Optimization, Benchmarking | Code Available | 0 |
| Benchmarking Foundation Models with Language-Model-as-an-Examiner | Jun 7, 2023 | Benchmarking, Language Modeling | Unverified | 0 |
| ICON^2: Reliably Benchmarking Predictive Inequity in Object Detection | Jun 7, 2023 | Attribute, Autonomous Driving | Unverified | 0 |