| Datasets and Benchmarks for Offline Safe Reinforcement Learning | Jun 15, 2023 | Autonomous Driving, Benchmarking | Code Available | 2 |
| MUBen: Benchmarking the Uncertainty of Molecular Representation Models | Jun 14, 2023 | Benchmarking, Drug Discovery | Code Available | 0 |
| RRSIS: Referring Remote Sensing Image Segmentation | Jun 14, 2023 | Benchmarking, Image Segmentation | Unverified | 0 |
| A Cloud-based Machine Learning Pipeline for the Efficient Extraction of Insights from Customer Reviews | Jun 13, 2023 | Benchmarking, Keyword Extraction | Unverified | 0 |
| detrex: Benchmarking Detection Transformers | Jun 12, 2023 | Benchmarking, object-detection | Unverified | 0 |
| Benchmarking Neural Network Training Algorithms | Jun 12, 2023 | Benchmarking | Code Available | 4 |
| Contribution à l'Optimisation d'un Comportement Collectif pour un Groupe de Robots Autonomes | Jun 10, 2023 | Benchmarking, Diversity | Unverified | 0 |
| Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception | Jun 10, 2023 | 3D Object Detection, Benchmarking | Code Available | 2 |
| NeuroGraph: Benchmarks for Graph Machine Learning in Brain Connectomics | Jun 9, 2023 | Benchmarking, Dataset Generation | Code Available | 1 |
| Share, Collaborate, Benchmark: Advancing Travel Demand Research through rigorous open-source collaboration | Jun 9, 2023 | Benchmarking, Time Series | Unverified | 0 |
| A Large-Scale Analysis on Self-Supervised Video Representation Learning | Jun 9, 2023 | Benchmarking, Representation Learning | Unverified | 0 |
| DynamoRep: Trajectory-Based Population Dynamics for Classification of Black-box Optimization Problems | Jun 8, 2023 | Benchmarking, Descriptive | Code Available | 0 |
| FedSecurity: Benchmarking Attacks and Defenses in Federated Learning and Federated LLMs | Jun 8, 2023 | Benchmarking, Federated Learning | Code Available | 0 |
| Yet Another ICU Benchmark: A Flexible Multi-Center Framework for Clinical ML | Jun 8, 2023 | Benchmarking, Kidney Function | Code Available | 1 |
| DLAMA: A Framework for Curating Culturally Diverse Facts for Probing the Knowledge of Pretrained Language Models | Jun 8, 2023 | Benchmarking, Fairness | Code Available | 0 |
| FLEdge: Benchmarking Federated Machine Learning Applications in Edge Computing Systems | Jun 8, 2023 | Benchmarking, Edge-computing | Unverified | 0 |
| Reference Matters: Benchmarking Factual Error Correction for Dialogue Summarization with Fine-grained Evaluation Framework | Jun 8, 2023 | Benchmarking | Code Available | 0 |
| On the Detectability of ChatGPT Content: Benchmarking, Methodology, and Evaluation through the Lens of Academic Writing | Jun 7, 2023 | Benchmarking, Prompt Engineering | Code Available | 1 |
| Improved statistical benchmarking of digital pathology models using pairwise frames evaluation | Jun 7, 2023 | Benchmarking, Classification | Unverified | 0 |
| RD-Suite: A Benchmark for Ranking Distillation | Jun 7, 2023 | Benchmarking | Unverified | 0 |
| Knowing-how & Knowing-that: A New Task for Machine Comprehension of User Manuals | Jun 7, 2023 | Benchmarking, Machine Reading Comprehension | Code Available | 0 |
| Benchmarking Foundation Models with Language-Model-as-an-Examiner | Jun 7, 2023 | Benchmarking, Language Modeling | Unverified | 0 |
| Self-Adjusting Weighted Expected Improvement for Bayesian Optimization | Jun 7, 2023 | Bayesian Optimization, Benchmarking | Code Available | 0 |
| ICON^2: Reliably Benchmarking Predictive Inequity in Object Detection | Jun 7, 2023 | Attribute, Autonomous Driving | Unverified | 0 |
| Benchmarking Robustness of AI-Enabled Multi-sensor Fusion Systems: Challenges and Opportunities | Jun 6, 2023 | Benchmarking, Depth Completion | Unverified | 0 |