| Agent-oriented Joint Decision Support for Data Owners in Auction-based Federated Learning | May 9, 2024 | Benchmarking, Federated Learning | Unverified | 0 |
| Aequitas Flow: Streamlining Fair ML Experimentation | May 9, 2024 | Benchmarking, Fairness | Code Available | 4 |
| OpenFactCheck: Building, Benchmarking Customized Fact-Checking Systems and Evaluating the Factuality of Claims and LLMs | May 9, 2024 | Benchmarking, Fact Checking | Code Available | 2 |
| Benchmarking Educational Program Repair | May 8, 2024 | Benchmarking, Program Repair | Code Available | 0 |
| Bridging the Bosphorus: Advancing Turkish Large Language Models through Strategies for Low-Resource Language Adaptation and Benchmarking | May 7, 2024 | Benchmarking, Model Selection | Unverified | 0 |
| AI in Lung Health: Benchmarking Detection and Diagnostic Models Across Multiple CT Scan Datasets | May 7, 2024 | Benchmarking, Cancer Classification | Code Available | 1 |
| Refining Joint Text and Source Code Embeddings for Retrieval Task with Parameter-Efficient Fine-Tuning | May 7, 2024 | Benchmarking, Contrastive Learning | Code Available | 0 |
| ACEGEN: Reinforcement learning of generative chemical agents for drug discovery | May 7, 2024 | Benchmarking, Decision Making | Code Available | 3 |
| UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images | May 6, 2024 | Benchmarking | Unverified | 0 |
| ATG: Benchmarking Automated Theorem Generation for Generative Language Models | May 5, 2024 | Automated Theorem Proving, Benchmarking | Unverified | 0 |