| Beyond Correctness: Benchmarking Multi-dimensional Code Generation for Large Language Models | Jul 16, 2024 | Benchmarking, Code Generation | Code Available | 1 |
| SKADA-Bench: Benchmarking Unsupervised Domain Adaptation Methods with Realistic Validation On Diverse Modalities | Jul 16, 2024 | Benchmarking, Domain Adaptation | Code Available | 1 |
| On Machine Learning Approaches for Protein-Ligand Binding Affinity Prediction | Jul 15, 2024 | Active Learning, Benchmarking | Unverified | 0 |
| Separable Operator Networks | Jul 15, 2024 | Benchmarking, GPU | Code Available | 1 |
| CIBench: Evaluating Your LLMs with a Code Interpreter Plugin | Jul 15, 2024 | Benchmarking | Code Available | 1 |
| AstroMLab 1: Who Wins Astronomy Jeopardy!? | Jul 15, 2024 | Astronomy, Benchmarking | Unverified | 0 |
| Benchmarking Vision Language Models for Cultural Understanding | Jul 15, 2024 | Benchmarking, Question Answering | Unverified | 0 |
| ConvBench: A Comprehensive Benchmark for 2D Convolution Primitive Evaluation | Jul 15, 2024 | Benchmarking | Unverified | 0 |
| When Heterophily Meets Heterogeneity: Challenges and a New Large-Scale Graph Benchmark | Jul 15, 2024 | Benchmarking, Graph Learning | Code Available | 1 |
| Experimental Benchmarking of Energy-saving Sub-Optimal Sliding Mode Control | Jul 14, 2024 | Benchmarking | Unverified | 0 |