| Benchmarking Cognitive Biases in Large Language Models as Evaluators | Sep 29, 2023 | Benchmarking, In-Context Learning | Code Available | 1 |
| MuSe-GNN: Learning Unified Gene Representation From Multimodal Biological Graph Data | Sep 29, 2023 | Benchmarking, Contrastive Learning | Code Available | 1 |
| G4SATBench: Benchmarking and Advancing SAT Solving with Graph Neural Networks | Sep 29, 2023 | Benchmarking | Code Available | 1 |
| FedAIoT: A Federated Learning Benchmark for Artificial Intelligence of Things | Sep 29, 2023 | Benchmarking, Federated Learning | Code Available | 1 |
| Revisiting Neural Program Smoothing for Fuzzing | Sep 28, 2023 | Benchmarking, CPU | Code Available | 1 |
| FORB: A Flat Object Retrieval Benchmark for Universal Image Embedding | Sep 28, 2023 | Benchmarking, Image Retrieval | Code Available | 1 |
| The Trickle-down Impact of Reward (In-)consistency on RLHF | Sep 28, 2023 | Benchmarking | Code Available | 1 |
| LagrangeBench: A Lagrangian Fluid Mechanics Benchmarking Suite | Sep 28, 2023 | Benchmarking | Code Available | 1 |
| NLPBench: Evaluating Large Language Models on Solving NLP Problems | Sep 27, 2023 | Benchmarking, Math | Code Available | 1 |
| OceanBench: The Sea Surface Height Edition | Sep 27, 2023 | Benchmarking, Sensor Fusion | Code Available | 1 |
| Node-Aligned Graph-to-Graph (NAG2G): Elevating Template-Free Deep Learning Approaches in Single-Step Retrosynthesis | Sep 27, 2023 | Benchmarking, Graph Generation | Code Available | 1 |
| Unified Long-Term Time-Series Forecasting Benchmark | Sep 27, 2023 | Benchmarking, Time Series | Code Available | 1 |
| Benchmarking Local Robustness of High-Accuracy Binary Neural Networks for Enhanced Traffic Sign Recognition | Sep 25, 2023 | Autonomous Driving, Benchmarking | Code Available | 1 |
| Benchmarking Encoder-Decoder Architectures for Biplanar X-ray to 3D Shape Reconstruction | Sep 24, 2023 | 3D Shape Reconstruction, Anatomy | Code Available | 1 |
| Grad DFT: a software library for machine learning enhanced density functional theory | Sep 23, 2023 | Benchmarking | Code Available | 1 |
| Prompt Tuned Embedding Classification for Multi-Label Industry Sector Allocation | Sep 21, 2023 | Benchmarking, Classification | Code Available | 1 |
| An Image Dataset for Benchmarking Recommender Systems with Raw Pixels | Sep 13, 2023 | Benchmarking, Recommendation Systems | Code Available | 1 |
| Formalizing Multimedia Recommendation through Multimodal Deep Learning | Sep 11, 2023 | Benchmarking, Deep Learning | Code Available | 1 |
| FreeMan: Towards Benchmarking 3D Human Pose Estimation under Real-World Conditions | Sep 10, 2023 | 3D Human Pose Estimation, 3D Pose Estimation | Code Available | 1 |
| RecAD: Towards A Unified Library for Recommender Attack and Defense | Sep 9, 2023 | Benchmarking, Recommendation Systems | Code Available | 1 |
| Evaluation of large language models for discovery of gene set function | Sep 7, 2023 | Benchmarking, Language Modelling | Code Available | 1 |
| A skeletonization algorithm for gradient-based optimization | Sep 5, 2023 | Benchmarking, Deep Learning | Code Available | 1 |
| Benchmarking Autoregressive Conditional Diffusion Models for Turbulent Flow Simulation | Sep 4, 2023 | Benchmarking | Code Available | 1 |
| Developing a Scalable Benchmark for Assessing Large Language Models in Knowledge Graph Engineering | Aug 31, 2023 | Benchmarking, Dataset Generation | Code Available | 1 |
| Benchmarking the Generation of Fact Checking Explanations | Aug 29, 2023 | Abstractive Text Summarization, Articles | Code Available | 1 |