| Working Memory Capacity of ChatGPT: An Empirical Study | Apr 30, 2023 | BenchmarkingLanguage Modeling | CodeCode Available | 1 | 5 |
| CASTLE: Benchmarking Dataset for Static Code Analyzers and LLMs towards CWE Detection | Mar 12, 2025 | BenchmarkingCode Classification | CodeCode Available | 1 | 5 |
| Benchmarking Simulation-Based Inference | Jan 12, 2021 | Benchmarking | CodeCode Available | 1 | 5 |
| Benchmarking Large Language Models for Automated Verilog RTL Code Generation | Dec 13, 2022 | BenchmarkingCode Generation | CodeCode Available | 1 | 5 |
| Grounding Descriptions in Images informs Zero-Shot Visual Recognition | Dec 5, 2024 | AttributeBenchmarking | CodeCode Available | 1 | 5 |
| A Reinforcement Learning Environment for Multi-Service UAV-enabled Wireless Systems | May 11, 2021 | BenchmarkingEdge-computing | CodeCode Available | 1 | 5 |
| CausalTime: Realistically Generated Time-series for Benchmarking of Causal Discovery | Oct 3, 2023 | BenchmarkingCausal Discovery | CodeCode Available | 1 | 5 |
| Causality for Tabular Data Synthesis: A High-Order Structure Causal Benchmark Framework | Jun 12, 2024 | BenchmarkingCausal Inference | CodeCode Available | 1 | 5 |
| Hierarchical graph neural nets can capture long-range interactions | Jul 15, 2021 | BenchmarkingMolecular Property Prediction | CodeCode Available | 1 | 5 |
| Benchmarking Language Models for Code Syntax Understanding | Oct 26, 2022 | Benchmarking | CodeCode Available | 1 | 5 |