| DeID-GPT: Zero-shot Medical Text De-Identification by GPT-4 | Mar 20, 2023 | BenchmarkingDe-identification | CodeCode Available | 1 | 5 |
| RobFR: Benchmarking Adversarial Robustness on Face Recognition | Jul 8, 2020 | Adversarial RobustnessBenchmarking | CodeCode Available | 1 | 5 |
| Benchmarking Large Language Models for Automated Verilog RTL Code Generation | Dec 13, 2022 | BenchmarkingCode Generation | CodeCode Available | 1 | 5 |
| Deep learning model solves change point detection for multiple change types | Apr 15, 2022 | BenchmarkingChange Point Detection | CodeCode Available | 1 | 5 |
| Deep Learning-Based Synchronization for Uplink NB-IoT | May 22, 2022 | BenchmarkingDeep Learning | CodeCode Available | 1 | 5 |
| Deep Learning for ECG Analysis: Benchmarks and Insights from PTB-XL | Apr 28, 2020 | AllBenchmarking | CodeCode Available | 1 | 5 |
| Benchmarking Language Model Creativity: A Case Study on Code Generation | Jul 12, 2024 | BenchmarkingCode Generation | CodeCode Available | 1 | 5 |
| A Computed Tomography Vertebral Segmentation Dataset with Anatomical Variations and Multi-Vendor Scanner Data | Mar 10, 2021 | AnatomyBenchmarking | CodeCode Available | 1 | 5 |
| Decoding the Enigma: Benchmarking Humans and AIs on the Many Facets of Working Memory | Jul 20, 2023 | BenchmarkingDecision Making | CodeCode Available | 1 | 5 |
| Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT | Apr 3, 2024 | BenchmarkingGeneral Knowledge | CodeCode Available | 1 | 5 |
| Benchmarking Language Models for Code Syntax Understanding | Oct 26, 2022 | Benchmarking | CodeCode Available | 1 | 5 |
| Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions | Feb 28, 2024 | BenchmarkingMultiple-choice | CodeCode Available | 1 | 5 |
| Decoding the Underlying Meaning of Multimodal Hateful Memes | May 28, 2023 | BenchmarkingHateful Meme Classification | CodeCode Available | 1 | 5 |
| Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers | Jul 3, 2020 | BenchmarkingDeep Learning | CodeCode Available | 1 | 5 |
| Dataset and Benchmark: Novel Sensors for Autonomous Vehicle Perception | Jan 24, 2024 | Benchmarking | CodeCode Available | 1 | 5 |
| AudioMarkBench: Benchmarking Robustness of Audio Watermarking | Jun 11, 2024 | Benchmarkingtext-to-speech | CodeCode Available | 1 | 5 |
| Benchmarking human visual search computational models in natural scenes: models comparison and reference datasets | Dec 10, 2021 | Benchmarking | CodeCode Available | 1 | 5 |
| DataRec: A Python Library for Standardized and Reproducible Data Management in Recommender Systems | Oct 30, 2024 | BenchmarkingManagement | CodeCode Available | 1 | 5 |
| Data Splits and Metrics for Method Benchmarking on Surgical Action Triplet Datasets | Apr 11, 2022 | Action Triplet RecognitionBenchmarking | CodeCode Available | 1 | 5 |
| A Large-Scale Dataset for Benchmarking Elevator Button Segmentation and Character Recognition | Mar 16, 2021 | BenchmarkingPosition | CodeCode Available | 1 | 5 |
| Data-Driven Denoising of Stationary Accelerometer Signals | Jun 13, 2022 | BenchmarkingDenoising | CodeCode Available | 1 | 5 |
| A Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection | Mar 5, 2022 | BenchmarkingCopy Detection | CodeCode Available | 1 | 5 |
| Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation | Feb 18, 2024 | BenchmarkingLanguage Modeling | CodeCode Available | 1 | 5 |
| DACBench: A Benchmark Library for Dynamic Algorithm Configuration | May 18, 2021 | Benchmarking | CodeCode Available | 1 | 5 |
| Data Generating Process to Evaluate Causal Discovery Techniques for Time Series Data | Apr 16, 2021 | BenchmarkingCausal Discovery | CodeCode Available | 1 | 5 |