| ScandEval: A Benchmark for Scandinavian Natural Language Processing | Apr 3, 2023 | Benchmarking, Cross-Lingual Transfer | Code Available | 1 |
| ENRICH: Multi-purposE dataset for beNchmaRking In Computer vision and pHotogrammetry | Apr 1, 2023 | 3D Reconstruction, 3D Scene Reconstruction | Code Available | 1 |
| What Makes for Effective Few-shot Point Cloud Classification? | Mar 31, 2023 | Benchmarking, Classification | Code Available | 1 |
| A Scale-Invariant Sorting Criterion to Find a Causal Order in Additive Noise Models | Mar 31, 2023 | Benchmarking, Causal Discovery | Code Available | 1 |
| ImageNet-E: Benchmarking Neural Network Robustness via Attribute Editing | Mar 30, 2023 | Attribute, Benchmarking | Code Available | 1 |
| MGTBench: Benchmarking Machine-Generated Text Detection | Mar 26, 2023 | Benchmarking, Question Answering | Code Available | 1 |
| MEGA: Multilingual Evaluation of Generative AI | Mar 22, 2023 | Benchmarking | Code Available | 1 |
| DeID-GPT: Zero-shot Medical Text De-Identification by GPT-4 | Mar 20, 2023 | Benchmarking, De-identification | Code Available | 1 |
| Revisiting Realistic Test-Time Training: Sequential Inference and Adaptation by Anchored Clustering Regularized Self-Training | Mar 20, 2023 | Benchmarking, Clustering | Code Available | 1 |
| CCTV-Gun: Benchmarking Handgun Detection in CCTV Images | Mar 19, 2023 | Benchmarking, object-detection | Code Available | 1 |
| COVID-19 event extraction from Twitter via extractive question answering with continuous prompts | Mar 19, 2023 | Benchmarking, Event Extraction | Code Available | 1 |
| TransNetR: Transformer-based Residual Network for Polyp Segmentation with Multi-Center Out-of-Distribution Testing | Mar 13, 2023 | Benchmarking, Decoder | Code Available | 1 |
| What Can We Learn From The Selective Prediction And Uncertainty Estimation Performance Of 523 Imagenet Classifiers | Feb 23, 2023 | Benchmarking, Out-of-Distribution Detection | Code Available | 1 |
| Revisiting the Gumbel-Softmax in MADDPG | Feb 23, 2023 | Benchmarking, Multi-agent Reinforcement Learning | Code Available | 1 |
| A framework for benchmarking class-out-of-distribution detection and its application to ImageNet | Feb 23, 2023 | Benchmarking, Knowledge Distillation | Code Available | 1 |
| A SWAT-based Reinforcement Learning Framework for Crop Management | Feb 10, 2023 | Benchmarking, Decision Making | Code Available | 1 |
| SurgT challenge: Benchmark of Soft-Tissue Trackers for Robotic Surgery | Feb 6, 2023 | Benchmarking, Camera Calibration | Code Available | 1 |
| CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks | Feb 4, 2023 | Adversarial Attack, Adversarial Robustness | Code Available | 1 |
| Benchmarking Algorithms for Submodular Optimization Problems Using IOHProfiler | Feb 2, 2023 | Benchmarking, Evolutionary Algorithms | Code Available | 1 |
| Rethinking low-cost microscopy workflow: Image enhancement using deep based Extended Depth of Field methods | Feb 1, 2023 | Benchmarking, Image Deblurring | Code Available | 1 |
| Benchmarking Large Language Models for News Summarization | Jan 31, 2023 | Benchmarking, News Summarization | Code Available | 1 |
| Benchmarking Robustness to Adversarial Image Obfuscations | Jan 30, 2023 | Benchmarking | Code Available | 1 |
| TemporAI: Facilitating Machine Learning Innovation in Time Domain Tasks for Medicine | Jan 28, 2023 | Benchmarking, Causal Inference | Code Available | 1 |
| BiBench: Benchmarking and Analyzing Network Binarization | Jan 26, 2023 | Benchmarking, Binarization | Code Available | 1 |
| Young Labeled Faces in the Wild (YLFW): A Dataset for Children Faces Recognition | Jan 13, 2023 | Benchmarking, Face Recognition | Code Available | 1 |