| MGTBench: Benchmarking Machine-Generated Text Detection | Mar 26, 2023 | Benchmarking, Question Answering | Code Available | 1 |
| MEGA: Multilingual Evaluation of Generative AI | Mar 22, 2023 | Benchmarking | Code Available | 1 |
| Revisiting Realistic Test-Time Training: Sequential Inference and Adaptation by Anchored Clustering Regularized Self-Training | Mar 20, 2023 | Benchmarking, Clustering | Code Available | 1 |
| DeID-GPT: Zero-shot Medical Text De-Identification by GPT-4 | Mar 20, 2023 | Benchmarking, De-identification | Code Available | 1 |
| CCTV-Gun: Benchmarking Handgun Detection in CCTV Images | Mar 19, 2023 | Benchmarking, object-detection | Code Available | 1 |
| COVID-19 event extraction from Twitter via extractive question answering with continuous prompts | Mar 19, 2023 | Benchmarking, Event Extraction | Code Available | 1 |
| TransNetR: Transformer-based Residual Network for Polyp Segmentation with Multi-Center Out-of-Distribution Testing | Mar 13, 2023 | Benchmarking, Decoder | Code Available | 1 |
| What Can We Learn From The Selective Prediction And Uncertainty Estimation Performance Of 523 Imagenet Classifiers | Feb 23, 2023 | Benchmarking, Out-of-Distribution Detection | Code Available | 1 |
| Revisiting the Gumbel-Softmax in MADDPG | Feb 23, 2023 | Benchmarking, Multi-agent Reinforcement Learning | Code Available | 1 |
| A framework for benchmarking class-out-of-distribution detection and its application to ImageNet | Feb 23, 2023 | Benchmarking, Knowledge Distillation | Code Available | 1 |