| Explainable AI using expressive Boolean formulas | Jun 6, 2023 | Benchmarking, Explainable Artificial Intelligence (XAI) | Unverified | 0 |
| Applying Standards to Advance Upstream & Downstream Ethics in Large Language Models | Jun 6, 2023 | Benchmarking, Ethics | Unverified | 0 |
| Financial Numeric Extreme Labelling: A Dataset and Benchmarking for XBRL Tagging | Jun 6, 2023 | Benchmarking, Sentence | Unverified | 0 |
| LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning | Jun 5, 2023 | Benchmarking | Code Available | 3 |
| Str2Str: A Score-based Framework for Zero-shot Protein Conformation Sampling | Jun 5, 2023 | Benchmarking, Denoising | Code Available | 1 |
| N-Shot Benchmarking of Whisper on Diverse Arabic Speech Recognition | Jun 5, 2023 | Arabic Speech Recognition, Benchmarking | Unverified | 0 |
| Benchmarking Middle-Trained Language Models for Neural Search | Jun 5, 2023 | Benchmarking, Language Modeling | Unverified | 0 |
| Benchmarking Large Language Models on CMExam -- A Comprehensive Chinese Medical Exam Dataset | Jun 5, 2023 | Benchmarking, Multiple-choice | Code Available | 1 |
| LibAUC: A Deep Learning Library for X-Risk Optimization | Jun 5, 2023 | Benchmarking, Classification | Code Available | 2 |
| RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems | Jun 5, 2023 | Benchmarking, C++ code | Code Available | 1 |
| EfficientSRFace: An Efficient Network with Super-Resolution Enhancement for Accurate Face Detection | Jun 4, 2023 | Benchmarking, Face Detection | Unverified | 0 |
| MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning | Jun 4, 2023 | Benchmarking, Contrastive Learning | Unverified | 0 |
| TransDocAnalyser: A Framework for Offline Semi-structured Handwritten Document Analysis in the Legal Domain | Jun 3, 2023 | Benchmarking, Decoder | Code Available | 1 |
| Benchmarking Robustness of Adaptation Methods on Pre-trained Vision-Language Models | Jun 3, 2023 | Benchmarking | Unverified | 0 |
| ACI-BENCH: a Novel Ambient Clinical Intelligence Dataset for Benchmarking Automatic Visit Note Generation | Jun 3, 2023 | Benchmarking | Unverified | 0 |
| Multilingual Conceptual Coverage in Text-to-Image Models | Jun 2, 2023 | Benchmarking | Code Available | 1 |
| BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models | Jun 2, 2023 | Benchmarking, Language Acquisition | Code Available | 1 |
| Spatially Resolved Gene Expression Prediction from H&E Histology Images via Bi-modal Contrastive Learning | Jun 2, 2023 | Benchmarking, Contrastive Learning | Code Available | 1 |
| Break a Lag: Triple Exponential Moving Average for Enhanced Optimization | Jun 2, 2023 | Benchmarking, image-classification | Unverified | 0 |
| Hybrid Long Document Summarization using C2F-FAR and ChatGPT: A Practical Study | Jun 1, 2023 | Articles, Benchmarking | Unverified | 0 |
| The Brain Tumor Segmentation (BraTS-METS) Challenge 2023: Brain Metastasis Segmentation on Pre-treatment MRI | Jun 1, 2023 | Benchmarking, Brain Tumor Segmentation | Unverified | 0 |
| Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment | Jun 1, 2023 | Benchmarking, Hate Speech Detection | Code Available | 0 |
| End-to-end Knowledge Retrieval with Multi-modal Queries | Jun 1, 2023 | Benchmarking, Cross-Modal Retrieval | Code Available | 1 |
| Speech Self-Supervised Representation Benchmarking: Are We Doing it Right? | Jun 1, 2023 | Benchmarking, Decoder | Code Available | 0 |
| Improving and Benchmarking Offline Reinforcement Learning Algorithms | Jun 1, 2023 | Attribute, Benchmarking | Code Available | 1 |