| Guiding Masked Representation Learning to Capture Spatio-Temporal Relationship of Electrocardiogram | Feb 2, 2024 | Diagnostic, ECG Classification | Code Available | 2 |
| MVBench: A Comprehensive Multi-modal Video Understanding Benchmark | Nov 28, 2023 | 3D Question Answering (3D-QA), Diagnostic | Code Available | 2 |
| Large Language Models Illuminate a Progressive Pathway to Artificial Healthcare Assistant: A Review | Nov 3, 2023 | Diagnostic | Code Available | 2 |
| HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models | Oct 23, 2023 | Diagnostic, Hallucination | Code Available | 2 |
| BHASA: A Holistic Southeast Asian Linguistic and Cultural Evaluation Suite for Large Language Models | Sep 12, 2023 | Diagnostic, Natural Language Understanding | Code Available | 2 |
| Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling | Jul 16, 2023 | Diagnostic, Language Modelling | Code Available | 2 |
| Evaluating AI systems under uncertain ground truth: a case study in dermatology | Jul 5, 2023 | Diagnostic, Medical Diagnosis | Code Available | 2 |
| A Transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics | Jun 1, 2023 | Diagnostic, Representation Learning | Code Available | 2 |
| Perception Test: A Diagnostic Benchmark for Multimodal Video Models | May 23, 2023 | Diagnostic, Grounded Video Question Answering | Code Available | 2 |
| DeepEdit: Deep Editable Learning for Interactive Segmentation of 3D Medical Images | May 18, 2023 | Active Learning, Diagnostic | Code Available | 2 |